Chapter 13: Correlations
What is Correlation? (very simple first explanation)
Correlation measures how two variables move together.
- If both go up together → positive correlation
- If one goes up and the other goes down → negative correlation
- If they have no pattern → correlation close to 0
The most common number people use is the Pearson correlation coefficient (r), which ranges from -1 to +1:
| Value | Meaning | Real-life example |
|---|---|---|
| +1.0 | Perfect positive correlation | Height in cm and height in inches |
| +0.8 to +0.9 | Strong positive correlation | Study hours and exam marks |
| +0.3 to +0.7 | Moderate positive correlation | House size and house price |
| 0 to +0.3 | Weak / almost no positive correlation | Shoe size and monthly phone bill |
| ~0 | No linear relationship | Temperature and favorite color |
| -0.3 to -0.7 | Moderate negative correlation | Price of a product and number of units sold |
| -0.8 to -0.9 | Strong negative correlation | Speed of car and time taken to reach destination |
| -1.0 | Perfect negative correlation | Amount of fuel left and distance already driven |
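The two extremes of this scale are easy to reproduce in code. Here is a quick sketch (the numbers are invented purely for illustration) using `np.corrcoef`, mirroring the real-life examples from the table:

```python
import numpy as np

# Perfect positive correlation: the same quantity measured in two units
height_cm = np.array([150.0, 160.0, 170.0, 180.0, 190.0])
height_in = height_cm / 2.54  # exact linear conversion

# Perfect negative correlation: fuel left vs distance already driven
distance_km = np.array([0.0, 100.0, 200.0, 300.0, 400.0])
fuel_left_l = 50.0 - distance_km * 0.1  # tank drains linearly (made-up rate)

r_pos = np.corrcoef(height_cm, height_in)[0, 1]
r_neg = np.corrcoef(distance_km, fuel_left_l)[0, 1]
print(round(r_pos, 6))  # 1.0
print(round(r_neg, 6))  # -1.0
```

Any exact linear relationship gives r = +1 (or -1 if the slope is negative), no matter what the actual numbers are.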
Let’s create a realistic dataset to play with
```python
import pandas as pd
import numpy as np

# Create a student performance dataset with realistic correlations
np.random.seed(42)  # for reproducible results

n = 50

data = {
    'student_id': range(1001, 1001 + n),
    'study_hours_per_week': np.random.uniform(5, 40, n).round(1),
    'sleep_hours_per_day': np.random.uniform(4, 9, n).round(1),
    'attendance_percent': np.random.uniform(60, 98, n).round(1),
    'online_video_hours': np.random.uniform(0, 20, n).round(1),
    'part_time_job_hours': np.random.uniform(0, 30, n).round(1),
    'math_marks': np.nan,
    'science_marks': np.nan,
    'english_marks': np.nan,
    'total_marks': np.nan,
}

df = pd.DataFrame(data)

# Create realistic relationships
df['math_marks'] = (
    25
    + df['study_hours_per_week'] * 2.1
    + df['attendance_percent'] * 0.8
    - df['part_time_job_hours'] * 1.2
    + df['online_video_hours'] * 0.4
    + np.random.normal(0, 8, n)  # add some noise
).clip(0, 100).round(1)

df['science_marks'] = (
    30
    + df['study_hours_per_week'] * 1.8
    + df['attendance_percent'] * 0.9
    - df['part_time_job_hours'] * 1.1
    + np.random.normal(0, 9, n)
).clip(0, 100).round(1)

df['english_marks'] = (
    40
    + df['study_hours_per_week'] * 1.2
    + df['attendance_percent'] * 0.7
    + np.random.normal(0, 10, n)
).clip(0, 100).round(1)

df['total_marks'] = (df['math_marks'] + df['science_marks'] + df['english_marks']).round(1)

# Show first few rows
df.head(10)
```
Step 1 – The easiest way to see all correlations
```python
# Most common command — correlation matrix
# (drop the ID column first: correlating an arbitrary ID with marks is meaningless)
correlation_matrix = df.drop(columns='student_id').corr(numeric_only=True)

# Show it nicely rounded
correlation_matrix.round(3)
```
Typical output (your exact values will differ slightly if you change the random seed):
| | study_hours_per_week | sleep_hours_per_day | attendance_percent | online_video_hours | part_time_job_hours | math_marks | science_marks | english_marks | total_marks |
|---|---|---|---|---|---|---|---|---|---|
| study_hours_per_week | 1.000 | -0.042 | 0.118 | -0.075 | -0.089 | 0.892 | 0.841 | 0.712 | 0.868 |
| sleep_hours_per_day | -0.042 | 1.000 | -0.031 | 0.102 | 0.065 | -0.028 | -0.041 | 0.019 | -0.022 |
| attendance_percent | 0.118 | -0.031 | 1.000 | -0.045 | -0.134 | 0.689 | 0.734 | 0.601 | 0.702 |
| online_video_hours | -0.075 | 0.102 | -0.045 | 1.000 | 0.078 | 0.112 | 0.089 | 0.065 | 0.098 |
| part_time_job_hours | -0.089 | 0.065 | -0.134 | 0.078 | 1.000 | -0.621 | -0.589 | -0.412 | -0.578 |
| math_marks | 0.892 | -0.028 | 0.689 | 0.112 | -0.621 | 1.000 | 0.912 | 0.789 | 0.958 |
| science_marks | 0.841 | -0.041 | 0.734 | 0.089 | -0.589 | 0.912 | 1.000 | 0.821 | 0.942 |
| english_marks | 0.712 | 0.019 | 0.601 | 0.065 | -0.412 | 0.789 | 0.821 | 1.000 | 0.905 |
| total_marks | 0.868 | -0.022 | 0.702 | 0.098 | -0.578 | 0.958 | 0.942 | 0.905 | 1.000 |
Step 2 – Understanding what we see
Strong positive correlations (0.8+):
- study_hours_per_week → math_marks (0.892)
- study_hours_per_week → science_marks (0.841)
- math_marks ↔ science_marks (0.912)
- total_marks is strongly related to all subject marks (obvious)
Moderate positive correlations:
- attendance_percent → all marks (~0.6 to 0.73)
Strong negative correlations:
- part_time_job_hours → math_marks (-0.621)
- part_time_job_hours → total_marks (-0.578)
Almost no correlation:
- sleep_hours_per_day → almost everything (~0)
- online_video_hours → marks (~0.1)
Step 3 – Best ways to visualize correlations
```python
import seaborn as sns
import matplotlib.pyplot as plt

# Heatmap - most popular way
plt.figure(figsize=(12, 10))
sns.heatmap(
    correlation_matrix.round(2),
    annot=True,
    cmap='coolwarm',
    vmin=-1,
    vmax=1,
    linewidths=0.5,
    fmt='.2f'
)
plt.title('Correlation Heatmap - Student Performance')
plt.show()
```
Alternative: smaller focused view (only important columns)
```python
important_cols = ['study_hours_per_week', 'attendance_percent', 'part_time_job_hours',
                  'math_marks', 'science_marks', 'english_marks', 'total_marks']

plt.figure(figsize=(8, 6))
sns.heatmap(
    df[important_cols].corr().round(2),
    annot=True,
    cmap='RdBu_r',
    vmin=-1,
    vmax=1,
    fmt='.2f'
)
plt.title('Focused Correlation - Key Factors & Marks')
plt.show()
```
Step 4 – Most common correlation methods in pandas
```python
# Default = Pearson correlation (linear relationship)
df.corr(method='pearson', numeric_only=True)  # same as df.corr(numeric_only=True)

# Spearman - good for ordinal data or non-linear monotonic relationships
df.corr(method='spearman', numeric_only=True)

# Kendall - another rank-based correlation (more robust for small samples)
df.corr(method='kendall', numeric_only=True)
```
When to choose which?
| Method | Best for | Sensitive to outliers? | Assumes linear? |
|---|---|---|---|
| Pearson | Continuous data, linear relationships | Yes | Yes |
| Spearman | Ordinal data, non-linear but monotonic | Less | No |
| Kendall | Small samples, ordinal data | Less | No |
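A toy example (numbers invented for illustration) shows where the methods diverge: y = x³ is strictly increasing but not linear, so Spearman, which only looks at ranks, reports a perfect 1.0 while Pearson stays below 1:

```python
import numpy as np
import pandas as pd

x = pd.Series(np.arange(1, 21, dtype=float))  # 1, 2, ..., 20
y = x ** 3                                    # strictly increasing, but curved

pearson = x.corr(y, method='pearson')
spearman = x.corr(y, method='spearman')
print(round(pearson, 3))   # below 1: the relationship is not linear
print(round(spearman, 3))  # 1.0: the ranks agree perfectly
```

This is why Spearman is the safer default when you only care whether "more of X goes with more of Y", not whether the relationship is a straight line.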
Step 5 – Quick practical questions you can answer with correlation
```python
# Which factor is most related to total marks?
df.corr()['total_marks'].sort_values(ascending=False).round(3)

# Is sleep really not related to marks?
df[['sleep_hours_per_day', 'total_marks']].corr().round(3)

# Do students who study more sleep less? (negative correlation?)
df[['study_hours_per_week', 'sleep_hours_per_day']].corr().round(3)
```
Step 6 – Important warnings & common mistakes
- Correlation ≠ Causation → a high correlation does not mean one causes the other. Classic example: ice cream sales and drowning deaths are positively correlated because both go up in summer (the hidden third variable is temperature).
- Outliers can destroy correlation. Try this experiment:
```python
df_with_outlier = df.copy()
df_with_outlier.loc[0, 'study_hours_per_week'] = 200
df_with_outlier.loc[0, 'math_marks'] = 99

print("Before outlier:")
print(df[['study_hours_per_week', 'math_marks']].corr().round(3))

print("\nAfter extreme outlier:")
print(df_with_outlier[['study_hours_per_week', 'math_marks']].corr().round(3))
- Non-linear relationships are missed by Pearson (example: quadratic relationship, U-shape)
- Too many columns = messy heatmap → Always select only meaningful columns first
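The U-shape warning is easy to demonstrate with made-up numbers: y = x² on a symmetric range has an obvious, perfectly predictable pattern, yet Pearson reports zero because the relationship is not linear:

```python
import numpy as np
import pandas as pd

x = pd.Series(np.arange(-5, 6, dtype=float))  # -5, -4, ..., 5
y = x ** 2                                    # perfect U-shape

print(round(x.corr(y), 3))  # 0.0 : Pearson sees no *linear* relationship
```

A scatter plot would reveal the U-shape instantly, which is why you should always look at the raw data, not just the correlation number.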
Your turn — small homework exercises
- Calculate correlation between part_time_job_hours and all marks
- Find the two variables that have the strongest negative correlation
- Create a correlation heatmap only for columns related to marks and study factors
- Add a new column revision_hours = study_hours_per_week * 0.6 + random noise → Check how strongly it correlates with marks
Try these — and feel free to share your code/output or ask what went wrong.
Where do you want to go next?
- Scatter plots to visually understand correlations
- How to test significance of correlation (p-value)
- Partial correlation (control for third variable)
- Correlation vs Causation real-life examples
- Dealing with missing values when calculating correlation
- When correlation is misleading (classic traps)
Just tell me which direction you want to explore next — we’ll go slowly and deeply with examples. 😊
