Chapter 9: Cleaning Empty Cells
How to clean / handle empty cells (missing values) in pandas.
Imagine we are sitting together and I’m showing you real data on my screen. We will go slowly, understand why things are missing, see what options we really have, and learn the patterns most people actually use in real projects.
1. First — what do we actually mean by “empty cells”?
In pandas, “empty” usually means one of these:
| What you see | Internal value in pandas | Name | isna() returns True? |
|---|---|---|---|
| (nothing) | NaN | Not a Number (float) | Yes |
| None | None | Python None | Yes |
| <NA> | pd.NA | pandas NA (nullable integer/string dtypes) | Yes |
| empty string '' | '' | Empty string | No |
| 'NA', 'N/A', '-', 'null', 'missing' | string | Text that means missing | No |
Important takeaway: NaN, None, and pd.NA are considered missing by pandas (isna() / isnull()). Empty strings and words like “NA” / “-” are not missing — they are normal strings.
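You can verify this in a few lines. Here is a tiny sketch with a throwaway Series (the values are just for illustration):

```python
import pandas as pd
import numpy as np

# Five kinds of "empty" — only the first two are real missing values
s = pd.Series([np.nan, None, '', 'NA', '-'])

print(s.isna().tolist())   # [True, True, False, False, False]
print(s.isna().sum())      # 2 — '' and the strings 'NA' / '-' do not count
```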
2. Let’s create a realistic messy table with different kinds of “empties”
```python
import pandas as pd
import numpy as np

data = {
    'name':   ['Priya', 'Rahul', '', 'Sneha', 'Vikram', 'Meera', np.nan, 'Arjun', 'Neha', 'Karan'],
    'age':    [28, 34, np.nan, 25, 41, np.nan, 29, 33, None, 27],
    'city':   ['Pune', 'Hyderabad', 'Bangalore', 'Pune', '', 'Mumbai', 'Delhi', 'Bangalore', 'N/A', 'Chennai'],
    'salary': [72000, 145000, 98000, np.nan, 210000, 88000, '-', 112000, 95000, 68000],
    'rating': [4.1, 3.8, 4.6, 4.9, np.nan, 4.0, 3.5, 4.2, 4.7, np.nan],
    'joined': ['2022-03-15', '2021-11-01', '', '2020-09-22', '2019-05-05', np.nan, '2023-07-10', '2024-02-01', '2021-08-14', '2022-12-30'],
    'active': [True, True, False, True, True, False, np.nan, True, True, True]
}

df = pd.DataFrame(data)

# Let's see it
df
```
3. Step 1 – Always start by finding out where the missing values are
```python
# Most important commands — run these first!
print("Missing count per column:")
print(df.isna().sum())

print("\nTotal missing cells:", df.isna().sum().sum())

print("\nRows that have at least one missing value:")
print(df[df.isna().any(axis=1)])

print("\nPercentage missing per column:")
print((df.isna().mean() * 100).round(2), "%")
```
Output for our table — notice that the empty strings and 'N/A' in name, city, and joined are not counted, exactly as section 1 promised:

```
name      1
age       3
city      0
salary    1
rating    2
joined    1
active    1
dtype: int64
```
4. The 7 most realistic ways people handle missing values
| # | Method | When people use it | Code example | Destructive? |
|---|---|---|---|---|
| 1 | Drop rows | Very few missing, row is useless without the data | df.dropna() | Yes |
| 2 | Drop columns | Column is almost all missing | df.dropna(axis=1, thresh=…) | Yes |
| 3 | Fill with fixed value | You know what missing should mean (0, 'Unknown') | df['city'] = df['city'].fillna('Unknown') | No |
| 4 | Fill with mean / median | Numeric column, missing looks random | df['salary'] = df['salary'].fillna(df['salary'].median()) | No |
| 5 | Fill with group average | Missing depends on a category (dept, city, …) | groupby + transform | No |
| 6 | Forward / backward fill | Time series, last known value is reasonable | df['rating'] = df['rating'].ffill() | No |
| 7 | Leave as is / mark explicitly | You want to keep the info that the value was missing | df['salary_missing'] = df['salary'].isna() | No |
5. Realistic cleaning walkthrough — column by column
Column: name
```python
# Missing name → replace with 'Unknown Customer'
df['name'] = df['name'].fillna('Unknown Customer')

# Also replace the empty string
df['name'] = df['name'].replace('', 'Unknown Customer')
```
Column: age
```python
# Three missing ages (two NaN, one None) — let's use the median (more robust than the mean)
median_age = df['age'].median()   # 29.0
df['age'] = df['age'].fillna(median_age)
```
Column: city
```python
# Missing & 'N/A' → both mean unknown
df['city'] = df['city'].replace(['', 'N/A', 'NA'], 'Unknown')

# or, more safely, cover NaN as well:
df['city'] = df['city'].replace(['', 'N/A', 'NA', np.nan], 'Unknown')
```
Column: salary (very common situation!)
```python
# First — fix the '-' which is not numeric
df['salary'] = df['salary'].replace('-', np.nan)

# Now make the column numeric (anything unparseable becomes NaN)
df['salary'] = pd.to_numeric(df['salary'], errors='coerce')

# Strategy: fill with the median salary of their city (if known)
df['salary'] = df.groupby('city')['salary'].transform(
    lambda x: x.fillna(x.median())
)

# If still missing (unknown city) → global median
df['salary'] = df['salary'].fillna(df['salary'].median())
```
Column: rating
```python
# Ratings are often left missing on purpose (not rated yet)
# Popular choices:
#   A) Leave as NaN
#   B) Fill with 3.0 or the company average
#   C) Fill forward/backward if time-ordered

# Let's use the group average by city (a realistic choice)
df['rating'] = df.groupby('city')['rating'].transform(
    lambda x: x.fillna(x.mean().round(1))
)

# Any remaining missing values → overall average
df['rating'] = df['rating'].fillna(df['rating'].mean().round(1))
```
Column: joined (date)
```python
# First convert to datetime ('' and NaN both become NaT)
df['joined'] = pd.to_datetime(df['joined'], errors='coerce')

# Fill missing dates — common choices:
#   - Most frequent date
#   - Median date
#   - A fixed date (e.g. company start date)
most_common_date = df['joined'].mode()[0]
df['joined'] = df['joined'].fillna(most_common_date)
```
Column: active
```python
# Missing active → assume False (conservative)
df['active'] = df['active'].fillna(False)
```
6. Quick reference – the commands people use most
```python
# Find missing
df.isna().sum()
df.isnull().sum()     # same thing (older alias)
df.notna().sum()

# Drop
df.dropna()                              # drop rows with any missing value
df.dropna(how='all')                     # only if the whole row is missing
df.dropna(subset=['salary', 'rating'])   # only these columns matter
df.dropna(thresh=5)                      # keep rows with at least 5 non-missing values

# Fill
df.fillna(0)
df.fillna({'salary': 0, 'city': 'Unknown', 'rating': 3.0})
df['salary'] = df['salary'].fillna(df['salary'].median())

# Group-aware fill
df['salary'] = df.groupby('city')['salary'].transform(lambda x: x.fillna(x.median()))

# Forward / backward fill (time series)
# (fillna(method='ffill') is deprecated — use .ffill() / .bfill())
df['stock_price'] = df['stock_price'].ffill()   # previous value
df['stock_price'] = df['stock_price'].bfill()   # next value

# Mark missing in a new column
df['salary_was_missing'] = df['salary'].isna()
```
7. Mini practice task for you
Take this small messy series:
```python
s = pd.Series([28, np.nan, 34, '', 41, 'N/A', np.nan, 29, '-', 33])
```
Clean it so that:
- empty string '' → NaN
- 'N/A' → NaN
- '-' → NaN
- All missing values filled with median of the non-missing numbers
Try writing the code — then come back and we’ll compare.
Where do you want to go next?
- More advanced group-based imputation (KNN, regression, etc.)
- How to deal with very high percentage of missing values
- Visualizing missing values (missingno library)
- Cleaning mixed types in the same column
- Realistic strategies when you don’t know what to fill
Just tell me which direction you want to continue — I’ll keep explaining slowly with examples. 😊
