Chapter 15: Plotting

This chapter covers plotting, written as if we are sitting together in front of a screen: I show every line of code, explain why we do things this way, point out the common mistakes people make, and show how people actually create useful plots in day-to-day data analysis in 2025–2026.

Let’s go slowly and realistically.

Step 0 – Mindset before we start plotting

Good plots are not about beauty first — they are about answering a question clearly.

Before writing any .plot() code, always ask yourself:

  • What question am I trying to answer?
  • Who is looking at this plot? (me / team / boss / presentation / report)
  • What is the most important message I want to jump out?

Common goals:

  • See trend over time → line plot
  • Compare categories → bar plot
  • See relationship between two numbers → scatter plot
  • See distribution → histogram / boxplot
  • See proportions → pie (carefully!) or stacked bar

Step 1 – Prepare a realistic dataset

We’ll use a small but realistic sales / student performance dataset.
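
Here is a minimal sketch of such a dataset. Everything in it is my assumption for illustration: the column names (date, region, product, units_sold, discount_%, customer_rating, revenue), the prices, and the random values are made up, and the rows are shuffled on purpose so the dates are out of order, as they often are in real exports.

```python
import numpy as np
import pandas as pd

# Hypothetical sales data: names, prices and values are invented for practice.
rng = np.random.default_rng(42)          # fixed seed so the numbers are reproducible
n = 200
df = pd.DataFrame({
    "date": pd.date_range("2025-01-01", periods=n, freq="D"),
    "region": rng.choice(["North", "South", "East", "West"], size=n),
    "product": rng.choice(["Laptop", "Phone", "Tablet"], size=n),
    "units_sold": rng.integers(1, 50, size=n),            # 1..49 units per sale
    "discount_%": rng.choice([0, 5, 10, 20], size=n),
    "customer_rating": rng.uniform(2.5, 5.0, size=n).round(1),
})

# Revenue = units * list price * (1 - discount)
price = df["product"].map({"Laptop": 900, "Phone": 600, "Tablet": 350})
df["revenue"] = df["units_sold"] * price * (1 - df["discount_%"] / 100)

# Shuffle the rows so the dates are NOT in order (typical for raw exports)
df = df.sample(frac=1, random_state=0).reset_index(drop=True)

print(df.head())
print(df.dtypes)
```

Always look at head() and dtypes before plotting: if the date column is not a real datetime type, time-based plots will not behave the way you expect.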

Step 2 – The absolute simplest plot in pandas
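
A minimal sketch of the simplest call, assuming a hypothetical sales DataFrame (the columns and values below are invented for illustration). Calling .plot() on a single column draws its values against the row index, whatever order the rows happen to be in.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical sales data (an assumption, invented for illustration)
rng = np.random.default_rng(42)
n = 200
df = pd.DataFrame({
    "date": pd.date_range("2025-01-01", periods=n, freq="D"),
    "units_sold": rng.integers(1, 50, size=n),
})
df["revenue"] = df["units_sold"] * 600.0
df = df.sample(frac=1, random_state=0).reset_index(drop=True)  # rows out of date order

# The simplest plot: one column, default settings.
# The x-axis is the row index (0, 1, 2, ...), NOT the date.
ax = df["revenue"].plot()
plt.show()
```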

What we see: a messy line, because the rows are not sorted by date and so the x-axis has no meaningful order yet.

Step 3 – Most useful first real plot: Time series line plot
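
One way to get a readable time series is sketched below, on hypothetical sales data (columns and values are my assumption). The three moves that matter are: sort by date, put the date in the index, and aggregate to a coarser period (weekly here) so daily noise does not hide the trend.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical sales data (an assumption, invented for illustration)
rng = np.random.default_rng(42)
n = 200
df = pd.DataFrame({
    "date": pd.date_range("2025-01-01", periods=n, freq="D"),
    "units_sold": rng.integers(1, 50, size=n),
})
df["revenue"] = df["units_sold"] * 600.0
df = df.sample(frac=1, random_state=0).reset_index(drop=True)  # rows out of date order

# 1) sort by date, 2) use the date as index, 3) sum revenue per week
ts = df.sort_values("date").set_index("date")
weekly = ts["revenue"].resample("W").sum()

ax = weekly.plot(figsize=(10, 4), marker="o", title="Weekly revenue")
ax.set_xlabel("Week")
ax.set_ylabel("Revenue")
ax.grid(True, alpha=0.3)   # light grid makes values easier to read
plt.show()
```

A common mistake is to skip the resample step and plot every single day: the line becomes so jagged that the trend is hard to see.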

Step 4 – Grouped line plot – compare regions
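
A possible sketch for comparing regions, again on hypothetical data (the region names and values are my assumption). The idea is to reshape the data so that each region becomes its own column; DataFrame.plot() then draws one line per column.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical sales data (an assumption, invented for illustration)
rng = np.random.default_rng(42)
n = 200
df = pd.DataFrame({
    "date": pd.date_range("2025-01-01", periods=n, freq="D"),
    "region": rng.choice(["North", "South", "East", "West"], size=n),
    "units_sold": rng.integers(1, 50, size=n),
})
df["revenue"] = df["units_sold"] * 600.0

# Weekly revenue per region: rows = weeks, columns = regions.
# fill_value=0 means "no sales recorded for that region in that week".
weekly_by_region = (
    df.groupby([pd.Grouper(key="date", freq="W"), "region"])["revenue"]
      .sum()
      .unstack("region", fill_value=0)
)

ax = weekly_by_region.plot(figsize=(10, 4), title="Weekly revenue by region")
ax.set_xlabel("Week")
ax.set_ylabel("Revenue")
ax.legend(title="Region")
plt.show()
```

With more than five or six lines this kind of plot gets hard to read; at that point small multiples (one subplot per region) usually work better.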

Step 5 – Bar plot – compare categories
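
A minimal sketch of a category comparison, assuming hypothetical products and prices. The pattern is always the same: aggregate first (groupby + sum or mean), sort the result, then plot the bars.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical sales data (an assumption, invented for illustration)
rng = np.random.default_rng(42)
n = 200
df = pd.DataFrame({
    "product": rng.choice(["Laptop", "Phone", "Tablet"], size=n),
    "units_sold": rng.integers(1, 50, size=n),
})
price = df["product"].map({"Laptop": 900, "Phone": 600, "Tablet": 350})
df["revenue"] = df["units_sold"] * price

# Aggregate, then sort so the biggest bar comes first
revenue_by_product = (
    df.groupby("product")["revenue"].sum().sort_values(ascending=False)
)

ax = revenue_by_product.plot.bar(rot=0, figsize=(7, 4),
                                 title="Total revenue by product")
ax.set_xlabel("Product")
ax.set_ylabel("Revenue")
plt.show()
```

Sorting the bars is a small step that makes the ranking readable at a glance; unsorted bars force the reader to compare heights one by one.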

Step 6 – Scatter plot – relationship between two variables
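
A short sketch of a scatter plot on hypothetical data (columns and values are my assumption). Each row becomes one dot, and a little transparency (alpha) helps when dots overlap. I also print the correlation so the picture and the number can be checked against each other.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical sales data (an assumption, invented for illustration)
rng = np.random.default_rng(42)
n = 200
df = pd.DataFrame({
    "product": rng.choice(["Laptop", "Phone", "Tablet"], size=n),
    "units_sold": rng.integers(1, 50, size=n),
})
price = df["product"].map({"Laptop": 900, "Phone": 600, "Tablet": 350})
df["revenue"] = df["units_sold"] * price

# One dot per sale; alpha < 1 makes overlapping dots visible
ax = df.plot.scatter(x="units_sold", y="revenue", alpha=0.6, figsize=(7, 5),
                     title="Units sold vs revenue")

# Pearson correlation between the two plotted columns
r = df["units_sold"].corr(df["revenue"])
print(f"Correlation: {r:.2f}")
plt.show()
```

In this made-up data the dots fan out along three rays, one per product price. A pattern like that is a hint that a third variable is involved and is worth encoding as color.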

Step 7 – Histogram & KDE – distribution of one variable
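
A sketch of a histogram with a KDE curve on top, using hypothetical data (my assumption). Two details matter: density=True puts the histogram on the same scale as the KDE, and pandas needs SciPy installed for .plot.kde(), so the sketch falls back to the histogram alone if SciPy is missing.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical sales data (an assumption, invented for illustration)
rng = np.random.default_rng(42)
n = 200
df = pd.DataFrame({
    "product": rng.choice(["Laptop", "Phone", "Tablet"], size=n),
    "units_sold": rng.integers(1, 50, size=n),
})
price = df["product"].map({"Laptop": 900, "Phone": 600, "Tablet": 350})
df["revenue"] = df["units_sold"] * price

# Histogram: density=True scales the bars so their total area is 1
ax = df["revenue"].plot.hist(bins=20, density=True, alpha=0.5, figsize=(8, 4),
                             title="Distribution of revenue per sale")

# KDE: a smoothed estimate of the same distribution (requires SciPy)
try:
    df["revenue"].plot.kde(ax=ax, linewidth=2)
except ImportError:
    print("SciPy is not installed, showing the histogram only")

ax.set_xlabel("Revenue")
plt.show()
```

Try a few values for bins (10, 20, 40): too few hides the shape, too many turns it into noise.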

Step 8 – Boxplot – compare distributions across categories
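
A minimal sketch using pandas' own DataFrame.boxplot on hypothetical data (my assumption). The by= argument draws one box per category. pandas adds an automatic "Boxplot grouped by ..." super-title, which I clear and replace with a normal title.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical sales data (an assumption, invented for illustration)
rng = np.random.default_rng(42)
n = 200
df = pd.DataFrame({
    "product": rng.choice(["Laptop", "Phone", "Tablet"], size=n),
    "units_sold": rng.integers(1, 50, size=n),
})
price = df["product"].map({"Laptop": 900, "Phone": 600, "Tablet": 350})
df["revenue"] = df["units_sold"] * price

fig, ax = plt.subplots(figsize=(7, 4))

# One box per product: box = middle 50% of sales, line inside = median
df.boxplot(column="revenue", by="product", ax=ax, grid=False)

fig.suptitle("")                          # remove pandas' automatic super-title
ax.set_title("Revenue per sale by product")
ax.set_xlabel("Product")
ax.set_ylabel("Revenue")
plt.show()

# The medians drawn in the plot, as numbers
print(df.groupby("product")["revenue"].median())
```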

Step 9 – Quick reference – most common plot types in pandas

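Goal                     | Code example                                                          | Best when
-------------------------|-----------------------------------------------------------------------|----------------------------------
Time series trend        | df['col'].plot()                                                      | Data has a datetime index
Compare categories       | df.groupby('cat')['val'].sum().plot.bar()                             | Few categories (≤ 10–12)
Relationship 2 variables | df.plot.scatter(x='a', y='b')                                         | Looking for correlation/pattern
Distribution / shape     | df['col'].plot.hist(bins=20)                                          | Understanding spread and shape
Compare distributions    | sns.boxplot(x='cat', y='num', data=df)                                | Many groups, want median/outliers
Correlation heatmap      | sns.heatmap(df.corr(numeric_only=True), annot=True, cmap='coolwarm')  | Many numeric variables

Note: sns is seaborn (import seaborn as sns). The last two rows use seaborn rather than pandas' built-in plotting, and numeric_only=True keeps df.corr() from failing on text columns.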

Step 10 – Your turn – small practice tasks

Try these on the sales DataFrame:

  1. Plot monthly total revenue (hint: resample('ME'); on pandas older than 2.2 the alias is 'M')
  2. Create a bar plot showing average customer rating by product
  3. Make a scatter plot of units_sold vs revenue, colored by discount_%
  4. Show boxplots of revenue by region

Which one would you like to try first? Or tell me what kind of plot you want to create with your own data — I’ll guide you step by step.

Where do you want to go next?

  • Styling plots better (titles, legends, colors, themes)
  • Subplots – multiple plots in one figure
  • Saving plots (png, pdf, high resolution)
  • Plotly or Seaborn advanced plots
  • Common mistakes & how to avoid ugly plots

Just say the word — we’ll continue slowly and practically. 😊
