Chapter 5: Data Visualization & Storytelling
This is Data Visualization & Storytelling, explained like we’re chilling in a Hyderabad café (maybe near Hi-Tech City), screens open, me walking you through code cells in Jupyter while sipping filter coffee. I’ll teach this chapter the way I’d teach a friend who’s serious about landing data roles in 2026: hands-on, with real examples, why each tool fits certain jobs, and honest pros/cons based on current trends.
In 2026, visualization isn’t just “pretty charts”—it’s how you convince stakeholders, spot insights fast, and ship interactive apps. Companies want quick prototypes (Streamlit wins here) + production-grade dashboards (Plotly/Dash shine). Storytelling turns data into decisions—forget fancy 3D pies; focus on clarity and narrative.
1. Matplotlib & Seaborn Basics
Matplotlib: The grandfather. Low-level, full control, but verbose. Seaborn and pandas' built-in plotting both build on it.
Seaborn: High-level wrapper on Matplotlib. Makes statistical plots look good in one line, and the default themes look modern.
Install (if not already):
```bash
pip install matplotlib seaborn
```
Basic Matplotlib example — Simple line plot for sales trend (imagine Hyderabad retail data)
```python
import matplotlib.pyplot as plt

months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun']
sales = [120000, 150000, 180000, 140000, 220000, 250000]

plt.figure(figsize=(8, 5))  # size in inches
plt.plot(months, sales, marker='o', linestyle='--', color='teal', linewidth=2)
plt.title('Hyderabad Retail Sales Trend - 2025', fontsize=14, pad=15)
plt.xlabel('Month', fontsize=12)
plt.ylabel('Sales (₹)', fontsize=12)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
```
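When you need finer control, the same chart can be built with Matplotlib's object-oriented interface, where you hold the Figure and Axes objects yourself. A minimal sketch reusing the sales numbers above; the lakh-based axis formatting and the output file name are my own choices for illustration:

```python
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter

months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun']
sales = [120000, 150000, 180000, 140000, 220000, 250000]

# subplots() hands back the Figure (canvas) and the Axes (plot area)
fig, ax = plt.subplots(figsize=(8, 5))
ax.plot(months, sales, marker='o', color='teal', linewidth=2)

# Show the y-axis in lakhs instead of raw rupees (1 lakh = 100,000)
ax.yaxis.set_major_formatter(FuncFormatter(lambda value, _pos: f'{value / 100000:.1f}'))

ax.set_title('Hyderabad Retail Sales Trend - 2025')
ax.set_xlabel('Month')
ax.set_ylabel('Sales (₹ lakh)')

# Drop the top and right borders for a cleaner look
for side in ('top', 'right'):
    ax.spines[side].set_visible(False)

fig.tight_layout()
fig.savefig('sales_trend_oo.png', dpi=150)  # in a notebook the figure also renders inline
```

Everything hangs off `ax`, which is what makes multi-panel figures and per-axis tweaks manageable later.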
Seaborn upgrade — Same data, but prettier + statistical feel
```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# months and sales come from the Matplotlib example above
df_sales = pd.DataFrame({'Month': months, 'Sales': sales})

sns.set_style("whitegrid")  # nice background
plt.figure(figsize=(8, 5))
sns.lineplot(data=df_sales, x='Month', y='Sales', marker='o', color='purple')
plt.title('Sales Trend with Seaborn Magic')
plt.show()
```
Common Seaborn plots (you’ll use these daily in EDA):
- sns.histplot() / sns.kdeplot() — distributions
- sns.boxplot() / sns.violinplot() — outliers & spread
- sns.countplot() — categorical counts
- sns.heatmap() — correlations
- sns.pairplot() — multivariate overview
- sns.scatterplot() + hue= — group by category
Example: Correlation heatmap (super common in ML feature selection)
```python
# Assume df is your cleaned Pandas DataFrame with numeric columns
corr = df.corr(numeric_only=True)

sns.heatmap(corr, annot=True, cmap='coolwarm', fmt='.2f', linewidths=0.5)
plt.title('Feature Correlation Heatmap')
plt.show()
```
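With many features the heatmap gets hard to read, so it can help to list the strongest pairs as a table too. A sketch on synthetic data; the column names and the relationship between them are assumptions made up for this example:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
area = rng.uniform(600, 2500, 200)
df = pd.DataFrame({
    'Area_sqft': area,
    'Price_Lakhs': area * 0.08 + rng.normal(0, 10, 200),  # strongly tied to area
    'Age_years': rng.uniform(0, 30, 200),                  # unrelated noise
})

corr = df.corr(numeric_only=True)

# Keep only the upper triangle so each pair appears once (diagonal excluded)
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))

# Long format, sorted by absolute correlation (strongest first)
pairs = (
    upper.stack()
         .dropna()
         .rename('corr')
         .reset_index()
         .rename(columns={'level_0': 'feature_a', 'level_1': 'feature_b'})
         .sort_values('corr', key=abs, ascending=False)
         .reset_index(drop=True)
)
print(pairs)
```

Sorting by absolute value keeps strong negative correlations near the top as well, which a plain descending sort would bury.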
Pro tip: Always start with Seaborn for EDA—it’s faster and looks professional without effort.
2. Advanced Visualization: Plotly & Altair
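Plotly: Interactive king in 2026. Zoom, hover, export to HTML. Integrates well with Dash/Streamlit.
Altair: Declarative (built on Vega-Lite). Concise syntax, great for layered/exploratory viz. By default it embeds the data in the chart spec and raises a MaxRowsError above 5,000 rows, so aggregate large data first or enable the VegaFusion data transformer.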
Both beat Matplotlib/Seaborn for interactivity.
Plotly Express (easiest way — like Seaborn for interactive)
```python
import plotly.express as px

# Same sales data
fig = px.line(df_sales, x='Month', y='Sales',
              title='Interactive Hyderabad Sales Trend',
              markers=True,
              template='plotly_dark')  # modern look

fig.update_layout(hovermode='x unified')  # nice hover
fig.show()  # opens interactive plot
```
Add hover data, animations, 3D scatter, maps (px.choropleth), etc.
Altair example — Declarative scatter with tooltips
```python
import altair as alt

chart = alt.Chart(df).mark_circle(size=60).encode(
    x='Age:Q',
    y='Salary:Q',
    color='City:N',
    tooltip=['Name', 'Age', 'Salary', 'City']
).properties(
    title='Salary vs Age by City in Hyderabad Region',
    width=600,
    height=400
).interactive()  # zoom/pan

chart  # last expression in a notebook cell renders the chart
```
When to choose:
- Plotly: Dashboards, production apps, maps, 3D, animations. A very common choice for interactive web viz in Python.
- Altair: Quick exploration in notebooks and layered/complex stats plots. For large datasets, pre-aggregate or enable VegaFusion, since the default limit is 5,000 rows.
3. Dashboarding: Streamlit / Dash (Simple Apps), Power BI / Tableau Intro
Streamlit: Fastest way to turn a Python script into a web app. In 2026, still a top pick for quick DS prototypes/internal tools.
Dash (by Plotly): More customizable, callback-based (React under the hood). Better for complex/enterprise dashboards.
Streamlit simple app example (save as app.py, then run: streamlit run app.py)
```python
import streamlit as st
import pandas as pd
import plotly.express as px

st.title("Hyderabad House Price Explorer 2026")

# Upload or sample data
uploaded = st.file_uploader("Upload your CSV", type="csv")
if uploaded:
    df = pd.read_csv(uploaded)
else:
    # Sample data
    df = pd.DataFrame({
        'Area_sqft': [1200, 1500, 800, 2000, 950],
        'Price_Lakhs': [85, 120, 55, 180, 68],
        'Location': ['Gachibowli', 'HiTech City', 'Kukatpally', 'Banjara Hills', 'Madhapur']
    })

fig = px.scatter(df, x='Area_sqft', y='Price_Lakhs',
                 color='Location', size='Price_Lakhs',
                 hover_name='Location',
                 title='Price vs Area by Location')

st.plotly_chart(fig, use_container_width=True)

st.write("### Summary Stats")
st.dataframe(df.describe())
```
Boom—interactive dashboard in <50 lines!
Dash — More control (callbacks for filters)
```python
from dash import Dash, dcc, html, Input, Output
import plotly.express as px

app = Dash(__name__)

app.layout = html.Div([
    dcc.Graph(id='graph'),
    dcc.Dropdown(id='location', options=[...], value='All')  # fill in your location options
])

@app.callback(Output('graph', 'figure'), Input('location', 'value'))
def update_graph(selected):
    # filter df by the selected location, then return the figure
    return px.scatter(...)  # placeholder: pass the filtered df and column names

if __name__ == '__main__':
    app.run(debug=True)  # app.run replaces the older app.run_server
```
Power BI / Tableau intro (no-code/low-code)
- Power BI: Microsoft ecosystem, free desktop version. Drag-drop, DAX for calcs, publish to Power BI Service.
- Tableau: Beautiful viz, strong storytelling. Tableau Public (free), Tableau Desktop (paid). Great for exec dashboards.
In India 2026: Many companies use Power BI (cheap + integrates with Azure/Excel). Learn basics—connect to SQL/Pandas export → build dashboard → share link.
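For the Pandas export route, a flat CSV is usually enough for either tool to import. A minimal sketch; the file name, the Listed_On column, and its dates are placeholders I added for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'Area_sqft': [1200, 1500, 800, 2000, 950],
    'Price_Lakhs': [85, 120, 55, 180, 68],
    'Location': ['Gachibowli', 'HiTech City', 'Kukatpally', 'Banjara Hills', 'Madhapur'],
    # Hypothetical listing dates
    'Listed_On': pd.to_datetime(['2025-01-05', '2025-01-12', '2025-02-01',
                                 '2025-02-15', '2025-03-03']),
})

# Tidy export: no index column, ISO dates, UTF-8 text
df.to_csv('house_prices_for_bi.csv', index=False,
          date_format='%Y-%m-%d', encoding='utf-8')

# Read it back to confirm the file looks the way a BI tool will see it
check = pd.read_csv('house_prices_for_bi.csv', parse_dates=['Listed_On'])
print(check.dtypes)
```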
Streamlit for Python-only teams/prototypes; Dash for complex; Power BI/Tableau for business users.
4. Principles of Good Data Storytelling
In 2026, storytelling = data + narrative + visuals to drive action. Not just charts—answer “So what?” and “Now what?”
Key principles (from experts like Brent Dykes, ThoughtSpot, Tableau best practices):
- Know your audience & goal: Exec? Quick insight + recommendation. Analyst? Deep details. Always ask: What decision does this support?
- Narrative structure (arc): Problem → Insight → Implication/Action. Example: “Hyderabad housing prices rose 18% YoY (problem: affordability crisis) → driven by IT jobs in Gachibowli (insight) → recommend investing in suburbs like Kompally (action).”
- Progressive disclosure: Start high-level (one key chart), then drill down. Avoid info overload.
- Choose the right chart: Match it to the question.
  - Trend → Line
  - Comparison → Bar/Column
  - Distribution → Histogram/Box
  - Correlation → Scatter/Heatmap
  - Part-to-whole → Stacked bar/Pie (sparingly)
- Minimize chartjunk: Remove heavy gridlines, excess labels, 3D effects. Use white space.
- Color wisely: 5–7 colors max. Use sequential palettes for gradients, diverging for +/-, categorical for groups. Accessibility: colorblind-friendly palettes (viridis, ColorBrewer's colorblind-safe schemes).
- Annotations & text: Titles that state the insight (“Prices spiked 25% after new metro line”). Add arrows/callouts for key points.
- Gestalt principles: Proximity (group related), Similarity (same color = same group), Continuity (lines guide the eye).
- Tell, don't decorate: Every element serves the story. Animations/transitions help guide the eye (e.g., Plotly animations for time series).
- Test & iterate: Show it to a colleague: “What do you see first? What action would you take?”
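Several of these principles fit into one small chart. The sketch below uses a made-up price index and a hypothetical metro-line event; it applies an insight-stating title, a single callout, muted gridlines, and one accent color from a colorblind-safe palette:

```python
import matplotlib.pyplot as plt

# Hypothetical price index (Jan = 100); the metro-line event is illustrative
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun']
index = [100, 101, 102, 118, 123, 125]
x = list(range(len(months)))
event = 3  # position of 'Apr', when the hypothetical metro line opened

fig, ax = plt.subplots(figsize=(8, 4.5))
ax.plot(x, index, color='lightgray', linewidth=2)  # context in gray
ax.plot(x[event - 1:], index[event - 1:], color='#D55E00', linewidth=3)  # accent on the story

# Title states the insight, computed from the data rather than typed by hand
change = (index[-1] - index[event - 1]) / index[event - 1] * 100
ax.set_title(f'Prices rose {change:.0f}% after the metro line opened', loc='left')

# One callout pointing at the moment that matters
ax.annotate('Metro line opens',
            xy=(event, index[event]),
            xytext=(event - 2.5, index[event] + 4),
            arrowprops={'arrowstyle': '->', 'color': 'black'})

ax.set_xticks(x)
ax.set_xticklabels(months)
ax.set_ylabel('Price index (Jan = 100)')
for side in ('top', 'right'):
    ax.spines[side].set_visible(False)
ax.grid(axis='y', alpha=0.2)  # faint horizontal guides only

fig.tight_layout()
fig.savefig('price_story.png', dpi=150)
```

The gray-plus-accent trick does most of the work: the reader's eye lands on the orange segment before reading a single label.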
Real example: In a churn dashboard → Don’t just show “Churn 12%”. Story: “Churn jumped from 8% to 12% after price hike → highest among <30 age group → recommend targeted loyalty discount to retain ₹X crore.”
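The numbers behind a story like this usually come from a simple group-by. A sketch with a tiny made-up customer table; the columns `period`, `age_group`, and `churned` are assumptions for illustration, and the rates are exaggerated because the sample is so small:

```python
import pandas as pd

# Made-up customers: six before the price hike, six after
customers = pd.DataFrame({
    'period':    ['before'] * 6 + ['after'] * 6,
    'age_group': ['<30', '<30', '30-45', '30-45', '45+', '45+'] * 2,
    'churned':   [0, 0, 0, 1, 0, 0,
                  1, 1, 0, 1, 0, 0],
})

# Overall churn rate (%) before vs after the price hike
overall = customers.groupby('period')['churned'].mean().mul(100)

# Churn rate (%) by age group after the hike, highest first
after = customers[customers['period'] == 'after']
by_age = (
    after.groupby('age_group')['churned']
         .mean()
         .mul(100)
         .sort_values(ascending=False)
)

top_group = by_age.index[0]
headline = (
    f"Churn rose from {overall['before']:.1f}% to {overall['after']:.1f}% "
    f"after the price hike, led by the {top_group} age group"
)
print(headline)
```

The headline string is exactly what should go into the chart title or the first line of the dashboard.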
That’s Chapter 5 — the bridge from analysis to impact!
Practice: Build a Streamlit app with your EDA from Chapter 4 (e.g., house prices or Titanic survival). Add title, key charts, insights in markdown.
