Chapter 2: Programming Foundations

This chapter explains Programming Foundations in full detail, just like we’re sitting together in Airoli with your laptop open, me guiding you line by line, chai in hand. I’ll talk like a patient teacher who’s been coding in Python for data science since before Pandas was cool. We’ll go slow, with lots of real examples (copy-paste ready), why each thing matters for data science in 2026, and small exercises you can try right now.

By the end of this chapter, you should be comfortable writing clean Python scripts, organizing code, handling mistakes gracefully, and using Git/GitHub to track your work like a pro. This is the foundation—everything in later chapters (Pandas, ML models, deployment) builds on this.

Why Python is Still the King for Data Science in 2026

Python dominates because:

  • Readable like English → you spend less time debugging syntax, more on thinking about data.
  • Huge ecosystem: NumPy, Pandas, Scikit-learn, PyTorch, Hugging Face — all mature.
  • Community + jobs: 90%+ of data science/ML roles in India ask for Python (LinkedIn/Naukri trends 2026).
  • In 2026, Python 3.13 is stable → better REPL (interactive shell), improved error messages, and an experimental free-threaded build (can speed up some CPU-bound multithreaded work). We’ll stick to 3.11–3.13 basics that are rock-solid for DS.

Setup tip (do this now if not done): Install Python 3.12 or 3.13 from python.org, or use Anaconda (best for data science → comes with Pandas, Jupyter, etc.) or Miniconda (lighter; you install the packages you need yourself). Use VS Code + Python extension or JupyterLab; both are free and excellent.

1. Variables, Data Types, Control Flow, Functions

Variables — boxes to store stuff. No need to declare type.

Python
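Here is a minimal sketch of variable assignment. The names and values (city, temperature_c, and so on) are made up for illustration.

```python
# Variables are created the moment you assign to them; no type declaration needed.
city = "Airoli"          # str
temperature_c = 31.5     # float
pin_code = 400708        # int
is_raining = False       # bool

# The same name can later point to a value of a different type (dynamic typing).
reading = 42
reading = "forty-two"

# type() tells you what a variable currently holds.
print(type(city).__name__, type(temperature_c).__name__)  # str float
print(f"{city} ({pin_code}): {temperature_c} C, raining: {is_raining}")
```

Rebinding a name to a different type is legal, but in data pipelines it usually hides bugs, so prefer one meaning per variable.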

Common data types in data science:

  • int, float — numbers
  • str — text
  • bool — True/False
  • list — ordered, changeable collection [1, 2, 3]
  • tuple — ordered, unchangeable (1, 2, 3)
  • dict — key-value pairs {"city": "Airoli", "pin": 400708}
  • set — unique items {1, 2, 3} (no duplicates)
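The collection types above can be tried out in a few lines. This is only a sketch; the variable names and the extra "state" key are assumptions for illustration.

```python
temps = [31.5, 29.0, 33.2]                   # list: ordered, changeable
temps.append(30.1)                           # lists can grow

point = (1, 2, 3)                            # tuple: ordered, unchangeable
first = point[0]                             # reading is fine; point[0] = 9 would raise TypeError

record = {"city": "Airoli", "pin": 400708}   # dict: look values up by key
record["state"] = "Maharashtra"              # add a new key-value pair

tags = {"hot", "humid", "hot"}               # set: the duplicate "hot" is dropped

print(len(temps), first, record["state"], len(tags))   # 4 1 Maharashtra 2
```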

Control flow — decisions and loops

Python
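A small sketch of if/elif/else inside a for loop, plus a while loop. The scores and grade cut-offs are invented for the example.

```python
scores = [82, 47, 65]
results = []

# for loop: visit each item; if / elif / else: pick exactly one branch
for score in scores:
    if score >= 75:
        grade = "Distinction"
    elif score >= 50:
        grade = "Pass"
    else:
        grade = "Fail"
    results.append((score, grade))

print(results)   # [(82, 'Distinction'), (47, 'Fail'), (65, 'Pass')]

# while loop: repeat as long as the condition stays true
attempts_left = 3
while attempts_left > 0:
    print("Attempts left:", attempts_left)
    attempts_left -= 1
```

Use for when you know what you are iterating over, and while when you only know the stopping condition.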

Functions — reusable blocks of code. Crucial for clean DS scripts.

Python
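Two small functions of the kind you end up writing in almost every DS script, as a sketch. The names average and normalize are my own choices, not part of any library.

```python
def average(values):
    """Return the mean of a list of numbers, or None for an empty list."""
    if not values:
        return None
    return sum(values) / len(values)


def normalize(values, lower=0.0, upper=1.0):
    """Rescale values linearly into [lower, upper] (min-max scaling)."""
    lo, hi = min(values), max(values)
    scaled = []
    for v in values:
        if lo == hi:                  # all values equal: avoid dividing by zero
            scaled.append(lower)
        else:
            scaled.append(lower + (v - lo) * (upper - lower) / (hi - lo))
    return scaled


print(average([10, 20, 30]))          # 20.0
print(normalize([10, 20, 30]))        # [0.0, 0.5, 1.0]
print(normalize([10, 20, 30], upper=100.0))   # default arguments can be overridden
```

Notice the docstrings and the default parameters: both make a function easier to reuse six months later.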

Quick exercise: Write a function is_hot_day(temp_c) that returns “Hot” if >30, “Warm” if 25–30, “Cool” otherwise. Test it.

2. OOP Basics, Modules, Error Handling

OOP (Object-Oriented Programming) — think of real-world things as objects with data + behavior.

Python
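A minimal class sketch showing data (attributes) and behavior (methods) living together. WeatherStation is a hypothetical example class, not something from a library.

```python
class WeatherStation:
    """Stores temperature readings (data) and summarizes them (behavior)."""

    def __init__(self, name):
        # __init__ runs when you create an object; self is that object.
        self.name = name
        self.readings = []

    def add_reading(self, temp_c):
        self.readings.append(temp_c)

    def average(self):
        if not self.readings:
            return None
        return sum(self.readings) / len(self.readings)


station = WeatherStation("Airoli")    # create an object (instance)
station.add_reading(30.0)
station.add_reading(32.0)
print(station.name, station.average())   # Airoli 31.0
```

The pattern is the same one you meet later in ML libraries: create an object, call methods on it, read results back.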

In data science: You’ll see classes in scikit-learn (e.g., model = RandomForestClassifier()), PyTorch models, custom pipelines.

Modules — files with code you can import (reuse across projects)

Create utils.py:

Python
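One possible utils.py, as a sketch. The two helper functions are illustrative assumptions; put whatever you reuse most into yours.

```python
# utils.py: small helpers you can import from other scripts.

def celsius_to_fahrenheit(temp_c):
    """Convert a temperature from Celsius to Fahrenheit."""
    return temp_c * 9 / 5 + 32


def clean_city_name(name):
    """Strip surrounding spaces and use title case."""
    return name.strip().title()


if __name__ == "__main__":
    # Runs only when you execute `python utils.py` directly, not on import.
    print(celsius_to_fahrenheit(30))       # 86.0
    print(clean_city_name("  airoli "))    # Airoli
```

The `if __name__ == "__main__":` guard lets the same file work both as an importable module and as a quick self-check script.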

Then in your main script:

Python
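A sketch of the importing side. To keep it runnable as a single file, the first few lines write a throwaway utils.py into a temporary folder and add that folder to sys.path. That setup is an assumption for the demo only: in a real project, utils.py simply sits next to your main script and you skip straight to the import lines.

```python
import sys
import tempfile
from pathlib import Path

# Demo setup only: create a tiny utils.py so this file runs on its own.
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "utils.py").write_text(
    "def celsius_to_fahrenheit(temp_c):\n"
    "    return temp_c * 9 / 5 + 32\n",
    encoding="utf-8",
)
sys.path.insert(0, str(demo_dir))

# What your main script actually contains:
import utils                                # import the whole module
from utils import celsius_to_fahrenheit     # or import just one name

print(utils.celsius_to_fahrenheit(30))      # 86.0
print(celsius_to_fahrenheit(100))           # 212.0
```

Both import styles call the same function; `import utils` keeps it obvious where a name came from, which helps in longer scripts.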

Popular built-in modules: math, random, datetime, os.

Error Handling — code crashes less (very important in DS pipelines)

Python
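A sketch of try/except for two situations you will hit constantly: bad values in data and a missing file. The function name parse_temperature and the file name no_such_file.csv are hypothetical.

```python
def parse_temperature(raw):
    """Convert a raw value like '31.5' to float; return None for bad data."""
    try:
        value = float(raw)
    except (TypeError, ValueError):
        # TypeError: raw is None; ValueError: raw is text like 'N/A'
        print(f"Skipping bad value: {raw!r}")
        return None
    else:
        # Runs only if no exception was raised.
        return value


raw_values = ["31.5", "N/A", None, "29"]
cleaned = []
for raw in raw_values:
    parsed = parse_temperature(raw)
    if parsed is not None:
        cleaned.append(parsed)
print(cleaned)   # [31.5, 29.0]

# Missing file: catch the specific exception instead of a bare `except:`.
try:
    with open("no_such_file.csv", encoding="utf-8") as f:
        header = f.readline()
except FileNotFoundError:
    header = None
    print("File not found, continuing without it.")
```

Catching specific exceptions matters: a bare `except:` would also swallow typos and Ctrl+C, which makes pipelines very hard to debug.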

In DS: Handle missing files, bad data, API errors gracefully.

3. File I/O, List/Dict Comprehensions, Lambda

File I/O — read/write files (CSV, JSON common in DS)

Python
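A sketch of writing and reading CSV and JSON with the standard library only. The file names and rows are invented, and everything happens in a temporary folder so nothing is left behind in your project.

```python
import csv
import json
import tempfile
from pathlib import Path

rows = [{"city": "Airoli", "temp_c": 31.5}, {"city": "Thane", "temp_c": 30.0}]

with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "temps.csv"
    json_path = Path(tmp) / "summary.json"

    # Write a CSV file; `with` closes the file even if an error occurs.
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["city", "temp_c"])
        writer.writeheader()
        writer.writerows(rows)

    # Read it back; note that every CSV value comes back as a string.
    with open(csv_path, newline="", encoding="utf-8") as f:
        loaded = list(csv.DictReader(f))

    temps = []
    for row in loaded:
        temps.append(float(row["temp_c"]))

    # Write and read JSON.
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump({"count": len(temps), "max_temp_c": max(temps)}, f, indent=2)
    with open(json_path, encoding="utf-8") as f:
        summary = json.load(f)

print(loaded[0])   # {'city': 'Airoli', 'temp_c': '31.5'}
print(summary)     # {'count': 2, 'max_temp_c': 31.5}
```

In later chapters Pandas does most of this in one call, but knowing the plain version helps when a file is too odd for Pandas to parse.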

List comprehensions — short, fast way to create lists (very Pythonic for DS)

Python
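A sketch comparing a plain loop with the equivalent list comprehension, then filtering and labeling. The temperatures are made-up sample data.

```python
temps_c = [24, 28, 33, 31]

# Loop version: four lines
temps_f_loop = []
for t in temps_c:
    temps_f_loop.append(t * 9 / 5 + 32)

# Comprehension version: same result in one line
temps_f = [t * 9 / 5 + 32 for t in temps_c]

# Filter: keep only items that pass a condition
hot_days = [t for t in temps_c if t > 30]

# Transform with a condition: note the if/else comes BEFORE the for
labels = ["hot" if t > 30 else "ok" for t in temps_c]

print(temps_f == temps_f_loop)   # True
print(hot_days)                  # [33, 31]
print(labels)                    # ['ok', 'ok', 'hot', 'hot']
```

If a comprehension needs more than one condition and one transformation, a normal loop is usually easier to read.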

Dict comprehensions

Python
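The same idea for dictionaries, as a sketch. The city names and temperatures are sample values for illustration.

```python
cities = ["Airoli", "Thane", "Vashi"]
temps_c = [31.5, 30.0, 32.1]

# Build a dict from two parallel lists
city_temps = {city: temp for city, temp in zip(cities, temps_c)}

# Filter an existing dict
hot_cities = {city: temp for city, temp in city_temps.items() if temp > 31}

# Compute a value for each key
name_lengths = {city: len(city) for city in cities}

print(city_temps)     # {'Airoli': 31.5, 'Thane': 30.0, 'Vashi': 32.1}
print(hot_cities)     # {'Airoli': 31.5, 'Vashi': 32.1}
print(name_lengths)   # {'Airoli': 6, 'Thane': 5, 'Vashi': 5}
```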

Lambda — anonymous (one-line) functions. Super useful in sorting, Pandas apply, etc.

Python
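A sketch of lambdas where they are most useful: as a short key or transform passed to another function. The records are sample data.

```python
records = [("Airoli", 31.5), ("Thane", 30.0), ("Vashi", 32.1)]

# Sort by the second item of each tuple (temperature), hottest first
by_temp = sorted(records, key=lambda rec: rec[1], reverse=True)

# Pick the single hottest record
hottest = max(records, key=lambda rec: rec[1])

# Apply a small transformation to every item
temps_f = list(map(lambda rec: rec[1] * 9 / 5 + 32, records))

print(by_temp)   # [('Vashi', 32.1), ('Airoli', 31.5), ('Thane', 30.0)]
print(hottest)   # ('Vashi', 32.1)

# The same shape appears later with Pandas: series.apply(lambda x: ...)
```

If a lambda grows beyond one simple expression, give it a name with def instead; you get a docstring and clearer tracebacks.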

4. Version Control — Git & GitHub (Branching, Pull Requests, Collaboration)

Why Git for data scientists in 2026? Notebooks change a lot → track experiments, revert bad changes, collaborate, show portfolio on GitHub.

Step-by-step setup (do this today):

  1. Install Git → https://git-scm.com
  2. Create free GitHub account
  3. Configure:
    Bash
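The configuration step usually looks like the sketch below. "Your Name" and you@example.com are placeholders; use the name and the email address tied to your GitHub account.

```shell
# Tell Git who you are; these values are stamped on every commit you make.
git config --global user.name "Your Name"
git config --global user.email "you@example.com"

# Optional: name the first branch of new repositories "main" (Git 2.28 or newer).
git config --global init.defaultBranch main

# Check what is set.
git config --global --list
```

You only do this once per machine; --global stores the settings for all your repositories.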

Basic workflow (local → GitHub):

Bash
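A sketch of the local-to-GitHub flow. The folder name python-practice, the file hello.py, and the repository URL are placeholders (replace your-username with your own account). The push line is left commented because it only works once the empty repository exists on GitHub and you are logged in.

```shell
# Start a project folder and turn it into a Git repository.
mkdir -p python-practice
cd python-practice
git init

# If you skipped the global config step, set your identity for this repo only.
git config user.name "Your Name"
git config user.email "you@example.com"

# Create a file, check the status, stage the file, and commit it.
echo 'print("Hello, Git")' > hello.py
git status
git add hello.py
git commit -m "Add hello script"

# Name the branch "main" and link the folder to your GitHub repository.
git branch -M main
git remote add origin https://github.com/your-username/python-practice.git

# Upload your commits (needs network access and a GitHub login):
# git push -u origin main
```

After the first push with -u, a plain `git push` is enough for later commits on the same branch.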

Branching (super important for experiments)

Bash
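A sketch of an experiment branch from start to finish. So that it runs on its own, the first block builds a small scratch repository named branch-demo (an assumption for the demo); in your real project you would start at the "Create a branch" step. The file model_notes.txt is also just an example.

```shell
# Demo setup only: a scratch repository with one commit on main.
mkdir -p branch-demo
cd branch-demo
git init -q
git config user.name "Your Name"
git config user.email "you@example.com"
echo "baseline" > model_notes.txt
git add model_notes.txt
git commit -q -m "Add baseline notes"
git branch -M main

# Create a branch for the experiment and switch to it.
# (Git 2.23 or newer also offers: git switch -c feature/churn-model)
git checkout -b feature/churn-model

# Work and commit on the branch; main stays untouched.
echo "try XGBoost" >> model_notes.txt
git commit -am "Note XGBoost experiment"

# List branches; the * marks the one you are on.
git branch

# Happy with the experiment? Go back to main and merge it in.
git checkout main
git merge feature/churn-model

# Delete the branch once it is merged.
git branch -d feature/churn-model
```

If the experiment fails, you just switch back to main and delete the branch with `git branch -D`; main never saw the changes.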

Pull Request (PR) on GitHub — for collaboration/review

  1. Push branch → git push origin feature/churn-model
  2. Go to GitHub → repo → Pull requests → New pull request
  3. Compare branches → create PR → add description (“Added XGBoost churn model, 0.82 AUC”)
  4. Someone reviews → approve → merge

Collaboration example: Friend clones your repo

Bash
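You can rehearse the whole collaboration loop offline before trying it with a real friend. In this sketch a local bare repository (hub.git) stands in for GitHub, and the folders you and friend stand in for two laptops; all of these names are assumptions for the demo. With GitHub, the clone URL would look like https://github.com/your-username/your-repo.git, and the merge step would happen through a pull request in the browser.

```shell
# Demo setup: a bare repo plays the role of GitHub.
mkdir -p collab-demo
cd collab-demo
git init -q --bare hub.git
git --git-dir=hub.git symbolic-ref HEAD refs/heads/main

# You: clone, make the first commit, push main.
git clone -q hub.git you
cd you
git config user.name "You"
git config user.email "you@example.com"
echo "# churn project" > README.md
git add README.md
git commit -q -m "Add README"
git branch -M main
git push -q -u origin main
cd ..

# Friend: clone the repo, work on a branch, push the branch.
git clone -q hub.git friend
cd friend
git config user.name "Friend"
git config user.email "friend@example.com"
git checkout -q -b fix/readme-setup
echo "Setup notes go here." >> README.md
git commit -q -am "Add setup notes to README"
git push -q -u origin fix/readme-setup
cd ..

# You: fetch the friend's branch, review it, merge it, publish the result.
# (On GitHub this is "open PR, review, click Merge".)
cd you
git fetch -q origin
git log --oneline main..origin/fix/readme-setup
git merge -q origin/fix/readme-setup
git push -q origin main
```

The key habit: nobody commits straight to main on a shared repo. Each person pushes a branch, and main only changes through a reviewed merge.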

Pro tips for DS in 2026:

  • Add .gitignore (ignore large data files, __pycache__, .env)
  • Use Git LFS for big models/datasets
  • Commit often, in small steps with short, clear messages
  • Use GitHub Actions for auto-testing notebooks (advanced but cool)
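A starting .gitignore for a typical DS project might look like the sketch below. The folder names data/ and .venv/ are assumptions about your layout; adjust them to match your project.

```text
# Python caches
__pycache__/
*.pyc

# Virtual environment (folder name is an assumption)
.venv/

# Secrets and API keys
.env

# Jupyter checkpoints
.ipynb_checkpoints/

# Large or raw data (folder name is an assumption)
data/
```

Save it as a file named .gitignore in the root of the repository and commit it like any other file.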

Quick exercise: Create a folder python-practice, init git, make a branch experiment-1, write a small script, commit, push to GitHub, create PR from branch to main.

That’s the full Chapter 2! You now have the tools to write structured, reusable Python code that handles errors gracefully, and to version it properly.
