Chapter 12: Model Deployment & MLOps Basics (2025–2026 must-have)

Model Deployment & MLOps Basics (2025–2026 must-have), explained like we’re wrapping up our long journey together in Airoli — it’s evening now (January 29, 2026, around 5:43 PM IST), the street lights are on outside, your laptop fan is humming, and we’re finally moving from notebooks to real-world impact. This is the chapter that turns your churn model (or any project) from a cool Jupyter experiment into something stakeholders can actually use 24/7.

In 2026 India (especially Mumbai/Navi Mumbai/Hyderabad fintech, e-commerce, telecom), companies expect juniors to know basics of deployment + monitoring — not just train models. Pure notebook work is entry-level; production thinking (drift, APIs, cost, reproducibility) gets you mid-level interviews and better pay. MLOps is no longer optional — it’s table stakes.

We’ll use our Telco Churn XGBoost/RF model from earlier chapters as the running example.

1. Saving & Loading Models (joblib, pickle)

Models are just Python objects — save them so you can load later (inference, deployment).

joblib — preferred for scikit-learn/XGBoost/LightGBM (faster, better with large NumPy arrays).

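A minimal sketch of the joblib save/load cycle — a small RandomForest trained on synthetic data stands in for the Telco churn model, and the churn_model.joblib filename is an assumption:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
import joblib

# Stand-in model (in our project this would be the Telco churn RF/XGBoost model).
X, y = make_classification(n_samples=200, n_features=5, random_state=42)
model = RandomForestClassifier(n_estimators=50, random_state=42).fit(X, y)

# Save to disk, then load it back for inference.
joblib.dump(model, "churn_model.joblib")
loaded = joblib.load("churn_model.joblib")
print(loaded.predict(X[:3]))
```

The loaded object is a fully working estimator — same predictions, ready for an API.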

pickle — built into Python, works everywhere, but slower for large models and unsafe when loading files from untrusted sources.

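The pickle equivalent (model and filename are again stand-ins):

```python
import pickle
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

# Stand-in model for the churn classifier.
X, y = make_classification(n_samples=100, n_features=4, random_state=0)
model = LogisticRegression().fit(X, y)

# pickle is built in, but never unpickle files you don't trust.
with open("churn_model.pkl", "wb") as f:
    pickle.dump(model, f)
with open("churn_model.pkl", "rb") as f:
    loaded = pickle.load(f)
```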

2026 tip: Use joblib for scikit-learn family, torch.save / safetensors for PyTorch/LLMs (safer, faster). Add requirements.txt (pip freeze > requirements.txt) and model card (README with metrics, usage).

2. Flask / FastAPI for Simple APIs

Serve predictions via HTTP API — frontend/mobile calls it.

Flask — lightweight, quick for MVPs (still used in 2026 for simple stuff).

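A minimal Flask sketch — the /predict route, the "features" request field, and the inline stand-in model (in a real app you'd joblib.load your saved churn model) are all assumptions:

```python
from flask import Flask, request, jsonify
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

# Stand-in model; in practice: model = joblib.load("churn_model.joblib")
X, y = make_classification(n_samples=100, n_features=4, random_state=0)
model = LogisticRegression().fit(X, y)

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json()  # e.g. {"features": [0.1, 0.2, 0.3, 0.4]}
    pred = model.predict([payload["features"]])[0]
    return jsonify({"churn": int(pred)})

if __name__ == "__main__":
    app.run(port=5000)
```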

Run: python app.py → test with Postman/cURL: POST http://localhost:5000/predict

FastAPI (2026 winner for ML serving — async, auto Swagger docs, type hints, 3–10x faster than Flask for concurrency).


Auto docs at http://localhost:8000/docs — huge win for teams. FastAPI vs Flask 2026: FastAPI for production/ML APIs (async, Pydantic validation, OpenAPI). Flask for quick prototypes or when you need Jinja templates.

3. Streamlit / Gradio Demos

Quick interactive UIs — no frontend needed.

Streamlit (Python-only, super fast for DS demos).


Run: streamlit run app.py → shareable link.

Gradio — similar, great for Hugging Face Spaces (share ML demos publicly).


Choice: Streamlit for full apps/dashboards; Gradio for quick model demos (esp. NLP/CV).

4. Docker Basics

Docker = packages app + dependencies → runs identically everywhere (laptop → server → cloud).

Why for DS in 2026? Reproducibility (no “works on my machine”), easy scaling, cloud deployment.

Step-by-step basics:

  1. Install Docker Desktop (Windows/Mac) or Docker Engine (Linux).
  2. Create a Dockerfile in the project root.
  3. Pin dependencies in requirements.txt (pip freeze > requirements.txt).
  4. Build the image with docker build.
  5. Run the container with docker run, publishing the API port.
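Put together, the steps might look like this — a minimal Dockerfile for the FastAPI churn API; the file names, Python version, and churn-api image tag are all assumptions:

```dockerfile
# Minimal image for the FastAPI churn API
FROM python:3.11-slim

WORKDIR /app

# Install pinned dependencies first so this layer caches between builds
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the app code and saved model
COPY app.py churn_model.joblib ./

EXPOSE 8000
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]

# Build and run (from the project root):
#   docker build -t churn-api .
#   docker run -p 8000:8000 churn-api
```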

Access http://localhost:8000/docs

Pro tip: Use multi-stage builds for smaller images; add .dockerignore (ignore data/, notebooks/).

5. MLflow or Weights & Biases Intro

MLflow (open-source, free, Databricks-backed) — tracks experiments, logs models, registry, serving.

Quickstart:


UI: mlflow ui → localhost:5000 — compare runs, register model.

Weights & Biases (W&B) — cloud-first, beautiful UI, collaboration, sweeps (hyperparam tuning).


Free tier is generous; teams love it for sweeps + reports.

Choice 2026: MLflow if open-source/self-hosted; W&B if you want zero-setup UI + collaboration.

6. Cloud Platforms Overview (AWS SageMaker, GCP Vertex AI, Azure ML)

AWS SageMaker (market leader ~34% 2025–2026) — full MLOps, notebooks, training, endpoints, monitoring. Strong in custom algos, Inferentia chips for cheap inference.

GCP Vertex AI (~22%) — intuitive UI, AutoML strong, TPU for fast training, integrates BigQuery. Great for data-heavy/NLP.

Azure ML (~29%) — best if Microsoft stack (Teams, Power BI, Purview governance). Confidential computing, strong regulated industries.

Quick comparison (2026 vibes):

Aspect              | AWS SageMaker                | GCP Vertex AI                     | Azure ML
--------------------|------------------------------|-----------------------------------|------------------------------
Market Share (2025) | ~34%                         | ~22%                              | ~29%
Best For            | Scale, custom, AWS ecosystem | AutoML, data integration, UI      | Microsoft shops, compliance
Training Speed      | GPU/Trainium good            | TPU fastest for some              | Good, but less specialized
Cost                | Savings Plans up to 64% off  | Pay-per-use, can be high for data | Complex, but enterprise deals
Ease for Beginners  | Steep (many services)        | Most intuitive                    | Good notebooks
GenAI/LLM           | Bedrock + SageMaker          | Gemini + Vertex                   | OpenAI + Azure AI

For you in Airoli 2026: Start with free tier (GCP Vertex or Azure) — easy notebooks. If job at AWS-heavy company → SageMaker. All have free credits.

Final Project Tip: Dockerize your FastAPI churn API → push to Docker Hub → deploy to Render/Fly.io (free tier) or cloud (SageMaker endpoint). Track with MLflow/W&B. Add Streamlit frontend. Boom — production portfolio piece!

That’s Chapter 12 — you’re now job-ready for deployment!
