Case study · Production MLOps

Fitness MLOps Platform

A live, production-shaped stack rather than a docker-compose demo: inference on Kubernetes, model lifecycle with MLflow, RAG with persisted vectors, autoscaling under load, and dashboards that hold up to a senior engineer's review.

🚀 Production-ready · 📊 Grafana live · ☸️ EKS + HPA · 🔄 MLflow registry

Architecture

Vercel ▲ Next.js 15 → EKS ⚡ FastAPI → 📈 XGBoost + Chroma RAG → 📊 Grafana + Prometheus

Production signals

HPA: 3 → 20 pods under load

MLflow model versions + registry workflow

Lambda cron retraining pipeline

StatefulSets for ChromaDB persistence

Dashboards: CPU, memory, predictions/sec
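
The 3 → 20 pod range listed above follows from how the Kubernetes Horizontal Pod Autoscaler sizes a workload: it scales the current replica count by the ratio of observed to target metric, rounds up, and clamps to the configured bounds. A minimal Python sketch of that documented formula, assuming a 60% CPU utilization target (the target value, the function name, and the sample readings are illustrative assumptions, not values taken from this project's manifests):

```python
import math


def desired_replicas(
    current_replicas: int,
    current_cpu_pct: float,
    target_cpu_pct: float = 60.0,  # assumed target, not from the project manifests
    min_replicas: int = 3,         # lower bound from "3 → 20 pods"
    max_replicas: int = 20,        # upper bound from "3 → 20 pods"
    tolerance: float = 0.1,        # Kubernetes default tolerance around ratio 1.0
) -> int:
    """Approximate the HPA replica calculation for one CPU metric."""
    ratio = current_cpu_pct / target_cpu_pct
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to target: keep the current count (still clamped).
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # Hypothetical readings: 3 pods at 90% CPU against a 60% target.
    print(desired_replicas(3, 90.0))   # ceil(3 * 1.5) = 5
    # 15 pods at 90% would ask for 23, which is capped at the max of 20.
    print(desired_replicas(15, 90.0))
```

The real controller also applies stabilization windows and handles missing metrics and not-yet-ready pods, which this sketch leaves out.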

Repo layout (concept)

fitness-mlops/
├── frontend/            # Next.js 15
├── backend/             # FastAPI + XGBoost
├── k8s/                 # Deployments, HPA, Services
├── mlflow/              # Model registry
├── lambda/              # Retraining cron
├── monitoring/          # Grafana + Prometheus
└── .github/workflows/   # CI → ECR

Grafana-style targets

☸️ Pods (HPA): 3 active → 20 max

🖥️ CPU: ~60% avg · 90% peak

⚡ Predictions/sec: 15-45

🔄 MLflow models: 3 versions

✅ Uptime: 99.8%
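
Two of the targets above are simple arithmetic over raw signals, and it can help to see what they imply. The sketch below, under stated assumptions, derives a predictions/sec figure from two readings of a monotonically increasing counter and converts an uptime percentage into a downtime budget. The function names, the sample counter values, and the 30-day window are hypothetical. Prometheus's own `rate()` additionally handles counter resets and extrapolation, which this simplification ignores.

```python
def per_second_rate(samples: list[tuple[float, float]]) -> float:
    """Average per-second increase between the first and last sample.

    `samples` holds (timestamp_seconds, counter_value) pairs, oldest first.
    Counter resets are NOT handled in this simplified version.
    """
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must span a positive time interval")
    return (v1 - v0) / (t1 - t0)


def downtime_budget_minutes(uptime_pct: float, window_days: int = 30) -> float:
    """Minutes of allowed downtime for a given uptime percentage and window."""
    window_minutes = window_days * 24 * 60
    return (100.0 - uptime_pct) / 100.0 * window_minutes


if __name__ == "__main__":
    # Hypothetical counter readings taken 60 seconds apart.
    readings = [(0.0, 100.0), (60.0, 1900.0)]
    print(per_second_rate(readings))        # 1800 predictions / 60 s = 30.0
    # 99.8% uptime over an assumed 30-day window: roughly 86.4 minutes.
    print(round(downtime_budget_minutes(99.8), 1))
```

Read this way, a 99.8% uptime target leaves roughly an hour and a half of downtime per 30 days, and a steady 30 predictions/sec sits in the middle of the 15-45 range.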

Want something similar in your product org?
