Case study · Production MLOps
Fitness MLOps Platform
A live, production-shaped stack rather than a docker-compose demo: inference on Kubernetes, model lifecycle with MLflow, RAG with persisted vectors, autoscaling under load, and dashboards you can read like a senior review.
Production-ready · Grafana live · EKS + HPA · MLflow registry
Architecture
Next.js 15 (Vercel) → FastAPI (EKS) → XGBoost + Chroma RAG → Grafana + Prometheus
Production signals
HPA: 3 → 20 pods under load
MLflow model versions + registry workflow
Lambda cron retraining pipeline
StatefulSets for ChromaDB persistence
Dashboards: CPU, memory, predictions/sec
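To make the 3 → 20 pod range concrete, here is a minimal sketch of the scaling rule the Kubernetes HPA documents: desired replicas = ceil(current replicas × current metric / target metric), clamped to the configured bounds. The 60% CPU target is an assumption for illustration (the page does not state the configured target), the 0.1 tolerance is the Kubernetes default, and the function name is hypothetical.

```python
import math


def desired_replicas(current_replicas, current_cpu, target_cpu=60.0,
                     min_replicas=3, max_replicas=20, tolerance=0.1):
    """Approximate the HPA replica calculation for one CPU metric.

    target_cpu=60.0 is an assumed target, not a value taken from this project.
    """
    ratio = current_cpu / target_cpu
    if abs(ratio - 1.0) <= tolerance:
        # Within tolerance of the target: the HPA leaves the count unchanged.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Clamp to the min/max bounds (3 and 20 in this stack).
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    print(desired_replicas(3, 90.0))   # scale up from the floor
    print(desired_replicas(15, 90.0))  # capped at the 20-pod ceiling
    print(desired_replicas(5, 20.0))   # scale down, held at the 3-pod floor
```

The real controller also applies stabilization windows and handles missing metrics and not-yet-ready pods, which this sketch leaves out.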
Repo layout (concept)
fitness-mlops/
├── frontend/            # Next.js 15
├── backend/             # FastAPI + XGBoost
├── k8s/                 # Deployments, HPA, Services
├── mlflow/              # Model registry
├── lambda/              # Retraining cron
├── monitoring/          # Grafana + Prometheus
└── .github/workflows/   # CI → ECR
Grafana-style targets
Pods (HPA): 3 active → 20 max
CPU: ~60% avg · 90% peak
Predictions/sec: 15-45
MLflow models: 3 versions
Uptime: 99.8%
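As a rough reading of the uptime figure, the sketch below converts an uptime percentage into an allowed-downtime budget. The 30-day window is an assumption (the page does not say over what period the 99.8% was measured), and the function name is hypothetical.

```python
def downtime_minutes(uptime_percent, window_days=30):
    """Minutes of downtime implied by an uptime percentage over a window.

    window_days=30 is an assumed measurement window, not stated in the source.
    """
    total_minutes = window_days * 24 * 60
    return total_minutes * (100.0 - uptime_percent) / 100.0


if __name__ == "__main__":
    # 99.8% over 30 days works out to roughly 86.4 minutes of downtime.
    print(round(downtime_minutes(99.8), 1))
```

Under that assumption, 99.8% leaves a budget of roughly an hour and a half per month.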