Case study · Production MLOps

Fitness MLOps Platform

A live, production-shaped stack: inference on Kubernetes, model lifecycle management with MLflow, RAG with persisted vectors, autoscaling under load, and dashboards built to hold up in a senior engineering review rather than a docker-compose demo.

🚀 Production-ready · 📊 Grafana live · ☸️ EKS + HPA · 🔄 MLflow registry

Architecture

Vercel ▲ Next.js 15 → EKS ⚡ FastAPI → 📈 XGBoost + Chroma RAG → 📊 Grafana + Prometheus

▲ Next.js 15 (Vercel)

⚡ FastAPI (EKS)

📈 XGBoost + RAG

📊 Grafana / Prometheus

Production signals

HPA: 3 → 20 pods under load

MLflow model versions + registry workflow

Lambda cron retraining pipeline

StatefulSets for ChromaDB persistence

Dashboards: CPU, memory, predictions/sec
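
The "3 → 20 pods" signal follows the replica rule in the Kubernetes HPA documentation: desired replicas = ceil(current replicas × current metric / target metric), clamped to the configured bounds. Below is a minimal Python sketch of that arithmetic, using the 3 and 20 pod bounds listed above. The 60% CPU target and the function name are assumptions for illustration, not values taken from this platform's manifests.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_cpu_pct: float,
    target_cpu_pct: float = 60.0,  # ASSUMED target; the real HPA spec may differ
    min_replicas: int = 3,         # lower bound from "3 -> 20 pods"
    max_replicas: int = 20,        # upper bound from "3 -> 20 pods"
    tolerance: float = 0.1,        # Kubernetes' default tolerance band
) -> int:
    """Approximate the HPA replica calculation for a CPU utilization metric."""
    ratio = current_cpu_pct / target_cpu_pct
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current count (within bounds).
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    print(desired_replicas(3, 90.0))    # 3 pods at 90% CPU -> 5
    print(desired_replicas(10, 180.0))  # would be 30, capped at 20
```

Under these assumptions, a 90% CPU peak on 3 pods asks for 5 replicas, and no load level can push the deployment past 20.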

Repo layout (concept)

fitness-mlops/
├── frontend/     # Next.js 15
├── backend/      # FastAPI + XGBoost
├── k8s/          # Deployments, HPA, Services
├── mlflow/       # Model registry
├── lambda/       # Retraining cron
├── monitoring/   # Grafana + Prometheus
└── .github/workflows/  # CI → ECR

Grafana-style targets

☸️ Pods (HPA): 3 active → 20 max

🖥️ CPU: ~60% avg · 90% peak

⚡ Predictions/sec: 15–45

🔄 MLflow models: 3 versions

✅ Uptime: 99.8%
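
As a rough worked example of what these targets imply, the arithmetic below converts the 99.8% uptime figure into a downtime budget and the 15 to 45 predictions/sec range into daily volume. This is a sketch only: the 30-day window and the function names are assumptions, and the inputs are the target values listed above rather than measured results.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def downtime_budget_minutes(uptime_pct: float, window_days: int = 30) -> float:
    """Minutes of allowed downtime in the window for a given uptime target."""
    window_minutes = window_days * 24 * 60
    return window_minutes * (100.0 - uptime_pct) / 100.0


def daily_predictions(rate_per_sec: float) -> int:
    """Predictions served per day at a sustained per-second rate."""
    return round(rate_per_sec * SECONDS_PER_DAY)


if __name__ == "__main__":
    # ASSUMED 30-day window; 99.8% leaves roughly 86 minutes of downtime.
    print(f"{downtime_budget_minutes(99.8):.1f} min per 30 days")
    # Low and high ends of the predictions/sec target range.
    print(daily_predictions(15), daily_predictions(45))
```

At a sustained 15 to 45 predictions/sec, the service would handle roughly 1.3 to 3.9 million predictions per day.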

Want something similar in your product org?

Or reach me directly:

dimitar@petrov.build · LinkedIn

More projects in the portfolio →