Open Source ML Model Benchmarking Platform. Compare, evaluate, and benchmark machine learning models with Docker sandboxing, real-time leaderboards, and a powerful REST API.
A comprehensive platform for fair, reproducible, and secure model evaluation — from upload to leaderboard.
Benchmark models from PyTorch, TensorFlow, scikit-learn, XGBoost, LightGBM, and more. Upload your model and let the platform handle the rest.
Every benchmark runs in an isolated Docker container. Runs cannot interfere with each other or with the host, and each one starts from the same clean environment, which keeps results reproducible across runs.
Instantly compare models side-by-side with live-updating leaderboards. Sort by accuracy, F1, latency, throughput, or any custom metric.
Track accuracy, precision, recall, F1, AUC-ROC, inference time, memory usage, and more. Rich visualizations powered by Chart.js.
Full-featured API built on FastAPI with automatic OpenAPI docs. Integrate benchmarking into your CI/CD pipeline or custom tooling.
Celery-powered async task queue with Redis backend. Submit benchmarks and get notified when results are ready — no blocking.
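To make the metric and leaderboard features above concrete, here is a minimal sketch of how binary classification metrics can be computed and how entries might be ranked by any metric. It is not the platform's actual implementation: the function names `classification_metrics` and `rank_leaderboard`, the entry layout, and the model names are all assumptions made for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)

    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def rank_leaderboard(entries, metric, higher_is_better=True):
    """Return entries sorted by one metric (hypothetical entry layout)."""
    return sorted(
        entries,
        key=lambda entry: entry["metrics"][metric],
        reverse=higher_is_better,
    )


if __name__ == "__main__":
    # Illustrative data only; these are not real benchmark results.
    board = [
        {"model": "model-a", "metrics": {"f1": 0.91, "latency_ms": 12.0}},
        {"model": "model-b", "metrics": {"f1": 0.88, "latency_ms": 4.5}},
    ]
    fastest_first = rank_leaderboard(board, "latency_ms", higher_is_better=False)
    print([entry["model"] for entry in fastest_first])
```

Metrics such as latency are better when lower, which is why the sort direction is a parameter rather than fixed.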
Clone, configure, and launch your benchmarking platform with just a few commands.
Get the source code from GitHub and navigate into the project directory.
Copy the example .env file and set your database URL, Redis connection, and secret key.
Spin up the entire stack — API, worker, Redis, and database — with a single command.
Open the dashboard at http://localhost:8000, upload a model, select a dataset, and start a benchmark.
# Clone the repository
git clone https://github.com/kartheekbvs/openbenchml.git
cd openbenchml

# Configure environment
cp .env.example .env

# Launch the full stack
docker-compose up --build -d

# Access the platform
# Dashboard: http://localhost:8000
# API Docs: http://localhost:8000/docs
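Before launching the stack, it can help to confirm that the copied .env file actually sets the database URL, Redis connection, and secret key. The sketch below shows one way to check this. The variable names `DATABASE_URL`, `REDIS_URL`, and `SECRET_KEY` are assumptions; use whatever names appear in the project's .env.example.

```python
from pathlib import Path

# Hypothetical variable names; check .env.example for the real ones.
REQUIRED_KEYS = ("DATABASE_URL", "REDIS_URL", "SECRET_KEY")


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values such as URLs may contain "=".
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or set to an empty value."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    if env_path.exists():
        missing = missing_keys(parse_env(env_path.read_text()))
        print("Missing settings:", ", ".join(missing) if missing else "none")
    else:
        print("No .env file found; copy .env.example first.")
```

This parser only handles plain KEY=VALUE lines; it does not cover multi-line values or variable expansion.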
import requests

# Submit a benchmark job
response = requests.post(
    "http://localhost:8000/api/benchmark",
    json={
        "model_id": "my-pytorch-model",
        "dataset_id": "mnist-test",
        "metrics": ["accuracy", "f1_score"],
    },
    timeout=30,
)
response.raise_for_status()  # fail loudly on a 4xx/5xx response

job = response.json()
print(f"Job submitted: {job['job_id']}")
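Because benchmarks run asynchronously, a client that submits a job usually needs to wait for the result. The sketch below polls for completion using the `job_id` returned by the submission example above. The status endpoint path `/api/benchmark/{job_id}` and the status values `completed` and `failed` are assumptions, not documented API; check the generated docs at http://localhost:8000/docs for the real routes and response fields.

```python
import time


def fetch_status_http(job_id, base_url="http://localhost:8000"):
    """Fetch job status over HTTP (hypothetical endpoint path)."""
    import requests  # imported lazily so the polling helper has no hard dependency

    response = requests.get(f"{base_url}/api/benchmark/{job_id}", timeout=10)
    response.raise_for_status()
    return response.json()


def wait_for_job(fetch_status, job_id, interval=2.0, timeout=600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status(job_id) until the job finishes or the timeout passes."""
    deadline = clock() + timeout
    while True:
        job = fetch_status(job_id)
        # Assumed terminal states; adjust to the values the API really returns.
        if job.get("status") in ("completed", "failed"):
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
        sleep(interval)


if __name__ == "__main__":
    # Stand-in for the real API so the sketch runs without a server.
    states = iter([
        {"status": "pending"},
        {"status": "running"},
        {"status": "completed"},
    ])
    result = wait_for_job(lambda _job_id: next(states), "demo-job", interval=0.0)
    print(result["status"])
```

Against a running instance, the call would look like `wait_for_job(fetch_status_http, job["job_id"])`. Passing the fetch function in as a parameter keeps the polling loop easy to test without a live server.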
Full-featured API with automatic documentation. Integrate benchmarking into any workflow.
A clean, scalable architecture designed for secure and reproducible ML benchmarking.
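As an illustration of the sandboxing idea, the sketch below builds a `docker run` command that isolates a benchmark from the network and caps its resources. It is an assumption about how such a sandbox could be configured, not the platform's actual worker code; the image name `openbenchml-runner:latest`, the `/model` mount point, and the default limits are hypothetical.

```python
import shlex


def sandbox_command(image, model_dir, memory="2g", cpus="1.0"):
    """Build a docker run command for an isolated, resource-limited benchmark."""
    return [
        "docker", "run",
        "--rm",                                 # remove the container when it exits
        "--network", "none",                    # no network access from inside
        "--memory", memory,                     # hard memory cap
        "--cpus", cpus,                         # CPU quota
        "--read-only",                          # read-only root filesystem
        "--pids-limit", "256",                  # bound the number of processes
        "--volume", f"{model_dir}:/model:ro",   # mount the model read-only
        image,
    ]


if __name__ == "__main__":
    # Hypothetical image name and path, shown only to print the command.
    command = sandbox_command("openbenchml-runner:latest", "/srv/models/my-pytorch-model")
    print(shlex.join(command))
    # To execute it, one option is:
    #   subprocess.run(command, check=True, timeout=900)
```

Building the command as a list rather than a shell string avoids quoting problems when paths contain spaces, and a timeout on the subprocess call bounds how long a single benchmark can run.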
Leveraging the best of the Python and cloud-native ecosystem for reliability and performance.
Run OpenBenchML anywhere — from your laptop to the cloud.
The easiest way to get started. One command launches the full stack locally.
Deploy to Railway with one click. Auto-detects Dockerfile and provisions PostgreSQL & Redis.
Push to GitHub and let Render handle the rest. Managed DB and background workers included.
Start comparing ML models with confidence. Open source, community-driven, and built for reproducibility.