Open Source · MIT License

OpenBenchML

Open Source ML Model Benchmarking Platform. Compare, evaluate, and benchmark machine learning models with Docker sandboxing, real-time leaderboards, and a powerful REST API.

6+ ML Frameworks
Docker Sandboxed Runs
REST API First
MIT License

Everything You Need to Benchmark ML Models

A comprehensive platform for fair, reproducible, and secure model evaluation — from upload to leaderboard.

Multi-Framework Support

Benchmark models from PyTorch, TensorFlow, scikit-learn, XGBoost, LightGBM, and more. Upload your model and let the platform handle the rest.

Docker Sandbox

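Every benchmark runs in an isolated Docker container, so a run cannot leave side effects on the host or on other runs, its resources are released when the container exits, and every model is evaluated in the same environment for reproducible results.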
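As a rough illustration of what container isolation can look like, the sketch below assembles a `docker run` command with networking disabled, a read-only root filesystem, and CPU and memory caps. The helper name `build_sandbox_command`, the image name, the mount path, and the limit values are all assumptions made for illustration; the platform's actual sandbox configuration may differ.

```python
from typing import List


def build_sandbox_command(
    image: str,
    model_dir: str,
    memory: str = "2g",
    cpus: float = 2.0,
) -> List[str]:
    """Assemble a `docker run` invocation for an isolated benchmark run.

    Hypothetical helper: the image, mount point, and limits are illustrative
    and are not taken from the OpenBenchML source.
    """
    return [
        "docker", "run",
        "--rm",                          # remove the container when it exits
        "--network", "none",             # no network access inside the sandbox
        "--read-only",                   # root filesystem is read-only
        "--memory", memory,              # hard memory cap
        "--cpus", str(cpus),             # CPU quota
        "--pids-limit", "256",           # bound the number of processes
        "-v", f"{model_dir}:/model:ro",  # mount the model directory read-only
        image,
    ]


if __name__ == "__main__":
    # Image name and path are made up for the example.
    cmd = build_sandbox_command("benchmark-runner:latest", "/srv/models/resnet50-v1")
    print(" ".join(cmd))
```

Building the command as a list means it can be passed straight to `subprocess.run(cmd, check=True)` without shell quoting.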

Real-time Leaderboards

Instantly compare models side-by-side with live-updating leaderboards. Sort by accuracy, F1, latency, throughput, or any custom metric.
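Sorting by different metrics needs one detail handled: accuracy and F1 rank highest-first, while latency ranks lowest-first. A minimal sketch, assuming leaderboard rows are plain dictionaries; the `rank_models` helper, the `LOWER_IS_BETTER` set, and the sample values are illustrative and not part of the platform's code.

```python
from typing import Dict, List

# Assumption: metrics for which a smaller value is better.
LOWER_IS_BETTER = {"latency_ms", "memory_mb"}


def rank_models(rows: List[Dict], sort_by: str = "accuracy") -> List[Dict]:
    """Return copies of the rows ordered by `sort_by`, with a 1-based rank."""
    descending = sort_by not in LOWER_IS_BETTER
    ordered = sorted(rows, key=lambda row: row[sort_by], reverse=descending)
    # Put "rank" last so it overrides any stale rank already in the row.
    return [{**row, "rank": i} for i, row in enumerate(ordered, start=1)]


if __name__ == "__main__":
    # Made-up sample values for illustration only.
    rows = [
        {"model": "model-a", "accuracy": 0.91, "latency_ms": 8.0},
        {"model": "model-b", "accuracy": 0.95, "latency_ms": 15.0},
    ]
    for row in rank_models(rows, sort_by="latency_ms"):
        print(row["rank"], row["model"])
```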

Performance Metrics

Track accuracy, precision, recall, F1, AUC-ROC, inference time, memory usage, and more. Rich visualizations powered by Chart.js.
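For readers who want the definitions behind the classification metrics, here is a small pure-Python sketch for the binary case. It is a definitional example only: the function name is hypothetical, and the platform may well compute these values with a library such as scikit-learn instead.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute binary accuracy, precision, recall, and F1 (positive class = 1)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t != 1 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p != 1)  # false negatives
    correct = sum(1 for t, p in pairs if t == p)

    # Fall back to 0.0 when a denominator would be zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0

    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1_score": f1,
    }
```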

REST API

Full-featured API built on FastAPI with automatic OpenAPI docs. Integrate benchmarking into your CI/CD pipeline or custom tooling.

Async Processing

Celery-powered async task queue with Redis backend. Submit benchmarks and get notified when results are ready — no blocking.

Up and Running in Minutes

Clone, configure, and launch your benchmarking platform with just a few commands.

1. Clone the Repository

Get the source code from GitHub and navigate into the project directory.

2. Configure Environment

Copy the example .env file and set your database URL, Redis connection, and secret key.

3. Launch with Docker Compose

Spin up the entire stack (API, worker, Redis, and database) with a single command.

4. Start Benchmarking

Open the dashboard at localhost:8000, upload a model, select a dataset, and hit benchmark!

Terminal
# Clone the repository
git clone https://github.com/kartheekbvs/openbenchml.git
cd openbenchml

# Configure environment
cp .env.example .env

# Launch the full stack
docker-compose up --build -d

# Access the platform
# Dashboard: http://localhost:8000
# API Docs:  http://localhost:8000/docs
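
The configuration step expects a database URL, a Redis connection, and a secret key. A service can check for these at startup and fail with a clear message, as in the sketch below. The variable names `DATABASE_URL`, `REDIS_URL`, and `SECRET_KEY` and the `Settings` class are assumptions; consult `.env.example` for the names the project actually uses.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    database_url: str
    redis_url: str
    secret_key: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the required settings and fail early if any are missing or empty.

    The variable names below are assumed for illustration, not taken from
    the project's .env.example.
    """
    required = ("DATABASE_URL", "REDIS_URL", "SECRET_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing settings: " + ", ".join(missing))
    return Settings(
        database_url=env["DATABASE_URL"],
        redis_url=env["REDIS_URL"],
        secret_key=env["SECRET_KEY"],
    )
```

Passing the environment as a parameter keeps the function easy to test with a plain dictionary.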
Python Example (requests)
import requests

# Submit a benchmark job
response = requests.post(
    "http://localhost:8000/api/benchmark",
    json={
        "model_id": "my-pytorch-model",
        "dataset_id": "mnist-test",
        "metrics": ["accuracy", "f1_score"],
    },
    timeout=30,  # avoid hanging forever if the API is unreachable
)
response.raise_for_status()  # surface HTTP errors before reading the body

job = response.json()
print(f"Job submitted: {job['job_id']}")
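
Because jobs run asynchronously, a client usually polls the job status endpoint after submitting. The sketch below uses only the standard library and the `GET /api/jobs/{job_id}` route from the API reference. The helper names and the `"failed"` status are assumptions: only `"queued"` and `"completed"` appear in the documented responses.

```python
import json
import time
import urllib.request
from typing import Callable, Dict

BASE_URL = "http://localhost:8000"


def fetch_job(job_id: str) -> Dict:
    """GET /api/jobs/{job_id} and decode the JSON body."""
    with urllib.request.urlopen(f"{BASE_URL}/api/jobs/{job_id}", timeout=10) as resp:
        return json.load(resp)


def wait_for_job(
    job_id: str,
    fetch: Callable[[str], Dict] = fetch_job,
    interval: float = 2.0,
    timeout: float = 600.0,
) -> Dict:
    """Poll until the job reaches a terminal status or the timeout expires.

    Assumption: "completed" and "failed" are the terminal statuses.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch(job_id)
        if job.get("status") in ("completed", "failed"):
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job {job_id} did not finish within {timeout}s")
        time.sleep(interval)
```

A call such as `wait_for_job(job["job_id"])` would then return the final job document, including its `results` once the status is `"completed"`.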

Powerful REST API

Full-featured API with automatic documentation. Integrate benchmarking into any workflow.

GET /api/models List all uploaded models
{
  "models": [
    {
      "id": "resnet50-v1",
      "name": "ResNet-50",
      "framework": "pytorch",
      "uploaded_at": "2025-01-15T10:30:00Z",
      "status": "ready"
    }
  ],
  "total": 1,
  "page": 1
}

POST /api/benchmark Submit a benchmark job
{
  "job_id": "bench-a7f3c2e1",
  "model_id": "resnet50-v1",
  "dataset_id": "imagenet-val",
  "status": "queued",
  "metrics": ["accuracy", "f1_score", "latency"],
  "estimated_time": "2-5 min"
}

GET /api/leaderboard Get ranked model results
{
  "leaderboard": [
    {
      "rank": 1,
      "model": "ResNet-50",
      "accuracy": 0.963,
      "f1_score": 0.958,
      "latency_ms": 12.4
    }
  ],
  "dataset": "imagenet-val",
  "sort_by": "accuracy"
}

GET /api/jobs/{job_id} Check benchmark job status
{
  "job_id": "bench-a7f3c2e1",
  "status": "completed",
  "progress": 100,
  "results": {
    "accuracy": 0.963,
    "f1_score": 0.958,
    "latency_ms": 12.4,
    "memory_mb": 245
  },
  "completed_at": "2025-01-15T10:35:22Z"
}

DELETE /api/models/{model_id} Remove a model and its results
{
  "message": "Model deleted successfully",
  "model_id": "resnet50-v1",
  "deleted_jobs": 3
}
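
One way to use these responses in a CI/CD pipeline is to compare a completed job's `results` against thresholds and fail the build on a regression. The sketch below is a hypothetical gate: the `check_thresholds` helper and the threshold values are assumptions, and the sample results are copied from the job status response above.

```python
from typing import Dict, List


def check_thresholds(
    results: Dict[str, float],
    minimums: Dict[str, float],
    maximums: Dict[str, float],
) -> List[str]:
    """Return human-readable failures; an empty list means the gate passes."""
    failures = []
    for name, floor in minimums.items():
        value = results.get(name)
        if value is None or value < floor:
            failures.append(f"{name}={value} is below minimum {floor}")
    for name, ceiling in maximums.items():
        value = results.get(name)
        if value is None or value > ceiling:
            failures.append(f"{name}={value} is above maximum {ceiling}")
    return failures


if __name__ == "__main__":
    # Results copied from the sample job response; thresholds are made up.
    results = {"accuracy": 0.963, "f1_score": 0.958, "latency_ms": 12.4, "memory_mb": 245}
    failures = check_thresholds(
        results,
        minimums={"accuracy": 0.95},
        maximums={"latency_ms": 20.0},
    )
    for line in failures:
        print("FAIL:", line)
    print("gate passed" if not failures else "gate failed")
```

In a real pipeline the script would exit with a non-zero status when `failures` is non-empty so that the CI job is marked as failed.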

How It Works

A clean, scalable architecture designed for secure and reproducible ML benchmarking.

Client Layer
Web UI: Dashboard (Chart.js Visualizations)
API Client: REST API (OpenAPI / Swagger)

Application Layer
Framework: FastAPI (Routes & Auth)
ORM: SQLAlchemy (Models & Migrations)
Tasks: Celery (Async Job Queue)

Infrastructure Layer
Database: PostgreSQL (Models & Results)
Broker: Redis (Queue & Cache)
Runtime: Docker (Sandboxed Benchmarks)

Built With Proven Technologies

Leveraging the best of the Python and cloud-native ecosystem for reliability and performance.

FastAPI: High-performance async API framework
SQLAlchemy: Python SQL toolkit and ORM
Celery: Distributed task queue
Redis: In-memory data store & broker
Docker: Containerized benchmark sandbox
Chart.js: Interactive data visualizations

Deploy Your Own Instance

Run OpenBenchML anywhere — from your laptop to the cloud.

Docker Compose

The easiest way to get started. One command launches the full stack locally.

docker-compose up --build -d

Railway

Deploy to Railway with one click. Auto-detects Dockerfile and provisions PostgreSQL & Redis.

railway init
railway up

Render

Push to GitHub and let Render handle the rest. Managed DB and background workers included.

# render.yaml
services:
  - type: web
    runtime: docker

Ready to Benchmark?

Start comparing ML models with confidence. Open source, community-driven, and built for reproducibility.