OpenBenchML — Open Source ML Model Benchmarking Platform

Features

Everything You Need to Benchmark ML Models

A comprehensive platform for fair, reproducible, and secure model evaluation — from upload to leaderboard.

Multi-Framework Support

Benchmark models from PyTorch, TensorFlow, scikit-learn, XGBoost, LightGBM, and more. Upload your model and let the platform handle the rest.

Docker Sandbox

Every benchmark runs in an isolated Docker container, so a run cannot leave side effects on the host or leak resources into other runs, and the same image produces the same environment every time.
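
To make the isolation concrete, here is a minimal sketch of how a sandboxed run could be launched. It is not OpenBenchML's actual sandbox code: the image name, the model path, and the resource limits are hypothetical placeholders. The flags themselves are standard `docker run` options.

```python
import shlex


def build_sandbox_command(image, model_dir, memory="2g", cpus="1.0"):
    """Return a `docker run` argument list for an isolated benchmark run.

    `image` and `model_dir` are hypothetical placeholders for illustration.
    """
    return [
        "docker", "run",
        "--rm",                                 # remove the container when it exits
        "--network", "none",                    # no network access inside the sandbox
        "--memory", memory,                     # hard memory cap
        "--cpus", cpus,                         # CPU quota
        "--read-only",                          # read-only root filesystem
        "--volume", f"{model_dir}:/model:ro",   # mount the model read-only
        image,
    ]


if __name__ == "__main__":
    cmd = build_sandbox_command("benchmark-runner:latest", "/tmp/models/resnet50")
    print(shlex.join(cmd))
```

Building the command as a list rather than a string avoids shell quoting problems when it is later passed to `subprocess.run`.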

Real-time Leaderboards

Instantly compare models side-by-side with live-updating leaderboards. Sort by accuracy, F1, latency, throughput, or any custom metric.
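
Sorting by an arbitrary metric needs one rule: some metrics are better when higher (accuracy, F1) and some when lower (latency). The sketch below shows one way to rank results under that rule. The function name and the set of lower-is-better metrics are assumptions for illustration, not the platform's internals.

```python
# Metrics where a smaller value is better (assumed names, matching the
# field names used in the API examples on this page).
LOWER_IS_BETTER = {"latency_ms", "memory_mb"}


def rank_models(results, sort_by="accuracy"):
    """Return results sorted best-first by `sort_by`, each with a 1-based rank."""
    higher_is_better = sort_by not in LOWER_IS_BETTER
    ordered = sorted(results, key=lambda row: row[sort_by], reverse=higher_is_better)
    return [{**row, "rank": position} for position, row in enumerate(ordered, start=1)]


if __name__ == "__main__":
    rows = [
        {"model": "ResNet-50", "accuracy": 0.963, "latency_ms": 12.4},
        {"model": "Hypothetical-B", "accuracy": 0.941, "latency_ms": 7.9},  # made-up entry
    ]
    for row in rank_models(rows, sort_by="latency_ms"):
        print(row["rank"], row["model"], row["latency_ms"])
```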

Performance Metrics

Track accuracy, precision, recall, F1, AUC-ROC, inference time, memory usage, and more. Rich visualizations powered by Chart.js.
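
For reference, the core classification metrics follow directly from the confusion counts. The sketch below computes them for binary 0/1 labels in plain Python. It illustrates the definitions only and is not necessarily how the platform calculates them; the function name is hypothetical.

```python
def binary_classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary 0/1 labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    # Guard each ratio against a zero denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0

    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1_score": f1,
    }
```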

REST API

Full-featured API built on FastAPI with automatic OpenAPI docs. Integrate benchmarking into your CI/CD pipeline or custom tooling.
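
One way to use the API in CI is as a quality gate: fetch a finished job's results and fail the build if a metric regresses. The sketch below covers only the comparison step. The function name and thresholds are hypothetical, and the input is assumed to be shaped like the `results` object in the job status response.

```python
import sys


def check_thresholds(results, minimums=None, maximums=None):
    """Return a list of human-readable failures; an empty list means the gate passes."""
    failures = []
    for name, floor in (minimums or {}).items():
        value = results.get(name)
        if value is None or value < floor:
            failures.append(f"{name}={value} is below the minimum {floor}")
    for name, ceiling in (maximums or {}).items():
        value = results.get(name)
        if value is None or value > ceiling:
            failures.append(f"{name}={value} exceeds the maximum {ceiling}")
    return failures


if __name__ == "__main__":
    # Example values copied from the job status example on this page.
    results = {"accuracy": 0.963, "f1_score": 0.958, "latency_ms": 12.4}
    problems = check_thresholds(
        results,
        minimums={"accuracy": 0.95},      # illustrative threshold
        maximums={"latency_ms": 20.0},    # illustrative threshold
    )
    for line in problems:
        print(line)
    sys.exit(1 if problems else 0)  # a non-zero exit code fails the CI step
```

A missing metric counts as a failure, so a job that silently skipped a metric cannot pass the gate.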

Async Processing

Celery-powered async task queue with Redis backend. Submit benchmarks and get notified when results are ready — no blocking.
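
The submit-then-notify pattern can be illustrated without any infrastructure. The sketch below is an in-process stand-in that uses Python's standard thread pool instead of Celery and Redis, so it shows the shape of the flow rather than the platform's implementation. `run_benchmark` is a placeholder that returns a fake payload.

```python
from concurrent.futures import ThreadPoolExecutor


def run_benchmark(model_id, dataset_id):
    """Placeholder for the real work; returns a fake result payload."""
    return {"model_id": model_id, "dataset_id": dataset_id, "status": "completed"}


def submit_benchmark(executor, model_id, dataset_id, on_done):
    """Queue the job and return immediately; `on_done` receives the result later."""
    future = executor.submit(run_benchmark, model_id, dataset_id)
    # The callback fires once the job finishes (or right away if it already has).
    future.add_done_callback(lambda finished: on_done(finished.result()))
    return future


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=2) as pool:
        submit_benchmark(pool, "my-pytorch-model", "mnist-test", print)
        print("submitted; the caller is free to do other work")
```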

Quick Start

Up and Running in Minutes

Clone, configure, and launch your benchmarking platform with just a few commands.

1. Clone the Repository

Get the source code from GitHub and navigate into the project directory.

2. Configure Environment

Copy the example .env file and set your database URL, Redis connection, and secret key.

3. Launch with Docker Compose

Spin up the entire stack (API, worker, Redis, and database) with a single command.

4. Start Benchmarking

Open the dashboard at localhost:8000, upload a model, select a dataset, and hit benchmark!

Terminal

# Clone the repository
git clone https://github.com/kartheekbvs/openbenchml.git
cd openbenchml

# Configure environment
cp .env.example .env

# Launch the full stack
docker-compose up --build -d

# Access the platform
# Dashboard: http://localhost:8000
# API Docs:  http://localhost:8000/docs
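
Before launching, it can help to confirm that the copied .env file actually sets the values the stack needs. The sketch below parses simple KEY=VALUE lines and reports missing entries. The key names `DATABASE_URL`, `REDIS_URL`, and `SECRET_KEY` are assumptions based on the configuration step; check .env.example for the real names.

```python
from pathlib import Path

# Assumed key names; the authoritative list is in the project's .env.example.
REQUIRED_KEYS = ("DATABASE_URL", "REDIS_URL", "SECRET_KEY")


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")  # split on the first "=" only
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    # Assumes a .env file exists in the current directory.
    missing = missing_keys(parse_env(Path(".env").read_text()))
    print("Missing settings:", ", ".join(missing) if missing else "none")
```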

Python Example (using requests)

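import requests

# Submit a benchmark job
response = requests.post(
    "http://localhost:8000/api/benchmark",
    json={
        "model_id": "my-pytorch-model",
        "dataset_id": "mnist-test",
        "metrics": ["accuracy", "f1_score"],
    },
    timeout=30,  # do not hang forever if the API is unreachable
)
response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body

job = response.json()
print(f"Job submitted: {job['job_id']}")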

API Reference

Powerful REST API

Full-featured API with automatic documentation. Integrate benchmarking into any workflow.

GET /api/models List all uploaded models

{
  "models": [
    {
      "id": "resnet50-v1",
      "name": "ResNet-50",
      "framework": "pytorch",
      "uploaded_at": "2025-01-15T10:30:00Z",
      "status": "ready"
    }
  ],
  "total": 1,
  "page": 1
}
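
The `total` and `page` fields suggest the listing is paginated. The sketch below walks every page until `total` models have been seen. The `page` query parameter and both function names are assumptions; confirm the real parameters in the generated API docs at /docs.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def iter_models(fetch_page):
    """Yield every model, requesting pages until `total` items have been seen.

    `fetch_page(page)` must return a dict shaped like the response above.
    """
    page, seen = 1, 0
    while True:
        body = fetch_page(page)
        models = body.get("models", [])
        if not models:
            return  # an empty page means there is nothing more to read
        yield from models
        seen += len(models)
        if seen >= body.get("total", 0):
            return
        page += 1


def fetch_page_http(page, base_url="http://localhost:8000"):
    """Fetch one page over HTTP; the `page` query parameter is an assumption."""
    query = urlencode({"page": page})
    with urlopen(f"{base_url}/api/models?{query}", timeout=10) as response:
        return json.load(response)


if __name__ == "__main__":
    for model in iter_models(fetch_page_http):
        print(model["id"], model["status"])
```

Passing the fetch function in as an argument keeps the paging logic testable without a running server.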

POST /api/benchmark Submit a benchmark job

{
  "job_id": "bench-a7f3c2e1",
  "model_id": "resnet50-v1",
  "dataset_id": "imagenet-val",
  "status": "queued",
  "metrics": ["accuracy", "f1_score", "latency"],
  "estimated_time": "2-5 min"
}

GET /api/leaderboard Get ranked model results

{
  "leaderboard": [
    {
      "rank": 1,
      "model": "ResNet-50",
      "accuracy": 0.963,
      "f1_score": 0.958,
      "latency_ms": 12.4
    }
  ],
  "dataset": "imagenet-val",
  "sort_by": "accuracy"
}

GET /api/jobs/{job_id} Check benchmark job status

{
  "job_id": "bench-a7f3c2e1",
  "status": "completed",
  "progress": 100,
  "results": {
    "accuracy": 0.963,
    "f1_score": 0.958,
    "latency_ms": 12.4,
    "memory_mb": 245
  },
  "completed_at": "2025-01-15T10:35:22Z"
}
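
Because jobs run asynchronously, a client typically polls this endpoint until the job finishes. The sketch below shows one polling loop with a timeout. The status `"completed"` comes from the example above; treating `"failed"` as a terminal status is an assumption, as are the function names.

```python
import time

# "completed" appears in the example response; "failed" is an assumed status.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(fetch_status, job_id, interval=2.0, timeout=600.0, sleep=time.sleep):
    """Poll `fetch_status(job_id)` until the job is terminal or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_status(job_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout} seconds")
        sleep(interval)  # wait before asking again


if __name__ == "__main__":
    import requests  # third-party: pip install requests

    def fetch_status(job_id):
        response = requests.get(f"http://localhost:8000/api/jobs/{job_id}", timeout=10)
        response.raise_for_status()
        return response.json()

    print(wait_for_job(fetch_status, "bench-a7f3c2e1"))
```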

DELETE /api/models/{model_id} Remove a model and its results

{
  "message": "Model deleted successfully",
  "model_id": "resnet50-v1",
  "deleted_jobs": 3
}

Architecture

How It Works

A clean, scalable architecture designed for secure and reproducible ML benchmarking.

Client Layer

Web UI: Dashboard (Chart.js Visualizations) → API Client: REST API (OpenAPI / Swagger)

↓

Application Layer

Framework: FastAPI (Routes & Auth) → ORM: SQLAlchemy (Models & Migrations) → Tasks: Celery (Async Job Queue)

↓

Infrastructure Layer

Database: PostgreSQL (Models & Results) → Broker: Redis (Queue & Cache) → Runtime: Docker (Sandboxed Benchmarks)

Tech Stack

Built With Proven Technologies

Leveraging the best of the Python and cloud-native ecosystem for reliability and performance.

FastAPI

High-performance async API framework

SQLAlchemy

Python SQL toolkit and ORM

Celery

Distributed task queue

Redis

In-memory data store & broker

Docker

Containerized benchmark sandbox

Chart.js

Interactive data visualizations

Deploy

Deploy Your Own Instance

Run OpenBenchML anywhere — from your laptop to the cloud.

Docker Compose

The easiest way to get started. One command launches the full stack locally.

docker-compose up --build -d

Railway

Deploy to Railway with one click. Auto-detects Dockerfile and provisions PostgreSQL & Redis.

railway init
railway up

Render

Push to GitHub and let Render handle the rest. Managed DB and background workers included.

# render.yaml
services:
  - type: web
    runtime: docker

Ready to Benchmark?

Start comparing ML models with confidence. Open source, community-driven, and built for reproducibility.
