OLAP cube · 3,465 cells

Slice the data any way you want

Name: CodingAgentBench Sweep Results
Published: 2026-06-20
License: https://creativecommons.org/licenses/by/4.0/

All 3,465 cells. Pick a TUI, a model, a task family. Get instant answers.

What this page is

What you'll find: every (CLI × model × task) cell scored on quality, latency, and cost.
Who it's for: people picking a CLI and model combo that wins on their workload.
How to switch: toggle to Nerds mode for the full pivot table, heatmap, and brush filters.

Top three winners

opencode + openai/gpt-oss-120b wins 8/25 tasks at $0.0000 per task and 14.8s wall time (strongest on polyglot: 7).
pi + openai/gpt-oss-120b wins 5/25 tasks at $0.0000 per task and 12.6s wall time (strongest on polyglot: 4).
copilot + mistralai/mistral-small-4-119b-2603 wins 2/25 tasks at $0.0000 per task and 10.7s wall time (strongest on mutations: 1).

A combo wins a task when its composite score (PBS) is the highest of any CLI on that task. Cost is a proxy, not a billed amount: tokens-per-correct priced at a flat $1.0 / MTok (the flat-rate roll-up). Real per-token prices land via harness/tracing/cost.py.
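The win rule and the flat-rate cost proxy can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the harness code: the record fields (`cli`, `model`, `task`, `pbs`), the function names, the placeholder data, and the first-seen tie-break are all hypothetical, and the cost formula assumes the proxy means tokens-per-correct priced at $1.0 per million tokens.

```python
from collections import Counter

# Flat rate named in the text: $1.0 per million tokens.
FLAT_RATE_USD_PER_MTOK = 1.0


def winners_by_task(cells):
    """Map each task to the (cli, model) combo with the highest PBS.

    Assumption: ties go to the first cell seen; the source does not say
    how ties are broken.
    """
    best = {}
    for cell in cells:
        current = best.get(cell["task"])
        if current is None or cell["pbs"] > current["pbs"]:
            best[cell["task"]] = cell
    return {task: (c["cli"], c["model"]) for task, c in best.items()}


def flat_rate_cost(tokens_per_correct):
    """Price tokens-per-correct at the flat proxy rate, in USD."""
    return tokens_per_correct / 1_000_000 * FLAT_RATE_USD_PER_MTOK


# Placeholder cells for illustration only, not benchmark data.
cells = [
    {"cli": "cli-a", "model": "model-x", "task": "task-1", "pbs": 0.82},
    {"cli": "cli-b", "model": "model-x", "task": "task-1", "pbs": 0.75},
    {"cli": "cli-a", "model": "model-x", "task": "task-2", "pbs": 0.40},
    {"cli": "cli-b", "model": "model-y", "task": "task-2", "pbs": 0.61},
    {"cli": "cli-a", "model": "model-x", "task": "task-3", "pbs": 0.90},
]

winners = winners_by_task(cells)
# Wins per combo: the "wins N/25 tasks" figure for each (cli, model) pair.
win_counts = Counter(winners.values())
```

Counting the values of `winners` gives the per-combo win tally that the leaderboard lines report, and `flat_rate_cost` converts a tokens-per-correct figure into the proxy dollar amount.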

Where the wins land

[Heatmap: wins per (CLI, model) pair]
Rows (CLI): Aider, Crush, Goose, opencode, OpenHands, Plandex, Qwen-Code
Columns (model): Qwen3.5, GPT-OSS 120B, Llama 3.3 70B
clis: 7 · models: 3 · tasks: 25 · peak: 1

Each cell is one (CLI, model) pair. Darker cells win more tasks. The full cube has 25 tasks across three categories (polyglot, mutations, integrity).

OLAP cube · DuckDB-Wasm · 3,465 cells

Cube Explorer

Slice the benchmark cube on TUI × model × task × plugin-stack. Switch between table, heatmap, scatter, and parallel coordinates. The URL is the source of truth — every view is shareable. Press ? for shortcuts.
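One way to picture a shareable, URL-driven slice is to keep the whole view in the query string and filter the cube from it. The sketch below is plain Python rather than the page's DuckDB-Wasm implementation, and everything in it is an assumption made for illustration: the parameter names (`tui`, `model`, `task`, `plugin_stack`, `chart`), the function names, and the placeholder rows.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical dimension names, mirroring the axes listed in the text.
DIMENSIONS = ("tui", "model", "task", "plugin_stack")


def encode_view(view):
    """Serialize a view (filters plus chart type) into a query string."""
    return urlencode(sorted(view.items()))


def decode_view(query):
    """Rebuild the view dict from a query string (one value per key)."""
    return {key: values[0] for key, values in parse_qs(query).items()}


def slice_cells(cells, view):
    """Keep the cells that match every dimension filter present in the view."""
    filters = {d: view[d] for d in DIMENSIONS if d in view}
    return [c for c in cells if all(c[d] == v for d, v in filters.items())]


# Placeholder rows for illustration only, not benchmark data.
cells = [
    {"tui": "tui-a", "model": "vendor/model-x", "task": "task-1", "plugin_stack": "none"},
    {"tui": "tui-a", "model": "vendor/model-y", "task": "task-1", "plugin_stack": "none"},
    {"tui": "tui-b", "model": "vendor/model-x", "task": "task-2", "plugin_stack": "none"},
]

view = {"tui": "tui-a", "model": "vendor/model-x", "chart": "heatmap"}
query = encode_view(view)
selected = slice_cells(cells, decode_view(query))
```

Because the view round-trips through the query string, anyone opening the same URL reconstructs the same filters and chart type. Sorting the keys keeps the encoded string stable for a given view.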

v0.1 · 2026-06-20

Methodology v0.1 | Pinned to image digests as of 2026-06-20

3,465 rows · 10 TUIs · 14 models · 3 categories
