CodingAgentBench

Pick any two and see them side by side

Same task, same model, two different CLIs. The diff is the lesson.

What this page is

  • What you'll find: side-by-side trace and score comparison.
  • Who it's for: anyone deciding between two finalists.
  • How to switch: toggle to Nerds mode for the comparison interface.

Cell comparison

Compare cells

Pass run ids as a comma-separated list in the ids query parameter (for example, ?ids=a,b,c) to render a side-by-side panel. The cube explorer's compare-mark (key c) populates this view.
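As a rough illustration of the ?ids=a,b,c convention, the sketch below builds and parses such a URL with Python's standard library. The base URL and the function names build_compare_url and parse_run_ids are hypothetical, and the sketch assumes run ids contain no commas or characters that need percent-encoding. It is not the site's own implementation.

```python
from urllib.parse import parse_qs, urlparse


def build_compare_url(base_url: str, run_ids: list[str]) -> str:
    """Join run ids with commas into an ids query parameter.

    Commas are left unencoded so the result matches the ?ids=a,b,c form.
    Assumes the ids are plain tokens with no commas or reserved characters.
    """
    return f"{base_url}?ids={','.join(run_ids)}"


def parse_run_ids(url: str) -> list[str]:
    """Extract the run ids from the ids query parameter, in order.

    Returns an empty list when the parameter is missing or empty.
    """
    query = parse_qs(urlparse(url).query)
    raw = query.get("ids", [""])[0]
    # Drop empty pieces so a trailing comma does not yield a blank id.
    return [part.strip() for part in raw.split(",") if part.strip()]


if __name__ == "__main__":
    # Hypothetical base URL, used only for illustration.
    url = build_compare_url("https://example.invalid/compare", ["a", "b", "c"])
    print(url)
    print(parse_run_ids(url))
```

Keeping the ids in one comma-separated parameter, rather than repeating the parameter per id, keeps the link short and preserves the order in which the runs were marked.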

v0.1 · 2026-06-20