Skip to main content
CodingAgentBench

The loser ledger

What didn't make the cut

We publish the tools we excluded so you can see what we considered and why we passed. Naming losers is part of being honest about the live matrix.

What this page is

  • What you'll find: the five tools we cite most often when readers ask why something is missing.
  • Who it's for: people checking whether their favorite CLI was overlooked or retired.
  • How to switch: toggle to Nerds mode for the full ledger, watchlist, and rejection criteria.

Top removals

  • Roo Code (RooCodeInc/Roo-Code)

    Maintainers archived the repo on 2026-05-15 after millions of VS Code installs; development continues in Kilo Code.

  • Gemini CLI (Google)

    Google retired it at I/O 2026-05-19 with a 2026-06-18 end-of-life; closed-source and model-bundled, so out of scope regardless.

  • Open Interpreter (OpenInterpreter/open-interpreter)

    Last meaningful release was 2024-10; the tool-call layer stalls on modern open-weight models like Qwen3-coder and GLM-4.6+.

  • AutoGPT (Significant-Gravitas/AutoGPT)

    Maintainers pivoted to a visual agent-builder through 2024-2025; the original shell-looping CLI is no longer a maintained entry point.

  • Continue (continuedev/continue)

    The project rebranded to a CI/PR-time agent product in 2025; the CLI still builds but no longer fits the controlled-matrix shape.

Take me back home

The TUI Graveyard

Public registry of TUI / CLI coding agents that CodingAgentBench evaluated, considered, or once measured, and that are now deprecated, abandoned, archived, pivoted, or otherwise no longer in scope.

Manifesto commitment #7 — Editorial nerve — requires that we name losers, not just winners. This page is the loser ledger. It is updated on every methodology revision.

Inclusion here is not a moral judgment on the maintainers — projects pivot, run out of money, get acquired, get archived. It is a factual judgment on whether the tool is currently a viable target for the controlled-matrix benchmark.

Each entry: name → status → date → what to use instead.


Archived / abandoned

Roo Code — RooCodeInc/Roo-Code

  • Status: Archived by maintainers, 2026-05-15.
  • What happened: Despite a very large install count (the VS Code extension counted in the millions at archival), the maintainers folded the codebase. The continuation is under the Kilo Code project (Kilo-Org/kilocode), which forks the Cline + Roo lineage and continues development.
  • What to use instead: Kilo Code. Currently a Phase 1 CodingAgentBench candidate; see tui-catalog.md.

Open Interpreter — OpenInterpreter/open-interpreter

  • Status: Effectively dormant. Last meaningful release October 2024.
  • What happened: ~63K GitHub stars as of 2026-05, but the tool-call layer is broken on most local models. Multiple open issues report that the executor stalls on Qwen2.5/3 tool calls and that the function-calling translation layer has not been updated for the modern tool-call formats (Qwen-coder, GLM-4.6+, DeepSeek-V3+). The project pivoted toward a desktop product and the open-source CLI has stagnated.
  • What to use instead: Aider (lighter, mature) or sst/opencode (modern, actively maintained). Both are in the MVP sweep.

AutoGPT — Significant-Gravitas/AutoGPT

  • Status: Pivoted away from TUI/CLI category.
  • Date of pivot: Gradual through 2024-2025; by 2026 the upstream project is a visual-workflow / agent-platform product. The "AutoGPT CLI" is no longer a maintained entry point.
  • What happened: The maintainers moved upstack to a low-code visual agent builder. The original "give an LLM a shell and watch it loop" CLI is no longer the project's product.
  • What to use instead: plandex-ai/plandex (multi-step planning CLI, MVP) or OpenHands (autonomous coding agent, MVP).

GPT-Pilot — Pythagora-io/gpt-pilot

  • Status: Explicitly "not maintained" per upstream README, as of 2026-05.
  • What happened: The maintainers state in the repo README that they are not actively developing the tool. The product team moved to Pythagora's hosted offering.
  • What to use instead: plandex-ai/plandex for the multi-step planning use case it once filled.

Devika — stitionai/devika

  • Status: Pivoted to "Opcode"; original project effectively dormant.
  • Date: Pivot announced late 2025; the original devika repo has had no substantive commits in months as of 2026-05.
  • What to use instead: All-Hands-AI/OpenHands occupies the same "fully autonomous coding agent" niche and is in our MVP.

Continue — continuedev/continue

  • Status: Pivoted out of the TUI category.
  • What happened: Continue rebranded toward "Continuous AI" — a CI/PR-time agent product. The original IDE-extension + CLI workflow is no longer the headline. The CLI remains buildable but is no longer the project's identity, and its config surface no longer maps cleanly to the BYO-model + open-TUI controlled experiment CodingAgentBench runs.
  • What to use instead: charmbracelet/crush or sst/opencode if you want the modern open-TUI experience.

Gemini CLI — Google

  • Status: Retired by vendor at Google I/O 2026-05-19. End-of-life deadline 2026-06-18.
  • What happened: Google announced that Gemini CLI is being replaced by their new "Antigravity" product. Users must migrate by 2026-06-18.
  • CodingAgentBench note: Gemini CLI was out of scope anyway — it is closed-source and model-bundled (vendor-locked to Gemini). Even if it had survived, it would not have been in the matrix. See manifesto-faq.md.
  • What to use instead: Any open-source MVP TUI. If you want Google-flavored, qwen-code or opencode both can be pointed at any OpenAI-compatible endpoint including Vertex AI.

On the watchlist

Not dead yet, but on track for graveyard. Listed here so readers know we are watching.

Smol Developer — smol-ai/developer

  • Status: Inactive. Last commit several months back as of 2026-05.
  • Likely fate: Will be added to graveyard at next methodology rotation if no maintainer surfaces.

MetaGPT — geekan/MetaGPT

  • Status: Active but increasingly focused on multi-agent simulation rather than the coding-TUI use case. May graduate out of scope by Phase 2.

What "graveyard" does NOT mean

  • It does not mean the original work was bad. Roo Code was important. Open Interpreter showed what was possible. AutoGPT shaped the field. The graveyard is a current-state ledger, not a quality verdict.
  • It does not mean we will refuse to re-add a project. If Open Interpreter ships a fixed tool-call layer for modern open-weight models, it can be reinstated by submitting through the standard docs/submitting.md process.
  • It does not mean the codebase has been deleted from the internet. Most of these repos are still readable, forkable, and useful as study material.

This page is maintained by 88plug AI Lab and is updated on every methodology version bump. Last updated for methodology v0.1, May 2026.