Open Source CLI Coding Agents

As of June 2026, AIDiveForge tracks 20 open-source CLI coding agents. Each project has a verified public source repository. Listings are verified against each tool's live website and re-checked regularly.

Last updated June 12, 2026 · 20 tools

1. AgentKitten
Orbit selects a task from a dependency-ordered backlog, hands it to the configured agent adapter, runs tests, lint, and type checks against the result, and only advances the orbit when those gates pass. Every run writes four artifacts: structured agent output, rubric scoring, an accept-or-iterate recommendation, and a human-readable progress log. The workflow is agent-neutral — Claude, Codex, Cursor, or any adapter you wire up behind the same contract. Where it breaks: Orbit is intentionally minimal, so teams expecting a hosted dashboard, a GUI, or built-in multi-agent parallelism will find precious little of that. The harness is a loop, not a platform.
Free · Open Source
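
The gated loop described in this entry can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the harness's actual code: `Task`, `next_task`, `run_orbit`, the `agent` callable, and the `gates` dictionary are all hypothetical names chosen for the example, and the log entries only loosely mirror the artifacts the listing mentions.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Task:
    name: str
    depends_on: list[str] = field(default_factory=list)


def next_task(backlog: list[Task], done: set[str]) -> Optional[Task]:
    """Return the first open task whose dependencies are all closed."""
    for task in backlog:
        if task.name not in done and all(d in done for d in task.depends_on):
            return task
    return None


def run_orbit(
    backlog: list[Task],
    agent: Callable[[Task], str],
    gates: dict[str, Callable[[str], bool]],
) -> list[dict]:
    """Run one task at a time and stop at the first task that fails a gate."""
    done: set[str] = set()
    log: list[dict] = []
    while (task := next_task(backlog, done)) is not None:
        output = agent(task)
        results = {name: gate(output) for name, gate in gates.items()}
        passed = all(results.values())
        log.append(
            {
                "task": task.name,
                "output": output,
                "gates": results,
                "recommendation": "accept" if passed else "iterate",
            }
        )
        if not passed:
            break  # the loop does not advance without passing gates
        done.add(task.name)
    return log
```

In this sketch the gates are plain callables returning a boolean; in practice they would wrap whatever test, lint, and type-check commands a project already uses.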
2. AICTL
Each 'orbit' is one task: the harness selects it from a dependency-ordered backlog, runs the agent, then requires passing tests, lint, and type checks before closing the loop — no proof, no progress. Every run produces structured JSON artifacts (agent output, rubric scoring, a human-readable progress log) that you can inspect or replay without re-running the agent. The deterministic replay demo runs without an API key, so you can see the full cycle before wiring in a real model. Orbit is intentionally small — no hosted infrastructure, no GUI — which keeps it auditable and keeps you in control, but also means everything outside the core loop is your problem to build.
Free · Open Source
3. AutoMaxFix
AutoMaxFix runs a detect-reproduce-repair loop: it watches for test failures or runtime drift, surfaces one ticket at a time, lets an AI agent propose a patch, and stops cold until a human approves it. That deliberate stop is the point. The vendor describes it explicitly as 'the boring opposite of an autonomous agent' — one ticket, one patch attempt, one approval, one report. Every fix is logged with provenance so you can trace what changed and why. The ceiling arrives fast: the tool handles one ticket per execution, so teams running parallel failure streams will need external orchestration to manage the queue.
Free · Open Source
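
The one-ticket, one-patch, one-approval flow can be pictured roughly as below. This is a hedged sketch only: `handle_ticket`, `propose_patch`, and `approve` are hypothetical names, and nothing here reflects AutoMaxFix's real interface or report format.

```python
from typing import Callable


def handle_ticket(
    ticket: str,
    propose_patch: Callable[[str], str],
    approve: Callable[[str, str], bool],
) -> dict:
    """Process exactly one ticket: one patch attempt, one decision, one report."""
    patch = propose_patch(ticket)
    # The approval callback stands in for the human gate; a real tool
    # would block here and apply the patch only after a "yes".
    approved = approve(ticket, patch)
    return {
        "ticket": ticket,
        "patch": patch,
        "status": "approved" if approved else "rejected",
    }
```

Because the function handles a single ticket per call, queueing several failures is left to the caller, which matches the limitation noted above.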
4. Codeep
Codeep is an open-source, terminal-native autonomous agent that reads your project structure, plans a sequence of steps, edits files, runs shell commands, and checks its own output against your build and test suite before declaring the task done. You describe the goal; it handles the steps. The self-verification loop, in which it catches a broken typecheck and fixes it without prompting, is the part that separates it from a glorified shell wrapper. The ceiling appears on projects where the agent's context window fills before it has mapped the full dependency graph; community reports suggest large monorepos with deep cross-module dependencies reach that limit sooner than single-service repos. At that point, teams either scope tasks more tightly or reach for a dedicated sub-agent delegation pattern.
Free · Open Source
5. CoreTex
Orbit pulls one dependency-ordered task at a time from your backlog, hands it to whichever coding agent you connect, then refuses to mark it done unless tests, lint, and type checks pass. Every run writes four JSON or Markdown artifacts: what the agent returned, how the work scored against a rubric, a human-readable mission log, and a recommendation to accept, iterate, or stop. The agent-neutral contract means you can swap Claude for Codex behind the same harness and compare structured artifacts instead of vibes. The ceiling appears fast on large repos: Orbit is intentionally small, so teams needing parallel agent execution, complex branching between task types, or CI integration will find themselves extending the harness manually.
Free · Open Source
6. Gito
Orbit wraps any JSON-speaking coding agent — Claude, Codex, Cursor, or your own — inside a loop that selects a dependency-ordered task, runs the agent, demands validation proof, and records every artifact before advancing. The output is structured JSON showing what the agent returned, rubric scoring for task focus and diff signal, and a human-readable mission log. Where it breaks: Orbit is intentionally small, which means teams that need hosted execution, a GUI, or a first-class CI/CD plugin will hit the boundary fast and find themselves wiring their own glue code. Teams experimenting with multiple agent frameworks get the most from it; teams shipping to production pipelines at scale will need to extend it.
Free · Open Source
7. KugelAudio
Orbit wraps agent runs in a controlled loop: pick a task from a dependency-ordered backlog, hand it to whichever agent backend you have configured, run tests and lint against the output, and write inspectable JSON artifacts before the task is ever marked complete. If the agent cannot pass the validation gate, the orbit does not close — no silent failures, no optimistic merges. The artifact trail covers what the agent returned, how the run scored against a rubric, and a human-readable recommendation to accept, iterate, or stop. It runs fully self-hosted with no hosted option and no API key required for the replay demo.
Free · Open Source
8. LocalCode
Type what you want, get a suggested command, approve it, and it runs — no API key, no network request, no telemetry. All inference runs on Apple Silicon through the Foundation Models framework, which means your file paths, hostnames, and search terms never travel anywhere. The workflow is strictly one-shot: one prompt, one command suggestion, one approval gate. There is no session memory, no chaining, and no multi-step automation. Teams that want anything beyond single-command suggestions will hit the ceiling of what this proof-of-concept was designed to do.
Free · Open Source
9. MandoCode
MandoCode is a .NET CLI agent that reads your project, proposes diffs, and applies changes across files: the full plan-search-edit loop, entirely on your machine. It is built on Semantic Kernel and RazorConsole, which renders a Spectre.Console terminal UI using Razor components and a virtual DOM. The agent is designed around C# and .NET codebases, so the file understanding and diff proposals are tuned for that ecosystem. Web search is available without a key, but the vendor states that a free Tavily key improves reliability. The ceiling appears when you push outside .NET: community reports on the GitHub page are thin, and the tool's own framing is explicit about its target audience.
Free · Open Source
10. Memex
Orbit runs as a local harness that pulls one dependency-ordered task at a time, hands it to whichever coding agent you configure, then runs your tests, lint, and type checks before recording the result. Every run writes structured JSON artifacts — what the agent returned, how the output scored against a rubric, and a human-readable recommendation to accept, iterate, or stop. The audit trail is durable and replayable without an API key, which makes it usable in air-gapped environments. The tooling is intentionally minimal, so teams building on top of it will write their own adapter glue for agents that do not speak the expected JSON contract. Orbit does not manage the agent itself — it manages what the agent must prove.
Free · Open Source
11. Nanocode-CLI
The tool runs entirely in your terminal, talks to whatever LLM you point it at — local or remote — and edits files using line-and-hash anchors that reject a write if the target code has already drifted. That last detail matters more than it sounds: most agents will cheerfully overwrite a file that changed between the read and the write. nanocode refuses. The tradeoff is scope — the codebase is intentionally small, the feature surface is narrow, and teams who need a visual canvas, IDE integration, or a rich plugin ecosystem will hit the ceiling fast. For a restricted environment or a developer who wants to read every line of the agent loop before trusting it, that ceiling is the point.
Free · Open Source
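
The idea of rejecting a write when the target has drifted can be illustrated with a short sketch. The listing does not say how the tool encodes its line-and-hash anchors, so everything below is an assumption for illustration: the anchor is taken to be a line index plus a truncated SHA-256 of that line's content, and `DriftError`, `read_anchor`, and `apply_edit` are hypothetical names.

```python
import hashlib


class DriftError(Exception):
    """Raised when the anchored line no longer matches what was read."""


def line_hash(text: str) -> str:
    # Truncated digest; the length is an arbitrary choice for this sketch.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:8]


def read_anchor(lines: list[str], index: int) -> tuple[int, str]:
    """Record a line index and a hash of its content at read time."""
    return index, line_hash(lines[index])


def apply_edit(lines: list[str], anchor: tuple[int, str], replacement: str) -> list[str]:
    """Replace the anchored line, refusing if its content changed since the read."""
    index, expected = anchor
    if index >= len(lines) or line_hash(lines[index]) != expected:
        raise DriftError(f"line {index} changed since it was read")
    updated = list(lines)
    updated[index] = replacement
    return updated
```

An edit prepared against a stale read then fails loudly instead of overwriting newer content, which is the behaviour the entry highlights.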
12. NodeCartel
Orbit wraps each coding agent run in a bounded loop: one task, validation gates (tests, lint, type checks), and a fixed set of JSON artifacts recording exactly what the agent returned, what the checks proved, and what should happen next. It is agent-neutral — Claude, Codex, Cursor, or any CLI that speaks JSON fits behind the same contract. The dependency-aware backlog means tasks run in order and only advance when the previous orbit closes cleanly. Where it stops: Orbit has no API and no dashboard, so teams that need live metrics or cross-run analytics build those themselves on top of the artifact files.
Free · Open Source
13. OpenCode
OpenCode is an open-source coding agent that runs in your terminal, a desktop app, or an IDE extension, connecting to 75+ LLM providers including local models. You can spin up multiple agents on the same project in parallel, share debug sessions via a link, and log in with your existing GitHub Copilot or ChatGPT Plus credentials rather than paying again. The no-data-storage architecture makes it viable in privacy-sensitive environments where cloud-only tools are ruled out. The ceiling shows up when you need validated, consistent model performance out of the box — that lives behind the paid Zen add-on, not in the free tier.
Paid · Open Source
14. Pi Coding Agent
Pi runs in a loop with full tool-calling access — read, write, edit, bash — and surfaces four modes: interactive TUI, print/JSON for scripting, RPC, and an SDK for deeper integration. Sessions are stored as trees, so you can rewind to any prior message, fork from that point, and share the entire branch as a rendered URL. The extension and skills system lets you load context on-demand rather than stuffing everything into the system prompt at startup — which the docs describe as a deliberate choice to stay token-efficient. Where Pi stops short is also deliberate: sub-agents and plan mode are not included by default, so teams that need multi-agent parallelism or structured planning build or install extensions themselves. That tradeoff keeps the core minimal, but it means the complexity budget shifts from the tool to you.
Free · Open Source
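
Storing a session as a tree, so that any earlier message can become a fork point, can be sketched as follows. This is an illustrative data structure only, not Pi's actual session format: the `Message` class and its `reply` and `history` methods are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Message:
    text: str
    parent: Optional["Message"] = field(default=None, repr=False)
    children: list["Message"] = field(default_factory=list, repr=False)

    def reply(self, text: str) -> "Message":
        """Append a new message under this one; a second reply creates a fork."""
        child = Message(text, parent=self)
        self.children.append(child)
        return child

    def history(self) -> list[str]:
        """Walk back to the root to rebuild the branch ending at this message."""
        path: list[str] = []
        node: Optional[Message] = self
        while node is not None:
            path.append(node.text)
            node = node.parent
        return list(reversed(path))
```

Rewinding here means picking an earlier node and calling `reply` on it again; the original branch stays intact alongside the new one.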
15. Runway
Orbit wraps agent runs in bounded execution cycles: one task selected from a dependency-ordered backlog, real test and lint gates that must pass before the task closes, and a structured artifact trail left after every run. You get four output files — agent result, rubric evaluation, a human-readable progress log, and an accept/iterate/stop recommendation — so you can audit what happened instead of re-running it from memory. The deterministic replay demo runs without an API key, which means you can inspect the full loop before wiring in Claude, Codex, or any other JSON-speaking CLI. The tool is intentionally scoped: it handles the harness, not the agent. Teams that need the agent itself to do more will hit that boundary fast.
Paid · Open Source
16. SIMD Agent
Orbit is an MIT-licensed open-source harness that wraps any JSON-speaking CLI agent (Claude, Codex, Cursor, or otherwise) in a bounded loop: select one task from a dependency-aware backlog, run the agent, gate on real validation (tests, lint, type checks), and write inspectable artifacts before closing the orbit. Every run produces four JSON or Markdown files recording what the agent returned, how the output scored against a rubric, whether to accept or iterate, and a human-readable mission log. The harness is intentionally small, so there is very little abstraction to hide behind: what you see is what runs. Teams with strict audit requirements get durable, reviewable evidence without instrumenting the agent itself. The trade-off is that Orbit is a harness framework, not a turnkey product: you bring the agent, the backlog structure, and the validation suite.
Free · Open Source
17. Skills
Orbit is a CLI harness that wraps any JSON-speaking coding agent — Claude, Codex, Cursor, or your own — in a bounded loop: one task selected from a dependency-ordered backlog, executed by the agent, then checked against tests, lint, and type validation before the orbit closes. If the agent cannot prove the work, the run does not advance. Every orbit writes structured JSON artifacts and a human-readable progress log, so you are reviewing evidence rather than re-reading diffs and guessing. The harness runs entirely locally, requires no API key for the replay demo, and is MIT licensed. Where it breaks: teams whose validation needs go beyond tests and lint — custom scoring rubrics, multi-step human approval workflows, or large parallel backlogs — will find the intentionally small surface area a ceiling rather than a feature.
Free · Open Source
18. Supertonic
Orbit structures agent execution around a single concept: one task, one orbit, bounded by real checks — tests, lint, type validation — and recorded in inspectable JSON artifacts before anything advances. The vendor describes it as agent-neutral: Claude, Codex, Cursor, or any JSON-speaking CLI slots in behind the same contract, so teams can swap agents and compare output artifacts instead of gut feelings. The architecture is intentionally small, which means the harness is easy to verify and replay, but it also means Orbit does not ship workflow UI, cloud hosting, or a managed backlog service. Teams with complex multi-agent pipelines or a need for a hosted dashboard will be assembling those pieces themselves. Where it shines is the messy middle: failing tests handed to an agent, with proof required before the task closes.
Free · Open Source
19. Unspaghettit
Orbit wraps each coding-agent invocation in a bounded loop: it selects a dependency-ordered task from a backlog, runs the agent, then gates advancement on passing tests, lint, and type checks — not on the agent's self-report. Every run writes structured JSON artifacts and a human-readable progress log, so you can inspect what changed and why a task closed or stalled. The deterministic replay demo runs without an API key, which means you can verify the harness behavior before committing any agent credits. The ceiling appears when your workflow needs anything beyond CLI-compatible agents — there is no API and no visual interface.
Free · Open Source
20. WinkTerm
Orbit wraps each coding-agent run in a bounded loop: one task selected from a dependency-ordered backlog, executed by whatever CLI agent you hand it, then validated through tests, lint, and type checks before the orbit closes. Every run writes structured JSON artifacts — what the agent returned, how the diff scored, whether the reviewer should accept or iterate. This is not an agent itself; it is the scaffold that keeps agents accountable. The ceiling appears when your workflow needs dynamic replanning or multi-agent coordination across parallel tasks — Orbit's contract is deliberately single-focus, and teams that outgrow that boundary are maintaining a layer above the harness.
Free · Open Source

Listings on this page are sourced and verified by the AIDiveForge data pipeline. AIDiveForge is editorially independent — no money changes hands for inclusion.