Patina
Summary
AI coding agents generate plausible-looking diffs that pass a demo and break a CI pipeline — because nothing in the loop required them to prove the work before moving on. Orbit is the harness that closes that gap.
Orbit wraps each agent task in a bounded loop: the agent works, validation runs (tests, lint, type checks), and the task only closes when the checks pass. Every loop leaves structured JSON artifacts — what the agent returned, how it scored against a rubric, and a human-readable recommendation to accept, retry, or stop. This makes agent runs auditable after the fact, not just observable in the moment. The ceiling appears when your project needs multi-agent coordination or a hosted execution layer — Orbit is deliberately narrow, self-hosted only, and ships no managed runtime.
Bottom line: Pick Orbit when you need a reproducible, evidence-backed harness for iterating on AI coding agents locally or on-premise; look elsewhere when you need hosted agent scheduling, multi-agent parallelism, or anything beyond a single-task validation loop.
Pricing Plans
Free (Open Source)
MIT licensed, fully open-source harness for AI coding agents
- Bounded task validation
- Real validation gates
- Structured JSON artifacts
- Deterministic replay
- Agent-neutral design
- Durable progress logs
Pricing may have changed since last verified. Check the official site for current plans.
Pros
- Validation gates block task closure until tests, lint, and type checks pass, so agents cannot self-report success on work that would fail your CI pipeline.
- Each run produces four structured artifacts (agent output, rubric evaluation, review recommendation, and progress log), so audit trails exist by default instead of requiring you to reconstruct what happened from logs.
- Dependency-ordered backlog selection keeps each loop focused on one task at a time, so agents do not skip prerequisites or work on tasks whose dependencies are not yet verified.
- Agent-neutral adapter contract lets you swap Claude, Codex, Cursor, or any JSON-speaking CLI behind the same harness, so you compare agents on identical tasks with structured artifacts instead of anecdotes.
- MIT licensed and fully self-hosted, so teams with on-premise requirements or external platform restrictions can run the full harness without a managed dependency.
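The dependency-ordered selection mentioned in the list above can be pictured roughly as follows. This is a minimal sketch, not Orbit's actual implementation: the `Task` fields, the `next_task` function, and the task names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Task:
    """Hypothetical backlog entry; the real schema may differ."""
    task_id: str
    depends_on: List[str] = field(default_factory=list)


def next_task(backlog: List[Task], verified: Set[str]) -> Optional[Task]:
    """Return the first unverified task whose dependencies are all verified."""
    for task in backlog:
        if task.task_id in verified:
            continue  # already closed by a passing validation run
        if all(dep in verified for dep in task.depends_on):
            return task
    return None  # backlog is finished, or every remaining task is blocked


# Illustrative backlog (names are made up).
backlog = [
    Task("add-parser"),
    Task("add-cli", depends_on=["add-parser"]),
    Task("add-docs", depends_on=["add-cli"]),
]
```

Under these assumptions, `next_task(backlog, set())` picks `add-parser`, and `add-cli` only becomes eligible once `add-parser` is in the verified set.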
Cons
- Orbit handles one task per loop; there is no mechanism for running agents in parallel or coordinating handoffs between agents. Teams whose workflows require concurrent agent execution build a separate scheduling layer on top, at which point they are maintaining two systems.
- The harness ships no hosted runtime, no API, and no managed execution environment. Teams that want cloud-hosted agent scheduling or need to trigger runs from external CI systems without standing up their own infrastructure will move to a platform that provides those primitives.
- The adapter and demo ecosystem is early-stage and contribution-dependent. Teams integrating a coding agent that lacks an existing adapter write and maintain the adapter themselves, which adds setup cost before the first validated loop runs.
About
- Platforms: Python (via pip install), local execution, CLI
- API Available: No
- Self-Hosted: Yes
- Last Updated: 2026-06-08T04:16:41.342Z
Best For
Who it's for
- Teams testing or iterating on AI coding agents in production-like environments
- Projects requiring auditable, reproducible agent execution with validation proof
- Comparing agent performance side-by-side with structured, comparable artifacts
- Harness engineering workflows that combine deterministic validation with agent flexibility
- Self-hosted or on-premise agent orchestration without external platform dependencies
What it does well
- Validating AI agent code generation against failing tests or lint rules before merge
- Running bounded agent tasks from a dependency-ordered backlog with durable progress tracking
- Comparing different coding agents (Claude, Codex, Cursor) on the same task with structured artifacts
- Building deterministic, auditable CI/CD workflows where agent work must pass real checks
- Iterative agent refinement with retry logic and evidence-based decision gates
Frequently Asked Questions
- Is Patina free?
  Yes. Patina is fully free to use. There is no paid tier.
- Is Patina open source?
  Yes. Patina is open source.
- Can I self-host Patina?
  Yes. Patina supports self-hosting on your own infrastructure.
- What platforms does Patina support?
  Patina is available on: Python (via pip install), local execution, CLI.
Most AI coding agent runs are fire-and-forget: the agent edits files, you check the diff, you guess whether it worked. Orbit replaces that guess with a structured loop. The harness selects a task from a dependency-ordered backlog, hands it to the agent, runs real validation (tests, lint, type checks), and only closes the orbit if the checks pass. Every run produces four artifacts — a structured agent-result.json, a rubric-scored evaluation.json, a review.json with an accept/iterate/stop recommendation, and a human-readable progress.md. The evidence trail is there whether you review it immediately or audit it a week later.
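The loop described above might look something like the sketch below. Only the four artifact filenames come from the description; the `run_orbit` function, its parameters, the JSON fields, and the retry cap are assumptions made for illustration, and the rubric is simplified to a single pass/fail flag.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def run_orbit(
    task_id: str,
    agent: Callable[[str], dict],    # hypothetical: returns the agent's output
    validate: Callable[[], bool],    # hypothetical: True only if checks pass
    out_dir: Path,
    max_attempts: int = 3,           # assumed bound on retries
) -> bool:
    """Run one bounded loop; return True only if validation passed."""
    out_dir.mkdir(parents=True, exist_ok=True)
    log = [f"# Progress for {task_id}"]
    for attempt in range(1, max_attempts + 1):
        result = agent(task_id)
        passed = validate()  # real checks decide, not the agent's own claim
        if passed:
            recommendation = "accept"
        elif attempt < max_attempts:
            recommendation = "iterate"
        else:
            recommendation = "stop"
        artifacts = {
            "agent-result.json": result,
            "evaluation.json": {"attempt": attempt, "checks_passed": passed},
            "review.json": {"recommendation": recommendation},
        }
        for name, payload in artifacts.items():
            (out_dir / name).write_text(json.dumps(payload, indent=2))
        log.append(f"- attempt {attempt}: {recommendation}")
        (out_dir / "progress.md").write_text("\n".join(log) + "\n")
        if passed:
            return True
    return False


def demo() -> tuple:
    """Stub agent always claims success; validation fails once, then passes."""
    outcomes = iter([False, True])
    with tempfile.TemporaryDirectory() as tmp:
        closed = run_orbit(
            "demo-task",
            agent=lambda task: {"task": task, "claims_done": True},
            validate=lambda: next(outcomes),
            out_dir=Path(tmp),
        )
        review = json.loads((Path(tmp) / "review.json").read_text())
        progress = (Path(tmp) / "progress.md").read_text()
    return closed, review, progress
```

In the demo, the stub agent claims success on both attempts, but the loop only closes on the second one, when the stubbed validation passes. The progress log records both attempts.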
The differentiating constraint is the validation gate. The agent cannot self-report success. It has to prove completion through the checks you configure — failing tests, lint violations, and type errors are evidence the loop uses to decide whether work advances. The vendor describes this as ‘if the agent cannot prove it, the orbit does not close.’ That design makes Orbit useful for teams who have been burned by agents that return confident output on work that does not actually pass a real check.
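One way to picture such a gate is a function that runs each configured check as a subprocess and passes only when every exit code is zero. This is a sketch under stated assumptions: `gate_passes` and the shape of the `checks` mapping are hypothetical, and Orbit's real configuration format is not shown in this listing.

```python
import subprocess
import sys
from typing import Dict, List, Tuple


def gate_passes(checks: Dict[str, List[str]]) -> Tuple[bool, Dict[str, int]]:
    """Run every check; the gate passes only if all of them exit with code 0."""
    exit_codes = {}
    for name, command in checks.items():
        completed = subprocess.run(command, capture_output=True, text=True)
        exit_codes[name] = completed.returncode
    return all(code == 0 for code in exit_codes.values()), exit_codes


# Stand-in commands so the sketch runs anywhere. In a real project these would
# be your own test, lint, and type-check commands.
PASSING = [sys.executable, "-c", "raise SystemExit(0)"]
FAILING = [sys.executable, "-c", "raise SystemExit(1)"]

ok, codes = gate_passes({"tests": PASSING, "lint": FAILING})
print(ok, codes)
```

Because the decision rests on process exit codes rather than on anything the agent reports, a single failing check keeps the gate shut regardless of what the agent says about its own work.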
Orbit is agent-neutral by contract: the docs describe support for Claude, Codex, Cursor, or any JSON-speaking CLI, and the contribution model is centered on swappable adapters. This makes it a practical bench for comparing agents on the same task with structured, comparable artifacts rather than subjective impressions. The tool is MIT licensed and self-hosted only — there is no managed runtime, no hosted scheduling layer, and no external platform dependency. A deterministic replay demo requires no API key and runs locally in minutes.
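A swappable adapter for a JSON-speaking CLI could be sketched as below. The `AgentAdapter` protocol, the `JsonCliAdapter` class, the stdin/stdout convention, and the task fields are all assumptions for illustration; the actual adapter contract is defined by the project's own docs.

```python
import json
import subprocess
import sys
from typing import Dict, List, Protocol


class AgentAdapter(Protocol):
    """Hypothetical contract: take a task dict, return a result dict."""

    def run(self, task: dict) -> dict: ...


class JsonCliAdapter:
    """Sends the task as JSON on stdin and parses JSON from stdout (assumed convention)."""

    def __init__(self, command: List[str]) -> None:
        self.command = command

    def run(self, task: dict) -> dict:
        completed = subprocess.run(
            self.command,
            input=json.dumps(task),
            capture_output=True,
            text=True,
            check=True,  # a non-zero exit raises CalledProcessError
        )
        return json.loads(completed.stdout)


def compare(adapters: Dict[str, AgentAdapter], task: dict) -> Dict[str, dict]:
    """Run the same task through every adapter and collect the results by name."""
    return {name: adapter.run(task) for name, adapter in adapters.items()}


# Stand-in "agent": a Python one-liner that reads the task and reports a status.
ECHO_AGENT = [
    sys.executable,
    "-c",
    "import json, sys; t = json.load(sys.stdin); "
    "print(json.dumps({'task_id': t['id'], 'status': 'done'}))",
]

results = compare({"echo": JsonCliAdapter(ECHO_AGENT)}, {"id": "t1"})
```

Keeping the harness dependent only on the `run` method is what would let a different agent be dropped in without touching the loop or the validation step.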
Where Orbit breaks: the harness is scoped to one task, one loop, one agent at a time. Teams that need agents running in parallel, cross-agent handoffs, or a hosted execution layer will hit the ceiling of what Orbit is designed to do. The project’s own documentation frames it as ‘intentionally small,’ and contributions are explicitly invited to make it ‘easier to verify, easier to replay, or easier to connect to another coding-agent workflow’ — not to expand it into a full orchestration platform.
