AIDiveForge

License: MIT (any use, including commercial)
Local-run terms: MIT license permits commercial use, modification, and distribution with attribution and license preservation.

Runway

Freemium, Open Source, API, Self-Hosted, Agentic

Summary

You ran the agent, got a diff, and have no idea whether it actually passed your tests or just claimed to — because nothing in the loop required proof. Orbit exists to close that gap.

Orbit wraps agent runs in bounded execution cycles: one task selected from a dependency-ordered backlog, real test and lint gates that must pass before the task closes, and a structured artifact trail left after every run. Each run produces four output files (agent result, rubric evaluation, a human-readable progress log, and an accept/iterate/stop recommendation), so you can audit what happened instead of reconstructing it from memory. The deterministic replay demo runs without an API key, so you can inspect the full loop before wiring in Claude, Codex, or any other JSON-speaking CLI. The tool is intentionally scoped: it handles the harness, not the agent. Teams that need the agent itself to do more will hit that boundary fast.

Bottom line: Pick Orbit when you need reproducible, auditable evidence that an agent's work actually passed your test suite — not when you need the agent to do sophisticated multi-step reasoning that the harness itself cannot validate.

Pricing Plans

Subscription (last verified 2 days ago)
Price: $12/mo
Free Tier: 125 credits (one time), 3 video editor projects, 5GB asset storage, No Gen-4 Video

Free

Free

For individuals looking to explore Runway's AI Tools and content creation features.

  • 125 credits (one time)
  • 125 credits = 25s of Gen-4 Turbo or Gen-3 Alpha Turbo
  • Generative Video Gen-4 Turbo (Image to Video)
  • Generative Image Gen-4 (Text to Image, References)
  • Gemini 3 Pro
  • Gemini 2.5
  • Image Apps
  • Generative Audio
  • Text to Speech
  • Audio Apps
  • 3 video editor projects
  • 5GB asset storage
  • No Gen-4 Video

Pro

$28 per month
$336/yr

For individuals and teams looking to add all of Runway's features into their workflows. Max. 10 users per workspace.

  • 2250 credits monthly
  • Create Custom Voices for Lip Sync and Text to Speech
  • 500GB asset storage
  • Everything in Standard

Max

$76 per month
$912/yr

Best value for heavy usage and for experimenting. Max. 10 users per workspace.

  • 9500 credits monthly
  • Unused credits roll over 1 month
  • First access to newest models
  • Highest generation volume
  • Everything in Pro

Enterprise

Custom

For teams and organizations that need customization, advanced security and support.

  • Scalable for large organizations
  • All Pro Plan features
  • Single sign-on
  • Custom credit amounts
  • Configurable organization and team spaces
  • Advanced security and compliance
  • Enterprise-wide onboarding
  • Ongoing success program
  • Priority support
  • Integration with internal tools
  • Workspace Analytics

View full pricing on runwayml.com →

Pricing may have changed since last verified. Check the official site for current plans.
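
The credit figures above can be turned into rough generation time with a little arithmetic. The sketch below takes the listed equivalence (125 credits = 25s of Gen-4 Turbo or Gen-3 Alpha Turbo) and assumes, without confirmation from the listing, that the same per-second rate applies to the monthly credits on paid plans. It also checks that the listed yearly prices are twelve times the monthly ones.

```python
# Listed equivalence from the Free plan: 125 credits = 25 s of Turbo video.
FREE_CREDITS = 125
FREE_SECONDS = 25
CREDITS_PER_SECOND = FREE_CREDITS / FREE_SECONDS  # 5 credits per second


def turbo_seconds(credits: int) -> float:
    """Seconds of Turbo video a credit balance buys.

    ASSUMPTION: the Free-plan rate also applies to paid-plan credits.
    """
    return credits / CREDITS_PER_SECOND


# (monthly credits, monthly price in USD, listed yearly price in USD)
plans = {
    "Pro": (2250, 28, 336),
    "Max": (9500, 76, 912),
}

for name, (credits, monthly, yearly) in plans.items():
    seconds = turbo_seconds(credits)
    print(f"{name}: {seconds:.0f} s of Turbo video per month, "
          f"12 x ${monthly} = ${monthly * 12} (listed: ${yearly})")
```

Under that assumption, Pro works out to 450 seconds and Max to 1900 seconds of Turbo video per month. Other models may consume credits at a different rate.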

Pros

  • Validation gates require tests, lint, and type checks to pass before a task closes, which means you get actual proof of correctness instead of an agent's self-reported success.
  • Four structured output artifacts per run (agent result, rubric evaluation, review recommendation, progress log), so you can audit any run after the fact without re-executing it.
  • Agent-neutral contract — any JSON-speaking CLI plugs in behind the same harness — so comparing two coding agents means inspecting their artifacts under identical conditions instead of running separate experiments with no shared baseline.
  • Dependency-aware backlog selection advances one verified task at a time, which means a broken task blocks downstream work rather than silently corrupting the next step.
  • The deterministic replay demo runs without an API key, so you can fully inspect the harness loop before committing any cloud API spend or credentials.

Cons

  • Orbit supplies the harness, not the agent, the tasks, or the test suite. Teams without an existing automated test infrastructure will spend the first sprint writing prerequisites rather than running orbits.
  • The artifact schema and gate logic are defined by the harness contract; teams that need custom rubric dimensions or non-standard validation steps beyond tests, lint, and type checks will need to modify the project directly, since the vendor page describes no plugin or configuration surface for that.
  • The project is described as 'intentionally small' with no commercial offering and no roadmap published on the vendor page — teams that need SLA-backed support, managed hosting, or a maintained integration ecosystem will move to a commercial agent orchestration platform rather than maintain a fork of a small open-source harness.
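
The dependency-aware backlog selection described in the list above can be sketched in a few lines. This is a minimal illustration, not Orbit's actual data model: the `Task` fields, the status strings, and `select_next` are all hypothetical names chosen for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical backlog entry; field names are assumptions."""
    id: str
    depends_on: list[str] = field(default_factory=list)
    status: str = "open"  # "open" until its gates pass, then "verified"


def select_next(backlog: list[Task]) -> Task | None:
    """Return the first open task whose dependencies are all verified."""
    verified = {t.id for t in backlog if t.status == "verified"}
    for task in backlog:
        if task.status == "open" and all(d in verified for d in task.depends_on):
            return task
    return None  # nothing is runnable: backlog is done or blocked


backlog = [
    Task("schema"),
    Task("auth", depends_on=["schema"]),
    Task("ui", depends_on=["auth"]),
]

first = select_next(backlog)   # "schema" has no dependencies
first.status = "verified"      # pretend its gates passed
second = select_next(backlog)  # now "auth" is unblocked
third = select_next(backlog)   # "auth" is still open, so "ui" stays blocked
print(first.id, second.id, third.id)
```

Because a task only becomes selectable once everything it depends on is verified, a task that keeps failing its gates holds back its dependents instead of letting them run on a broken base.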

About

Platforms: Linux, macOS, Windows (Python)
API Available: Yes
Self-Hosted: Yes
Last Updated: 2026-06-07T20:31:32.028Z

Best For

Who it's for

  • Teams evaluating AI coding agents experimentally
  • Projects requiring reproducible, auditable agent runs
  • Repositories with automated test and lint validation
  • Harness engineers prototyping agent workflows

What it does well

  • Testing and validating AI coding agents deterministically
  • Running self-healing repository workflows with test gates
  • Comparing multiple coding agents via artifact inspection
  • Executing dependency-ordered task backlogs with agent verification
  • Auditing agent behavior through structured progress logs

Integrations

Claude, Codex, Cursor, custom JSON-speaking CLI agents
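
To make "JSON-speaking CLI" concrete, here is a minimal sketch of an adapter that sends a task to a command as JSON on stdin and parses JSON from its stdout. The transport (stdin/stdout), the `run_agent` helper, and the `id`, `task_id`, and `status` fields are assumptions for illustration; the listing does not document Orbit's actual contract. A one-line Python process stands in for a real agent so the example runs without credentials.

```python
import json
import subprocess
import sys


def run_agent(command: list[str], task: dict, timeout: float = 60.0) -> dict:
    """Send `task` as JSON on stdin and parse the agent's JSON reply.

    ASSUMPTION: the agent reads one JSON object and prints one JSON object.
    """
    proc = subprocess.run(
        command,
        input=json.dumps(task),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # a non-zero exit raises CalledProcessError
    )
    return json.loads(proc.stdout)


# Stand-in "agent": echoes the task id back with a status field.
ECHO_AGENT = [
    sys.executable,
    "-c",
    "import json, sys; t = json.load(sys.stdin); "
    "print(json.dumps({'task_id': t['id'], 'status': 'done'}))",
]

result = run_agent(ECHO_AGENT, {"id": "auth-rescue"})
print(result)
```

Swapping agents then means changing only the `command` list, which is what lets the harness around it stay identical between runs.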

Frequently Asked Questions

Is Runway free?
Runway is a paid tool ($12/mo). The free tier is limited to a one-time grant of 125 credits; no recurring free allowance is offered.
Is Runway open source?
Yes. Runway is open source.
Does Runway have an API?
Yes. Runway exposes a developer API. See the official documentation at https://runwayml.com for details.
Can I self-host Runway?
Yes. Runway supports self-hosting on your own infrastructure.
What platforms does Runway support?
Runway is available on: Linux, macOS, Windows (Python).

Runway

Most agent runs leave you with a diff and a hope. Orbit structures that run into what the vendor calls an ‘orbit’: the harness selects one task from a dependency-ordered backlog, hands it to whatever coding agent you’ve configured, runs your tests, lint, and type checks as validation gates, and closes the task only if those gates pass. Every run writes four artifacts (a structured agent-result.json, a rubric-scored evaluation.json, a review.json with an accept/iterate/stop recommendation, and a human-readable progress.md), so the evidence trail is inspectable long after the run completes.
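
The gate-then-record step can be sketched as follows. Only the four file names come from the description above; the JSON fields, the `run_gates` and `write_artifacts` helpers, and the simplified accept/iterate rule (the "stop" outcome is left out) are assumptions. Trivial Python commands stand in for a real test runner and linter so the example is self-contained.

```python
import json
import subprocess
import sys
import tempfile
from pathlib import Path


def run_gates(gates: dict[str, list[str]]) -> dict[str, bool]:
    """Run each gate command; a gate passes only on exit code 0."""
    return {
        name: subprocess.run(cmd, capture_output=True).returncode == 0
        for name, cmd in gates.items()
    }


def write_artifacts(run_dir: Path, agent_result: dict,
                    gate_results: dict[str, bool]) -> str:
    """Write the four artifacts and return the recommendation.

    ASSUMPTION: the file contents here are illustrative, not Orbit's schema.
    """
    run_dir.mkdir(parents=True, exist_ok=True)
    passed = bool(gate_results) and all(gate_results.values())
    recommendation = "accept" if passed else "iterate"
    (run_dir / "agent-result.json").write_text(json.dumps(agent_result, indent=2))
    (run_dir / "evaluation.json").write_text(
        json.dumps({"gates": gate_results}, indent=2))
    (run_dir / "review.json").write_text(
        json.dumps({"recommendation": recommendation}, indent=2))
    lines = [f"- {name}: {'pass' if ok else 'FAIL'}"
             for name, ok in gate_results.items()]
    (run_dir / "progress.md").write_text("# Progress\n\n" + "\n".join(lines) + "\n")
    return recommendation


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in gates; in a real repository these would be your own
    # test, lint, and type-check commands.
    gates = {
        "tests": [sys.executable, "-c", "raise SystemExit(0)"],
        "lint": [sys.executable, "-c", "raise SystemExit(1)"],
    }
    results = run_gates(gates)
    run_dir = Path(tmp) / "run-001"
    verdict = write_artifacts(run_dir, {"task_id": "auth-rescue"}, results)
    written = sorted(p.name for p in run_dir.iterdir())

print(results, verdict, written)
```

The point of the design is that the recommendation is derived from exit codes the harness observed, not from anything the agent reports about itself.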

The differentiating feature is the gate-and-artifact contract, not the agent. Orbit is described on the vendor page as agent-neutral: Claude, Codex, Cursor, or any CLI that speaks JSON can be dropped behind the same harness. That means you compare agents by inspecting their artifacts under identical conditions instead of relying on anecdotal impressions from different sessions. The vendor explicitly calls out ‘adapter experiments’ as a supported workflow — swap the agent, replay the same task, diff the evaluation scores.
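
An adapter experiment boils down to comparing two evaluations of the same task. The sketch below assumes each evaluation reduces to a flat mapping of rubric dimension to score; the dimension names and the numbers are invented for illustration and are not measured results or Orbit's real rubric.

```python
def diff_scores(a: dict[str, float], b: dict[str, float]) -> dict[str, float]:
    """Per-dimension difference (b minus a) over the dimensions both share."""
    return {k: round(b[k] - a[k], 3) for k in sorted(a.keys() & b.keys())}


# Hypothetical rubric scores for the same task under two different agents.
eval_agent_a = {"correctness": 0.9, "scope": 0.7, "clarity": 0.8}
eval_agent_b = {"correctness": 0.6, "scope": 0.9, "clarity": 0.8}

delta = diff_scores(eval_agent_a, eval_agent_b)
for dimension, change in delta.items():
    print(f"{dimension}: {change:+.1f}")
```

Restricting the comparison to shared dimensions keeps the diff meaningful if one run's rubric has fields the other lacks.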

Orbit fits narrowly: repositories with real automated test suites, teams that need reproducible audit trails for agent behavior, and harness engineers prototyping how agents handle a backlog. It does not supply the agent, the tasks, or the test suite. Teams that have none of those pieces in place will be building the prerequisites before Orbit adds value. The vendor page describes the project as ‘intentionally small,’ which is an honest signal about scope — contributions are welcomed specifically when they make the harness easier to verify or replay, not when they expand what the harness does.

The deterministic replay demo — invoked with MOCK=1 ./replay.sh auth-rescue — requires no API key and demonstrates the full selection, validation, and artifact-recording loop locally. The project is MIT licensed with no paid tier described on the vendor page.
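
If you want to trigger the replay demo from a script rather than a shell, a thin wrapper is enough. The command and the MOCK=1 variable are taken from the paragraph above; the helper names and the `repo_dir` argument are hypothetical, and `run_replay` only works from a checkout that actually contains replay.sh.

```python
import os
import subprocess
from pathlib import Path


def replay_command(scenario: str, mock: bool = True) -> tuple[list[str], dict[str, str]]:
    """Build the argv and environment equivalent to `MOCK=1 ./replay.sh <scenario>`."""
    env = dict(os.environ)
    if mock:
        env["MOCK"] = "1"  # mock mode: no API key required
    return ["./replay.sh", scenario], env


def run_replay(repo_dir: Path, scenario: str) -> int:
    """Run the replay inside a checkout and return its exit code.

    ASSUMPTION: `repo_dir` points at a checkout that contains replay.sh.
    """
    cmd, env = replay_command(scenario)
    return subprocess.run(cmd, cwd=repo_dir, env=env).returncode


cmd, env = replay_command("auth-rescue")
print(cmd, env["MOCK"])
```

Building the environment as a copy of `os.environ` keeps PATH and other settings intact while adding only the one variable the demo needs.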