AIDiveForge

Get This Tool

License: unverified
Local-run terms: Clone repo, run via published Docker image with mounted volumes and environment variables; optional self-hosted web app via compose files.

RiddleRun

Free, Open Source, Self-Hosted, Agentic

Pricing

Model
Free

Summary

Maintaining E2E tests against a UI that keeps changing means your test scripts break faster than your engineers can fix them — RiddleRun replaces fragile selector-based scripts with an agent that reads your JSON-described user journey and navigates the browser on its own.

RiddleRun combines a CLI and an optional self-hosted web app, both running inside Docker, so your test environment travels with the repo rather than living on someone's laptop. You define a user journey in JSON — steps, assertions, expected outcomes — and a Playwright/browser-use agent executes the whole sequence autonomously. The Docker-first setup means teams can wire it into CI without installing a browser stack on the build machine. The project has two GitHub stars and one open issue at the time of curation, which signals early-stage maturity — documentation depth and community support are thin, and the agent's decision logic is largely a black box to the teams running it.
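
To make the "steps, assertions, expected outcomes" idea concrete, here is a minimal sketch of what such a journey file might look like, built and sanity-checked in Python. The field names (`name`, `start_url`, `steps`, `action`, `expect`) and the `validate_journey` helper are hypothetical, chosen only for illustration; RiddleRun's actual schema is not documented on this page and may differ.

```python
import json

# Hypothetical journey layout: every key below is an assumption for
# illustration, not RiddleRun's documented schema.
journey = {
    "name": "wikipedia-search-smoke-test",
    "start_url": "https://en.wikipedia.org",
    "steps": [
        {
            "action": "Search for 'Alan Turing'",
            "expect": "An article titled 'Alan Turing' is shown",
        },
        {
            "action": "Open the first link in the 'Early life' section",
            "expect": "A different article page loads",
        },
    ],
}


def validate_journey(data: dict) -> list[str]:
    """Return a list of problems found in a journey dict (empty means OK)."""
    problems = []
    # Check that the assumed top-level keys are present.
    for key in ("name", "start_url", "steps"):
        if key not in data:
            problems.append(f"missing top-level key: {key}")
    # Each step needs a plain-language action and an expected outcome.
    for index, step in enumerate(data.get("steps", [])):
        for key in ("action", "expect"):
            if not step.get(key):
                problems.append(f"step {index}: missing or empty '{key}'")
    return problems


if __name__ == "__main__":
    print(validate_journey(journey))
    print(json.dumps(journey, indent=2))
```

A pre-flight check like this could catch an incomplete journey file before an agent run is spent on it, which matters when each run drives a real browser.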

Bottom line: Pick RiddleRun for self-hosted, AI-driven smoke tests against a Wikipedia-style content site, or against any rapidly changing UI where rewriting selector-based tests every sprint is killing velocity. Plan for a Playwright-native alternative if your test suite needs deterministic assertions, fine-grained failure tracing, or enterprise support.

Best For: Teams needing AI-assisted E2E tests, Projects with rapidly changing UIs, Docker-based test environments, Wikipedia-style navigation and content verification

Pros

  • JSON-defined test journeys decouple test authorship from code, so a product manager or QA analyst can write and update test cases without touching a Playwright script.
  • Docker-first deployment means the entire test environment (browser, agent, backend) is version-controlled and reproducible, so "works on my machine" test failures stop being a sprint tax.
  • Autonomous agent execution adapts when UI elements shift position or change labels, so a redesign doesn't immediately invalidate your entire test suite the way it would with selector-based tests.
  • Open source with no paid tier, so the tool itself imposes no usage ceiling, subscription fee, or vendor lock-in. The source can be forked and audited, although the license was unverified at curation.
  • Optional self-hosted web app alongside the CLI, so teams that want a visual interface for running and reviewing tests get one without leaving their own infrastructure.

Cons

  • Agent decision logic is opaque: when a test fails, the JSON output and logs do not currently expose a step-by-step trace of what the agent attempted. Debugging a false negative on a critical checkout flow therefore means re-running the test manually and watching the browser rather than reading a structured failure report.
  • The project had two GitHub stars and one open issue at curation, so there is little community knowledge to draw on when the agent misinterprets a journey step; teams wait on the single maintainer rather than searching a forum or Stack Overflow thread.
  • Complex assertion logic (verifying specific data values, confirming that API responses correlate with UI state, or testing accessibility properties) is not described anywhere in the documented feature set. Teams needing that depth will likely add a Playwright test layer alongside RiddleRun, at which point they are maintaining two systems.
  • There is no documented support for distributed runs, which affects teams whose CI pipeline requires parallel test execution across multiple environments. Once a single Docker container's serial execution makes the test suite a bottleneck, the likely move is to a Playwright-native framework or a hosted AI testing service with built-in parallelism.

About

Platforms
Docker, CLI, self-hosted web app
API Available
No
Self-Hosted
Yes
Last Updated
2026-06-11T06:15:22.070Z

Best For

Who it's for

  • Teams needing AI-assisted E2E tests
  • Projects with rapidly changing UIs
  • Docker-based test environments
  • Wikipedia-style navigation and content verification

What it does well

  • Automated end-to-end webpage testing
  • Agent-driven browser interaction verification
  • JSON-defined user journey execution
  • Self-hosted testing for teams

Integrations

Playwright, browser-use, OpenAI

Frequently Asked Questions

Is RiddleRun free?
Yes — RiddleRun is fully free to use. There is no paid tier.
Is RiddleRun open source?
Yes. RiddleRun is open source.
Can I self-host RiddleRun?
Yes. RiddleRun supports self-hosting on your own infrastructure.
What platforms does RiddleRun support?
RiddleRun is available on: Docker, CLI, self-hosted web app.

RiddleRun

RiddleRun is an open-source agentic browser testing tool that executes E2E test cases through a Playwright/browser-use agent rather than hand-written selector scripts. The core workflow is JSON-in, test-run-out: you describe a user journey as a JSON file, pass it to the CLI or the web UI, and the agent navigates a real browser to verify that journey end-to-end. Everything runs inside Docker, and the repo ships both a standard compose file and a production compose file so the path from local to server is a config swap, not a rebuild.
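
The "config swap, not a rebuild" point can be sketched as a small helper that selects a compose file per environment and builds the matching `docker compose` command. The file names `docker-compose.yml` and `docker-compose.prod.yml` are assumptions; the page only says that a standard and a production compose file exist, so check the repository for the real names.

```python
import shlex

# Assumed file names; the repository may use different ones.
COMPOSE_FILES = {
    "local": "docker-compose.yml",
    "production": "docker-compose.prod.yml",
}


def compose_up_command(environment: str) -> list[str]:
    """Build the `docker compose` invocation for the given environment."""
    if environment not in COMPOSE_FILES:
        raise ValueError(f"unknown environment: {environment!r}")
    # `-f` selects the compose file; `up -d` starts the services detached.
    return ["docker", "compose", "-f", COMPOSE_FILES[environment], "up", "-d"]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch is safe to try.
    for env in COMPOSE_FILES:
        print(shlex.join(compose_up_command(env)))
```

Only the `-f` argument changes between environments, which is what makes the move from laptop to server a configuration change and not a new build.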

The differentiating feature is the autonomous agent execution loop. Instead of breaking when a button moves or a class name changes, the agent interprets intent from the JSON description and finds its own path through the page — the same way a human tester would adapt rather than quit. This makes RiddleRun a credible option for UIs under active development where conventional test scripts would require constant maintenance.
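
The contrast between selector-based lookup and intent-based lookup can be illustrated with a toy example. This is not RiddleRun's agent logic, which this page describes only at a high level: the "pages" are plain Python lists, and fuzzy string similarity from the standard library stands in, very crudely, for the agent's interpretation of intent. All names here are hypothetical.

```python
from difflib import SequenceMatcher

# Toy "pages": the same two buttons before and after a redesign that
# changed their ids and one visible label.
PAGE_V1 = [
    {"id": "btn-submit", "text": "Submit order"},
    {"id": "btn-cancel", "text": "Cancel"},
]
PAGE_V2 = [
    {"id": "cta-primary", "text": "Place order"},
    {"id": "cta-secondary", "text": "Cancel"},
]


def find_by_selector(page, element_id):
    """Selector-style lookup: exact id match, None once the id changes."""
    return next((el for el in page if el["id"] == element_id), None)


def find_by_intent(page, description):
    """Crude stand-in for intent matching: most similar visible text wins."""
    def similarity(element):
        return SequenceMatcher(
            None, description.lower(), element["text"].lower()
        ).ratio()
    return max(page, key=similarity)


if __name__ == "__main__":
    print(find_by_selector(PAGE_V1, "btn-submit"))
    print(find_by_selector(PAGE_V2, "btn-submit"))
    print(find_by_intent(PAGE_V2, "Submit the order"))
```

The selector lookup stops working as soon as the id is renamed, while the description-driven lookup still lands on the order button. The same flexibility is also why a description-driven run is harder to make deterministic.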

RiddleRun fits teams that want AI-assisted coverage without a hosted testing SaaS and are comfortable owning the infrastructure. It fits less well when you need deterministic pass/fail logic tied to specific DOM states, full debugging traces of what the agent clicked and why, or a mature plugin ecosystem. The project is at an early public stage — community reports and issue history are sparse — so teams that hit an edge case are largely on their own until the maintainer responds. Teams requiring audit-grade test logs or multi-environment parallelism are the most likely to exhaust what this tool currently offers and move to a maintained Playwright framework or a commercial AI testing platform.