AIDiveForge

Self-Hosted Test Generation

As of June 2026, AIDiveForge tracks 4 self-hosted test generation tools. Listings are verified against each tool's live website and re-checked regularly.

Last updated June 9, 2026 · 4 tools

  1. Agent-QA

    The tool lets you write test steps in plain language — 'Click on the Create issue icon', 'Verify that the created issue is shown' — and an agent translates those into browser actions at runtime, reading visible labels and screen state instead of fragile CSS selectors. After each run, it builds execution memory: observations about navigation contracts, UI quirks, and previously healed steps, which get injected into future runs so the agent stops rediscovering the same UI patterns. Self-healing means that when a component shifts, the agent iterates through recovery attempts rather than failing immediately. The ceiling appears when test logic branches on conditional application state — the YAML authoring model is built for linear flows, and complex branching sends teams back to scripting.

    Paid · Open Source
  2. Bloom

    Bloom generates targeted evaluation suites for arbitrary behavioral traits.

    Free
  3. Catcher

    You describe tests in plain English, and Catcher's LLM-powered planner executes them in a real browser — no script authoring, no Selenium boilerplate. The vision-based fallback handles dynamic UIs where element selectors break, which is where most scripted test frameworks quietly start failing your CI. Because you supply the API key directly, LLM costs land on your own account — nothing is proxied through a vendor margin. The ceiling arrives when you need a test management dashboard, CI pipeline integrations, or a shared test artifact store across a team: the repo describes none of those, and you are building that infrastructure yourself.

    Free · Open Source
  4. Maced AI

    Maced deploys AI agents that crawl, fuzz, and attempt exploitation across your web apps, APIs, source code, and cloud infrastructure, then deliver audit-grade reports with proof-of-exploit payloads and merge-ready fix PRs. Every finding is auto-validated before it surfaces, which means triage queues shrink instead of growing. The continuous monitoring model means your attack surface is tested on every deploy, not just once a quarter. The ceiling shows up when your environment demands the kind of adversarial creativity a seasoned human tester brings to a novel business-logic flaw: agents that follow a structured probe loop will miss what only lateral thinking finds. Teams with that requirement use Maced for baseline coverage and point a human tester at the findings the agents flag as high-severity.

    Paid

Listings on this page are sourced and verified by the AIDiveForge data pipeline. AIDiveForge is editorially independent — no money changes hands for inclusion.