How We Rate AI Tools

AIDiveForge is an independent, curated hub for AI tools and workflow packs. Every listing on this site passes through an automated discovery and verification pipeline, and is reviewed by a human before it goes live. This page explains exactly how that works, so you can decide how much to trust what you read here.

How tools are discovered

New tools enter the system in two ways:

  • Daily automated discovery. An n8n workflow runs on a cron schedule and sweeps curated sources for new AI tool candidates.
  • On-demand submissions. A webhook endpoint accepts targeted submissions from editorial research and from reader tips.

Every candidate starts as an unpublished draft. Nothing goes live on discovery alone.
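
To make the flow concrete, here is a minimal sketch of the submission path: a webhook accepts a candidate and writes it as an unpublished draft. The endpoint, storage, and field names are illustrative, not the production n8n workflow:

```python
# Illustrative sketch only: a webhook that records candidates as
# unpublished drafts. Nothing is published at this stage.
from flask import Flask, request, jsonify
import sqlite3

app = Flask(__name__)

def save_draft(url: str, source: str) -> None:
    # Every candidate lands in the same place: an unpublished draft row.
    con = sqlite3.connect("candidates.db")
    con.execute(
        "CREATE TABLE IF NOT EXISTS drafts "
        "(url TEXT PRIMARY KEY, source TEXT, status TEXT)"
    )
    con.execute(
        "INSERT OR IGNORE INTO drafts VALUES (?, ?, 'unpublished')",
        (url, source),
    )
    con.commit()
    con.close()

@app.route("/submit", methods=["POST"])
def submit():
    payload = request.get_json(force=True)
    url = payload.get("url")
    if not url:
        return jsonify({"error": "url is required"}), 400
    save_draft(url, payload.get("source", "reader-tip"))
    return jsonify({"status": "draft created"}), 202
```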

How data is verified

Before a tool becomes a public listing, its homepage is fetched and checked for reachability. The actual page content — not a summary, not a cached description — is passed to a locally hosted large language model (Ollama running qwen2.5:7b), which extracts a defined set of structured fields:

  • Company or vendor name
  • Pricing model and visible price points
  • Free trial length in days, if offered
  • Supported platforms (web, macOS, Windows, Linux, iOS, Android, API)
  • Whether a public API is advertised
  • Open-source status

The model is explicitly instructed to extract only what is verifiable from the page text. If a field can’t be confirmed from the actual page, it’s left empty rather than guessed.
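
A sketch of what that extraction call can look like against a local Ollama instance on its default port; the prompt wording and field names here are illustrative, not the production prompt:

```python
# Illustrative extraction step: send the raw page text (not a summary)
# to a local Ollama instance and ask for verifiable fields only.
import json
import requests

FIELDS = [
    "company", "pricing_model", "price_points", "free_trial_days",
    "platforms", "public_api", "open_source",
]

def extract_fields(page_text: str) -> dict:
    prompt = (
        "Extract the following fields from this web page as JSON: "
        + ", ".join(FIELDS)
        + ". Only include values that are explicitly verifiable from the "
        "page text. If a field cannot be confirmed, leave it empty or "
        "null. Never guess.\n\nPAGE TEXT:\n" + page_text
    )
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "qwen2.5:7b",
            "prompt": prompt,
            "format": "json",   # ask Ollama for structured JSON output
            "stream": False,
        },
        timeout=120,
    )
    resp.raise_for_status()
    return json.loads(resp.json()["response"])
```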

Each listing stores an internal _adf_verified_fields list that records which fields were confirmed directly from the live website versus inferred or left blank. This is how we keep ourselves honest.
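
For illustration, a listing record might look like this; only the _adf_verified_fields key is taken from the description above, and every other name and value is hypothetical:

```python
# Hypothetical listing record. Only _adf_verified_fields is a real key
# from the pipeline; everything else is made up for illustration.
listing = {
    "name": "ExampleTool",
    "pricing_model": "freemium",
    "free_trial_days": "",           # could not be confirmed, left blank
    "open_source": True,
    "_adf_verified_fields": [        # confirmed directly from the live site
        "pricing_model",
        "open_source",
    ],
}
```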

When a tool claims to be open source, the assertion is checked independently against the GitHub API. If the claimed repository doesn’t exist or isn’t publicly accessible, the open-source flag is cleared regardless of what the vendor’s website says.
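
A minimal version of that check, assuming the listing stores a claimed repository in owner/repo form (the field and function names are hypothetical):

```python
# Sketch of the independent open-source check against the GitHub API.
import requests

def verify_open_source(claimed_repo: str | None) -> bool:
    """Return True only if the claimed repo exists and is public."""
    if not claimed_repo:
        return False
    resp = requests.get(
        f"https://api.github.com/repos/{claimed_repo}",
        headers={"Accept": "application/vnd.github+json"},
        timeout=30,
    )
    # GitHub returns 404 both for missing repos and for private repos the
    # caller can't see, so either way the flag is cleared.
    return resp.status_code == 200 and not resp.json().get("private", False)
```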

What we deliberately don’t include

A lot of AI directories publish performance numbers — accuracy scores, hallucination rates, tokens per second, cost per million tokens — that were generated by asking an LLM to guess. We don’t.

  • Performance metrics are kept only for tools with publicly documented benchmarks. If a vendor has not published real numbers in a citable place, the fields are cleared so no one is misled by AI-fabricated figures.
  • Unverified claims are not repeated. “Industry-leading,” “most accurate,” and “10x faster” don’t appear in our listings unless there’s a citation behind them.
  • No affiliate-padded rankings. Listings are not ordered by who pays us, because nobody pays us.

How freshness is maintained

Every listing records a last-updated timestamp. The discovery and verification pipeline re-runs on a schedule, re-fetching homepages and re-checking fields. When a vendor changes pricing, drops a platform, or goes offline, the listing reflects that on the next run.
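
A simplified shape of one refresh pass, reusing the extract_fields sketch from the verification section above (scheduling and retries live in the n8n pipeline and are omitted; the status value is illustrative):

```python
# Illustrative refresh pass: re-fetch the homepage, re-extract fields,
# and stamp last_updated. extract_fields is the sketch shown earlier.
import datetime
import requests

def refresh_listing(listing: dict) -> dict:
    try:
        page = requests.get(listing["homepage"], timeout=30)
        page.raise_for_status()
    except requests.RequestException:
        listing["status"] = "offline"   # vendor gone dark since last run
    else:
        listing.update(extract_fields(page.text))
    listing["last_updated"] = datetime.datetime.now(
        datetime.timezone.utc
    ).isoformat()
    return listing
```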

Tool preview cards (the 600×400 images you see on category pages) are rendered server-side by our own card service using the tool’s real favicon, so they stay in sync with the actual brand rather than drifting over time.
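
In outline, that rendering might look like the following, assuming Pillow and a favicon served at the conventional /favicon.ico path (the real card service’s layout and branding are not shown here):

```python
# Rough sketch of server-side card rendering with the tool's real favicon.
import io
import requests
from PIL import Image

def render_card(homepage: str, out_path: str) -> None:
    # Fetch the live favicon so the card tracks the actual brand.
    icon_resp = requests.get(homepage.rstrip("/") + "/favicon.ico", timeout=30)
    icon = Image.open(io.BytesIO(icon_resp.content)).convert("RGBA")
    icon = icon.resize((128, 128))

    card = Image.new("RGB", (600, 400), color=(24, 24, 32))  # 600x400 canvas
    card.paste(icon, ((600 - 128) // 2, (400 - 128) // 2), icon)
    card.save(out_path, format="PNG")
```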

The Editorial Rating: nine criteria, 1–10 each

On top of the verification pipeline above, every tool we’ve personally reviewed gets an Editorial Rating — a 1–10 score on each of nine criteria, displayed as a radar chart and a row of bars on the tool’s detail page. The overall score shown in big gradient text is the simple average of the nine. We publish the individual numbers so you can see where a tool is strong, not just whether it earned four stars.

  1. Ease of Use — How quickly a brand-new user can become productive.
  2. Output Quality — Accuracy, polish, and usefulness of generated results, measured against best-in-class.
  3. Pricing Value — What you get per dollar versus comparable tools.
  4. Feature Depth — Breadth and sophistication of capabilities (deep specialization counts).
  5. Documentation — Clarity, completeness, and freshness of official docs.
  6. Support — Responsiveness and quality of help channels.
  7. Integration Ecosystem — Native connectors, public APIs, and third-party reach.
  8. Performance — Speed, reliability, and uptime in real-world use; observed behavior is weighted above advertised benchmarks.
  9. Update Cadence — Frequency and substance of product updates as a leading indicator of viability.
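
The averaging itself is deliberately simple. A quick sketch with made-up scores:

```python
# The overall Editorial Rating is the arithmetic mean of the nine
# criterion scores (criterion names per the list above; values invented).
scores = {
    "ease_of_use": 8, "output_quality": 9, "pricing_value": 7,
    "feature_depth": 8, "documentation": 6, "support": 7,
    "integration_ecosystem": 8, "performance": 9, "update_cadence": 7,
}
overall = sum(scores.values()) / len(scores)
print(f"Overall: {overall:.1f}/10")   # -> Overall: 7.7/10
```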

Listings without an editorial rating are tools that have entered the directory through the verification pipeline but haven’t yet been hand-reviewed. We don’t score tools we haven’t actually used, and we don’t take payment for higher scores — sponsored placements (when they exist) are labeled as such and have zero influence on the rating.

Corrections and feedback

If you spot something wrong — outdated pricing, a broken link, a misclassified category, a tool listed as open source that isn’t — please tell us. Email hello@aidiveforge.com and we’ll fix it. Vendors are welcome to request corrections to their own listings; we don’t charge for that and we don’t require anything in return.

Last reviewed

This methodology was last reviewed in April 2026.