Open Source LLM Observability
As of June 2026, AIDiveForge tracks 5 open source LLM observability tools. Each project has a verified public source repository. Listings are verified against each tool's live website and re-checked regularly.
Last updated June 12, 2026 · 5 tools

1. AgentMeter
AgentMeter runs locally — no cloud sync, no account creation, no vendor dashboard to log into — and parses the tool calls, token counts, and caching splits that CLI agents like Claude Code, Gemini CLI, Codex CLI, and Copilot CLI generate. It surfaces the three-tier cost structure that prompt caching creates (input, cached-input, and output tokens each priced differently), which the raw API bill flattens into noise. The value-multiplier calculation compares API spend against estimated developer time saved, giving you a number to put in front of a manager. The wall appears when you need alerting, real-time budget enforcement, or integration with a team billing system — none of that is here.
Free · Open Source
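The three-tier pricing and the value multiplier described for AgentMeter come down to a few lines of arithmetic. The sketch below is illustrative only and is not AgentMeter's code: the `Rates` class, both function names, and the multiplier formula (estimated value of time saved divided by API spend) are assumptions, and any rates passed in are placeholders rather than real provider prices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """USD per million tokens. Hypothetical container; values are supplied by the caller."""
    input: float
    cached_input: float
    output: float


def session_cost(input_tokens: int, cached_input_tokens: int,
                 output_tokens: int, rates: Rates) -> float:
    """Price each token tier separately and return the total in USD."""
    total_per_million = (
        input_tokens * rates.input
        + cached_input_tokens * rates.cached_input
        + output_tokens * rates.output
    )
    return total_per_million / 1_000_000


def value_multiplier(api_spend_usd: float, hours_saved: float,
                     hourly_rate_usd: float) -> float:
    """Assumed formula: estimated value of developer time saved per dollar of API spend."""
    if api_spend_usd <= 0:
        raise ValueError("api_spend_usd must be positive")
    return (hours_saved * hourly_rate_usd) / api_spend_usd
```

Keeping the cached-input tier separate matters because cached tokens are typically billed at a different rate from fresh input, so a single blended rate would misstate the cost of cache-heavy sessions.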
2. Beacon
Beacon is an open-source endpoint telemetry layer that runs locally alongside AI agents, capturing prompts, tool calls, file modifications, and approval workflows before that activity goes unrecorded. It normalizes that telemetry and forwards it to SIEM platforms like Wazuh, Elastic, or Splunk, so security teams can apply the same detection logic they already run against the rest of the fleet. The architecture is self-hosted by design: no data leaves the endpoint unless you route it there yourself. The project is early-stage; the plugin ecosystem covers the major local agent harnesses, but gaps exist for less common runtimes. Teams whose agents are not yet on the supported list must write custom collector plugins, which means more surface area to maintain.
Free · Open Source
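To make the normalization step described for Beacon concrete, here is a minimal sketch that maps a raw agent event to one flat JSON line of the kind a SIEM can ingest. It does not reflect Beacon's actual schema or plugin API: the field names, the `KNOWN_KINDS` set, and `normalize_event` are all hypothetical.

```python
import json
from datetime import datetime, timezone

# Hypothetical event kinds, loosely based on the activity types listed above.
KNOWN_KINDS = {"prompt", "tool_call", "file_modification", "approval"}


def normalize_event(raw: dict, host: str) -> str:
    """Map a raw agent event to a single flat JSON line (assumed schema)."""
    kind = raw.get("kind")
    if kind not in KNOWN_KINDS:
        kind = "unknown"  # keep unrecognized events instead of dropping them
    timestamp = raw.get("ts")
    if timestamp is None:
        # Fall back to the collection time when the agent supplied none.
        timestamp = datetime.now(timezone.utc).isoformat()
    record = {
        "timestamp": timestamp,
        "host": host,
        "agent": raw.get("agent", "unknown"),
        "event_kind": kind,
        "detail": raw.get("detail", ""),
    }
    # Sorted keys keep the output stable, which simplifies downstream parsing rules.
    return json.dumps(record, sort_keys=True)
```

A fixed set of flat fields is what lets existing detection rules match agent activity the same way they match other endpoint logs.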
3. Flightdeck
Every LLM call, MCP event, and tool invocation your agents make streams to a live dashboard — per-agent timelines and a fleet-wide feed, not batched logs you dig through after the incident. The vendor describes token budgets and MCP allow/block rules you set before problems hit, plus the ability to issue live directives to running agents without restarting them. The self-hosted, Apache-2.0 model means no telemetry leaves your infrastructure — critical for teams in regulated environments or those burned by SaaS observability vendors billing by event volume. The project is early-stage by star count, and the operational surface you take on by self-hosting is real.
Free · Open Source
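The idea of setting token budgets and allow/block rules before problems hit can be sketched as a pre-call policy check. This is a generic illustration, not Flightdeck's configuration format or API: `AgentPolicy`, its fields, and the `check` method are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class AgentPolicy:
    """Hypothetical per-agent policy: a token ceiling plus a tool block list."""
    token_budget: int
    blocked_tools: set[str] = field(default_factory=set)
    tokens_used: int = 0

    def check(self, tool: str, tokens_requested: int) -> tuple[bool, str]:
        """Decide whether a call may proceed; record usage only when it is allowed."""
        if tool in self.blocked_tools:
            return False, f"tool '{tool}' is blocked"
        if self.tokens_used + tokens_requested > self.token_budget:
            return False, "token budget exceeded"
        self.tokens_used += tokens_requested
        return True, "ok"
```

Checking before the call, rather than reconciling afterwards, is what turns a budget from a report into an enforcement mechanism.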
4. Selvedge
Selvedge is a local MCP server that AI coding agents (Claude Code, Cursor, Copilot) call as they work, logging the reasoning behind every change into a SQLite file that lives next to your code under .selvedge/. Queries are entity-scoped — you ask about users.email or deps/stripe, not line numbers — so the answer surfaces in the same terms you search in. The vendor describes zero telemetry, no accounts, and no external servers; everything stays on disk. The wall appears when your team needs cross-repo provenance or wants to pipe this data into an existing observability stack — Selvedge emits records but does not integrate with those systems out of the box.
Free · Open Source
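Entity-scoped logging in a local SQLite file, as described for Selvedge, might look roughly like the sketch below. The table layout and the function names are assumptions made for illustration; Selvedge's real schema and MCP tool names are not documented here, and the example uses an in-memory database rather than a file under .selvedge/.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a store with a hypothetical 'changes' table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS changes ("
        " id INTEGER PRIMARY KEY,"
        " entity TEXT NOT NULL,"
        " reasoning TEXT NOT NULL,"
        " recorded_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn


def log_change(conn: sqlite3.Connection, entity: str, reasoning: str) -> None:
    """Record why an entity such as 'users.email' was changed."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO changes (entity, reasoning) VALUES (?, ?)",
            (entity, reasoning),
        )


def history(conn: sqlite3.Connection, entity: str) -> list[str]:
    """Return the logged reasoning for one entity, oldest first."""
    rows = conn.execute(
        "SELECT reasoning FROM changes WHERE entity = ? ORDER BY id",
        (entity,),
    )
    return [row[0] for row in rows]
```

Keying records on an entity name rather than a file and line number means the history survives refactors that move code around.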
5. Spanlens
Spanlens sits in front of your LLM provider via a single baseURL change, recording every call's cost, latency, tokens, and full request-response body with no SDK rewrite required. Agent runs surface as waterfall span trees so you can identify the one step consuming 80% of wall-clock time. The model recommender flags GPT-4o calls that look like classification tasks and shows the cost delta if you swap — with numbers from your own traffic, not benchmarks. The eval and experiment layer lets you replay a fixed dataset across prompt versions before you ship, so quality regressions don't surprise you in production. PII scanning and anomaly detection run at log time, which matters when sensitive data crosses the wire at 3 a.m. with nobody watching.
Paid · Open Source
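Finding the one step that dominates wall-clock time in a span tree, as described for Spanlens, reduces to comparing child durations against the root span. The sketch below is a generic illustration and not Spanlens's data model: the `Span` fields and the `slowest_step` function are hypothetical, and it only looks at direct children of the root.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    """Hypothetical span record with millisecond timestamps."""
    span_id: str
    parent_id: Optional[str]
    name: str
    start_ms: int
    end_ms: int

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


def slowest_step(spans: list[Span]) -> tuple[str, float]:
    """Return (name, share of root duration) for the longest direct child of the root."""
    root = next(s for s in spans if s.parent_id is None)
    children = [s for s in spans if s.parent_id == root.span_id]
    if not children or root.duration_ms <= 0:
        raise ValueError("need a root span with positive duration and at least one child")
    worst = max(children, key=lambda s: s.duration_ms)
    return worst.name, worst.duration_ms / root.duration_ms
```

For a run whose root span lasts 1000 ms and whose "retrieve" child lasts 800 ms, `slowest_step` reports "retrieve" with a share of 0.8.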
Listings on this page are sourced and verified by the AIDiveForge data pipeline. AIDiveForge is editorially independent — no money changes hands for inclusion.