AIDiveForge

Get This Tool

License: Apache-2.0 (any use, including commercial)
Local-run terms: Self-host via Docker Compose or install the Python sensor package; full source available under Apache-2.0

Flightdeck

Free · Open Source · Self-Hosted

Pricing

Model: Free

Summary

You launched three agents in production and have no idea which one made that expensive API call at 2am, or why it kept retrying. Flightdeck is a self-hosted observability and control plane built specifically for that gap.

Every LLM call, MCP event, and tool invocation your agents make streams to a live dashboard — per-agent timelines and a fleet-wide feed, not batched logs you dig through after the incident. The vendor describes token budgets and MCP allow/block rules you set before problems hit, plus the ability to issue live directives to running agents without restarting them. The self-hosted, Apache-2.0 model means no telemetry leaves your infrastructure — critical for teams in regulated environments or those burned by SaaS observability vendors billing by event volume. The project is early-stage by star count, and the operational surface you take on by self-hosting is real.

Bottom line: Pick this when you are running Claude Code or Python agents in production and need fleet-wide visibility without shipping your traces to a third party — but plan for operational overhead if your team has never run a self-hosted observability stack before.

Pros

  • Real-time per-agent timeline and fleet-wide feed, so you see which agent made which call as it happens rather than reconstructing the sequence from logs after a production incident.
  • Token budgets and MCP allow/block rules configurable before agents run, so a misconfigured agent hits a policy ceiling instead of draining your API budget overnight.
  • Live directive issuance to running agents, so you can redirect or constrain an agent mid-execution without tearing down and restarting the process.
  • Apache-2.0 license with full self-hosted deployment via Docker and Helm, so your agent traces and tool call data never leave your infrastructure. This matters for teams under data residency or compliance constraints.
  • Purpose-built for agent observability rather than adapted from generic APM tooling, so the data model matches what agents actually produce: LLM calls, MCP events, and tool invocations rather than HTTP spans and database queries.

Cons

  • The project carries a small community footprint and limited commit history, so edge-case debugging falls entirely on your team. If an ingestion pipeline drops events under high agent concurrency, there is no community thread to reference and no vendor support to call.
  • Self-hosting the full microservices stack (ingestion, workers, API, dashboard, sensor) means your platform team is responsible for uptime, upgrades, and failure recovery. Teams without dedicated infrastructure capacity can end up maintaining the observability layer instead of the product, which is the point at which managed SaaS alternatives such as LangSmith or Langfuse become worth evaluating.
  • No API surface is described in the available documentation, so building automated alerting pipelines or feeding fleet metrics into existing incident management tooling would likely require forking the project or building against undocumented internals.

About

Platforms: Docker, Python
API Available: No
Self-Hosted: Yes
Last Updated: 2026-06-12T23:39:19.519Z

Best For

Who it's for

  • Teams running multiple AI agents in production
  • Developers using Claude Code or Python-based agents
  • Self-hosted observability needs without external dependencies

What it does well

  • Real-time monitoring of LLM calls and tool usage by AI agents
  • Setting budgets and rules for production agent fleets
  • Live oversight and directive issuance to running agents

Integrations

Claude Code, Anthropic Python client

Frequently Asked Questions

Is Flightdeck free?
Yes — Flightdeck is fully free to use. There is no paid tier.
Is Flightdeck open source?
Yes. Flightdeck is open source.
Can I self-host Flightdeck?
Yes. Flightdeck supports self-hosting on your own infrastructure.
What platforms does Flightdeck support?
Flightdeck is available on: Docker, Python.

Flightdeck

When your agent fleet grows past one or two instances, the question stops being ‘did it run’ and becomes ‘what did it call, how many tokens did it burn, and which tool invocation triggered the retry loop at 3am.’ Flightdeck addresses this by ingesting every LLM call, MCP event, and tool call in real time and surfacing them as per-agent timelines and a live fleet-wide feed on a dashboard you run inside your own infrastructure. The repo describes a Docker quickstart and Helm charts; with the quickstart, the path from clone to running dashboard is presented as a single compose command, not a multi-day integration project.
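As a rough illustration of the two views described above, the sketch below orders a handful of events into a fleet-wide feed and groups them into per-agent timelines. It is not Flightdeck's code or schema: `AgentEvent`, its fields, and the helper names are hypothetical, chosen only to show the shape of the data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentEvent:
    """Hypothetical event record; field names are illustrative only."""
    timestamp: float   # seconds since some reference point
    agent_id: str
    kind: str          # "llm_call", "mcp_event", or "tool_call"
    tokens: int = 0    # tokens consumed, if the event is an LLM call


def fleet_feed(events):
    """Return every event ordered by time: the fleet-wide view."""
    return sorted(events, key=lambda e: e.timestamp)


def per_agent_timelines(events):
    """Group the time-ordered events by agent: the per-agent view."""
    timelines = defaultdict(list)
    for event in fleet_feed(events):
        timelines[event.agent_id].append(event)
    return dict(timelines)


def tokens_by_agent(events):
    """Sum token usage per agent, e.g. to spot the expensive one."""
    totals = defaultdict(int)
    for event in events:
        totals[event.agent_id] += event.tokens
    return dict(totals)


# Events arrive out of order from different agents.
events = [
    AgentEvent(3.0, "agent-b", "tool_call"),
    AgentEvent(1.0, "agent-a", "llm_call", tokens=1200),
    AgentEvent(2.0, "agent-b", "llm_call", tokens=300),
    AgentEvent(4.0, "agent-a", "mcp_event"),
]

for event in fleet_feed(events):
    print(event.timestamp, event.agent_id, event.kind)
```

The same list of events backs both views; the only difference is whether you read it in global time order or filtered down to one agent.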

The control plane side is what separates this from a logging aggregator. The docs describe token budgets and MCP allow/block rules you configure ahead of time, so an agent that starts hammering an expensive endpoint hits a wall rather than a surprise invoice. Beyond static rules, the vendor describes live directive issuance to agents already running — you intervene without a redeploy. For teams operating Claude Code agents or custom Python agents in production, this is the difference between watching a runaway process and stopping it.
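To make the budget and allow/block idea concrete, here is a minimal sketch of a guard that an agent wrapper could consult before each call. Everything in it (`AgentPolicy`, `PolicyViolation`, the tool names) is an assumption for illustration; Flightdeck's actual configuration format and enforcement path are not described in the material summarized here.

```python
from dataclasses import dataclass
from typing import Optional


class PolicyViolation(Exception):
    """Raised when a call would break a pre-configured rule."""


@dataclass
class AgentPolicy:
    """Hypothetical policy object, not Flightdeck's configuration format."""
    token_budget: int
    blocked_tools: frozenset = frozenset()
    allowed_tools: Optional[frozenset] = None  # None: anything not blocked is allowed
    tokens_used: int = 0

    def check_tool(self, tool_name: str) -> None:
        """Reject a tool call that is blocked or missing from the allow list."""
        if tool_name in self.blocked_tools:
            raise PolicyViolation(f"tool {tool_name!r} is blocked")
        if self.allowed_tools is not None and tool_name not in self.allowed_tools:
            raise PolicyViolation(f"tool {tool_name!r} is not on the allow list")

    def charge(self, tokens: int) -> int:
        """Record token usage and return the remaining budget.

        Raises before recording anything if the call would exceed the budget.
        """
        if self.tokens_used + tokens > self.token_budget:
            raise PolicyViolation(
                f"budget exceeded: {self.tokens_used} used + {tokens} requested "
                f"> {self.token_budget}"
            )
        self.tokens_used += tokens
        return self.token_budget - self.tokens_used


policy = AgentPolicy(token_budget=1000, blocked_tools=frozenset({"shell.exec"}))
policy.check_tool("search.web")   # allowed: returns without raising
remaining = policy.charge(600)    # 400 tokens of budget left
```

Because the guard is consulted on every call, lowering `token_budget` on the live object takes effect on the next call. That is one simple way to picture a directive reaching a running agent without a restart, though it says nothing about how Flightdeck itself delivers directives.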

Flightdeck fits teams that have already decided to self-host their observability stack and want something purpose-built for agents rather than bolted onto a generic APM tool. It is a weaker fit for teams that want a managed SaaS with no infrastructure to maintain: there is no hosted offering, and you own the uptime. With 80 commits and a small community footprint at the time of curation, the project carries the operational risk of any early-stage open-source tool: sparse documentation on edge cases, limited community troubleshooting resources, and the possibility of breaking changes between releases.

The repository structure includes separate services for ingestion, workers, a dashboard, an API layer, and a sensor — a microservices decomposition that gives you flexibility but means more moving parts to monitor. A Claude plugin and MCP registry integration are described in the repo structure, which aligns with the stated Claude Code use case.