Flightdeck
Pricing
- Model: Free
Summary
You launched three agents in production and have no idea which one made that expensive API call at 2am, or why it kept retrying. Flightdeck is a self-hosted observability and control plane built specifically for that gap.
Every LLM call, MCP event, and tool invocation your agents make streams to a live dashboard — per-agent timelines and a fleet-wide feed, not batched logs you dig through after the incident. The vendor describes token budgets and MCP allow/block rules you set before problems hit, plus the ability to issue live directives to running agents without restarting them. The self-hosted, Apache-2.0 model means no telemetry leaves your infrastructure — critical for teams in regulated environments or those burned by SaaS observability vendors billing by event volume. The project is early-stage by star count, and the operational surface you take on by self-hosting is real.
Bottom line: Pick this when you are running Claude Code or Python agents in production and need fleet-wide visibility without shipping your traces to a third party — but plan for operational overhead if your team has never run a self-hosted observability stack before.
Pros
- Real-time per-agent timeline and fleet-wide feed, so you see which agent made which call as it happens rather than reconstructing the sequence from logs after a production incident.
- Token budgets and MCP allow/block rules configurable before agents run, which means a misconfigured agent hits a policy ceiling instead of draining your API budget overnight.
- Live directive issuance to running agents, so you can redirect or constrain an agent mid-execution without tearing down and restarting the process.
- Apache-2.0 license with full self-hosted deployment via Docker and Helm, which means your agent traces and tool call data never leave your infrastructure — critical for teams under data residency or compliance constraints.
- Purpose-built for agent observability rather than adapted from generic APM tooling, so the data model matches what agents actually produce: LLM calls, MCP events, tool invocations — not HTTP spans and database queries.
Cons
- The project carries a small community footprint and limited commit history, which means edge-case debugging falls entirely on your team. When an ingestion pipeline drops events under high agent concurrency, there is no community thread to reference and no vendor support to call.
- Self-hosting the full microservices stack (ingestion, workers, API, dashboard, sensor) makes your platform team responsible for uptime, upgrades, and failure recovery. Teams without dedicated infrastructure capacity can end up maintaining the observability layer instead of the product, which is typically the point where they evaluate managed SaaS alternatives like LangSmith or Langfuse.
- No public API is described in the available documentation, even though the repository includes an API service. That leaves you unable to build automated alerting pipelines or integrate fleet metrics into your existing incident management tooling unless you fork the project or build against undocumented internals.
About
- Platforms: Docker, Python
- API Available: No
- Self-Hosted: Yes
- Last Updated: 2026-06-12T23:39:19.519Z
Best For
Who it's for
- Teams running multiple AI agents in production
- Developers using Claude Code or Python-based agents
- Self-hosted observability needs without external dependencies
What it does well
- Real-time monitoring of LLM calls and tool usage by AI agents
- Setting budgets and rules for production agent fleets
- Live oversight and directive issuance to running agents
Frequently Asked Questions
- Is Flightdeck free?
- Yes — Flightdeck is fully free to use. There is no paid tier.
- Is Flightdeck open source?
- Yes. Flightdeck is open source under the Apache-2.0 license.
- Can I self-host Flightdeck?
- Yes. Flightdeck supports self-hosting on your own infrastructure.
- What platforms does Flightdeck support?
- Flightdeck is available on: Docker, Python.
When your agent fleet grows past one or two instances, the failure modes stop being ‘did it run’ and start being ‘what did it call, how many tokens did it burn, and which tool invocation triggered the retry loop at 3am.’ Flightdeck addresses this by ingesting every LLM call, MCP event, and tool call in real time, surfacing each as a per-agent timeline and a live fleet-wide feed on a dashboard you run inside your own infrastructure. The Docker quickstart and Helm charts described in the repo mean the path from clone to running dashboard is a single compose command, not a multi-day integration project.
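To make the per-agent timeline and fleet-wide feed idea concrete, here is a minimal in-memory sketch. It is not Flightdeck's data model or API: `AgentEvent`, `FleetFeed`, and every field name are hypothetical, chosen only to illustrate how one stream of events can back both views.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class AgentEvent:
    # Hypothetical event shape, not Flightdeck's actual schema.
    agent_id: str
    kind: str          # e.g. "llm_call", "mcp_event", "tool_call"
    timestamp: float   # seconds since the epoch
    tokens: int = 0
    detail: str = ""


class FleetFeed:
    """Keeps one fleet-wide list plus a per-agent index of the same events."""

    def __init__(self) -> None:
        self._feed: List[AgentEvent] = []
        self._timelines: Dict[str, List[AgentEvent]] = defaultdict(list)

    def ingest(self, event: AgentEvent) -> None:
        # Each event is stored once in the fleet feed and once per agent.
        self._feed.append(event)
        self._timelines[event.agent_id].append(event)

    def timeline(self, agent_id: str) -> List[AgentEvent]:
        # Per-agent view, ordered by time.
        return sorted(self._timelines.get(agent_id, []), key=lambda e: e.timestamp)

    def fleet_feed(self) -> List[AgentEvent]:
        # Fleet-wide view, ordered by time across all agents.
        return sorted(self._feed, key=lambda e: e.timestamp)

    def tokens_by_agent(self) -> Dict[str, int]:
        # Answers "which agent burned the tokens" from the same data.
        return {a: sum(e.tokens for e in evs) for a, evs in self._timelines.items()}
```

With events from two agents ingested out of order, `timeline("agent-a")` returns only that agent's events in time order, while `fleet_feed()` interleaves both agents. A real deployment would persist and stream these events rather than hold them in memory.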
The control plane side is what separates this from a logging aggregator. The docs describe token budgets and MCP allow/block rules you configure ahead of time, so an agent that starts hammering an expensive endpoint hits a wall rather than a surprise invoice. Beyond static rules, the vendor describes live directive issuance to agents already running — you intervene without a redeploy. For teams operating Claude Code agents or custom Python agents in production, this is the difference between watching a runaway process and stopping it.
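The budget and allow/block behavior described above can be sketched as a small pre-flight check. This is an illustration under stated assumptions, not Flightdeck's configuration format or enforcement logic: `AgentPolicy`, `PolicyGate`, and the server names are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class AgentPolicy:
    # Hypothetical policy shape, set before the agent runs.
    token_budget: int
    # None means no allow list is configured, so only the block list applies.
    allowed_mcp_servers: Optional[FrozenSet[str]] = None
    blocked_mcp_servers: FrozenSet[str] = frozenset()


class PolicyGate:
    """Checks each call against the policy before it is allowed through."""

    def __init__(self, policy: AgentPolicy) -> None:
        self.policy = policy
        self.tokens_used = 0

    def check_llm_call(self, estimated_tokens: int) -> bool:
        # Refuse the call if it would push usage past the budget.
        if self.tokens_used + estimated_tokens > self.policy.token_budget:
            return False
        self.tokens_used += estimated_tokens
        return True

    def check_mcp_server(self, name: str) -> bool:
        # The block list wins over the allow list.
        if name in self.policy.blocked_mcp_servers:
            return False
        if self.policy.allowed_mcp_servers is not None:
            return name in self.policy.allowed_mcp_servers
        return True
```

The design choice worth noting is that a refused call leaves `tokens_used` unchanged, so a smaller request can still fit under the same ceiling. Live directives, as the vendor describes them, would amount to swapping or tightening the policy on a gate that is already running; how Flightdeck delivers them is not covered by this sketch.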
Flightdeck fits teams who have already made the decision to self-host their observability stack and want something purpose-built for agents rather than bolted onto a generic APM tool. It fits less well for teams who want a managed SaaS with no infrastructure to maintain: there is no hosted offering, and you own the uptime. With 80 commits and a small community footprint at the time of curation, the project carries the operational risk of any early-stage open-source tool: sparse documentation on edge cases, limited community troubleshooting resources, and the possibility of breaking changes between releases.
The repository structure includes separate services for ingestion, workers, a dashboard, an API layer, and a sensor — a microservices decomposition that gives you flexibility but means more moving parts to monitor. A Claude plugin and MCP registry integration are described in the repo structure, which aligns with the stated Claude Code use case.
