Is RunbookHermes free?

Yes — RunbookHermes is fully free to use. There is no paid tier.

Is RunbookHermes open source?

Yes. RunbookHermes is open source.

Does RunbookHermes have an API?

Yes. RunbookHermes exposes a developer API. See the official documentation at https://github.com/tommy-yw/runbookhermes for details.

Can I self-host RunbookHermes?

Yes. RunbookHermes supports self-hosting on your own infrastructure.

What platforms does RunbookHermes support?

RunbookHermes is available on: Linux, macOS, Docker, Kubernetes.

License: MIT (any use, including commercial)

Local-run terms: MIT-licensed source code may be freely used, modified, and distributed commercially or non-commercially, with retention of copyright and license notices.

RunbookHermes

Tags: Free, Open Source, API, Self-Hosted, Agentic

Pricing

Model: Free

Summary

Incident response falls apart when the gap between 'something is wrong' and 'we know why' takes longer than the outage itself — and most on-call tooling just pages people faster without doing the diagnosis work. RunbookHermes is an MIT-licensed AIOps agent that closes that gap by autonomously correlating metrics, logs, and traces, proposing evidence-backed remediation, and requiring a human sign-off before anything executes.

The agent runs multi-signal diagnosis across observability data, builds a root-cause hypothesis, and generates or updates runbooks from what it learns, so the next incident with the same failure pattern starts from a documented baseline instead of a blank slate. The approval-gated remediation workflow means automated action doesn't ship without a reviewer, which matters when the blast radius is a production service. Where it breaks: the repo is five commits deep with no closed issues, which signals early-stage software, not battle-hardened infrastructure. Teams with complex multi-service topologies will likely hit integration gaps before they hit the limits of the agent's reasoning. Self-hosting is required, so operationalizing this adds a deployment and maintenance surface your platform team owns.
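To make the "documented baseline" idea concrete, here is a minimal sketch of a runbook store keyed by a failure signature. It is an illustration only, not RunbookHermes code: the `Runbook` and `RunbookStore` names, the signature string format, and the in-memory dictionary are all assumptions made for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Runbook:
    """Hypothetical runbook record; the fields are illustrative only."""
    signature: str
    steps: list[str] = field(default_factory=list)
    incidents_seen: int = 0


class RunbookStore:
    """In-memory store keyed by a failure signature (an assumed concept)."""

    def __init__(self) -> None:
        self._by_signature: dict[str, Runbook] = {}

    def lookup(self, signature: str) -> Optional[Runbook]:
        # Return the existing baseline for this failure pattern, if any.
        return self._by_signature.get(signature)

    def learn(self, signature: str, steps: list[str]) -> Runbook:
        # Create the runbook on first sight, otherwise extend it in place.
        runbook = self._by_signature.setdefault(signature, Runbook(signature))
        for step in steps:
            if step not in runbook.steps:  # keep order, skip duplicates
                runbook.steps.append(step)
        runbook.incidents_seen += 1
        return runbook
```

With a store like this, the second incident that matches a signature would begin from the steps recorded during the first one, and only the new steps get appended.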

Bottom line: Pick RunbookHermes for an SRE team that wants an autonomous first-responder to triage and document incidents while a human stays in the loop — but expect to build integrations yourself if your observability stack is anything beyond what the repo ships with.

Categories: Agent Frameworks, Large Language Models

Added on June 9, 2026

Pros

Evidence-driven root-cause hypothesis before remediation is proposed, so the on-call engineer reviews a reasoned diagnosis instead of raw signal noise — which means sign-off decisions take seconds rather than requiring independent investigation.
Approval-gated execution model, so automated remediation actions cannot ship to production without a reviewer in the loop — which avoids the class of incidents caused by runaway automation acting on a misdiagnosis.
Runbook generation and learning from live incidents, so operational knowledge accumulates in structured documentation rather than living exclusively in the memory of whoever was paged — which matters when the person who handled the last incident is on vacation for the next one.
MIT license with full self-hosted deployment, so the agent and its incident data stay inside your own infrastructure — which removes the vendor-access and data-residency concerns that block AIOps adoption in regulated environments.
Multi-signal ingestion across metrics, logs, and traces, so the agent correlates evidence across observability layers rather than diagnosing from a single data source — which reduces false-positive root-cause conclusions from incomplete signal.

Cons

The repository has five commits and no closed issues, which means there is no public evidence of the agent performing correctly under real production incident load — teams that need a vetted tool before adoption will need to run their own failure-mode testing before trusting it on a live on-call rotation.
Integration coverage is bounded by what the observability MCP toolserver ships with; teams running Datadog, Honeycomb, or custom telemetry pipelines that fall outside that surface will write and maintain their own integration connectors — at which point they are owning a non-trivial piece of the agent's input layer.
There is no community or commercial support path documented in the repo; when the agent produces a wrong root-cause hypothesis or the approval workflow misbehaves at 3 AM, the escalation path is the GitHub repo and whatever institutional knowledge your team has built — teams that require SLA-backed support or vendor escalation will move to a commercial AIOps platform instead.

About

Platforms: Linux, macOS, Docker, Kubernetes
API Available: Yes
Self-Hosted: Yes
Last Updated: 2026-06-09T05:56:26.804Z

Best For

Who it's for

Organizations seeking autonomous incident response with human oversight
Teams wanting to reduce MTTR while maintaining safety
Engineering cultures that treat incidents as learning opportunities
Multi-service deployments with observability data integration
SRE and Platform Engineering teams

What it does well

Production incident response and root-cause analysis
Evidence-driven remediation with human approval gates
Automated runbook generation and SRE knowledge capture
Multi-signal incident diagnosis from metrics, logs, and traces
Team training on fault patterns and operational procedures

Integrations

Prometheus, Loki, Alertmanager, Feishu, WeCom, OpenAI-compatible model providers, Hermes Agent ecosystem

Incident diagnosis is the part of on-call that burns people out: pulling signals from three different dashboards at 2 AM, manually correlating a latency spike in traces with a log error from ten minutes earlier, then writing up a postmortem that nobody reads before the same pattern hits again. RunbookHermes addresses this as a Hermes-native AIOps agent: it autonomously ingests multi-signal observability data — metrics, logs, and traces — constructs an evidence-driven root-cause hypothesis, proposes a remediation action, and waits for a human to approve before executing. The runbook it produces from that incident becomes the starting point for the next one.
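The manual correlation described above (a latency spike in traces matched against a log error from ten minutes earlier) can be sketched as a simple time-window filter. This is a hedged illustration of the general technique, not the agent's actual algorithm: the `Signal` type, the `correlate` function, and the 15-minute default lookback are assumptions introduced here.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    """Hypothetical normalized observation from one observability layer."""
    source: str       # "metrics", "logs", or "traces"
    timestamp: float  # Unix seconds
    summary: str


def correlate(anchor: Signal, candidates: list[Signal],
              lookback_s: float = 900.0) -> list[Signal]:
    """Return signals from other sources seen within lookback_s before the anchor.

    Results are ordered oldest first, so the earliest possible cause leads.
    """
    window_start = anchor.timestamp - lookback_s
    related = [
        s for s in candidates
        if s.source != anchor.source
        and window_start <= s.timestamp <= anchor.timestamp
    ]
    return sorted(related, key=lambda s: s.timestamp)


# Example data (made up): a trace latency spike, a log error ten minutes
# earlier, and an unrelated metric event from well outside the window.
spike = Signal("traces", 1_700_000_600.0, "p99 latency spike on /checkout")
log_error = Signal("logs", 1_700_000_000.0, "connection pool exhausted")
old_metric = Signal("metrics", 1_699_990_000.0, "routine deploy marker")

related = correlate(spike, [log_error, old_metric])
```

A real agent would need far more than timestamp proximity (service topology, causality, deduplication), but the window filter shows why having all three signal types in one place matters.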

The defining feature is the approval-gated remediation loop. The agent does not act autonomously end-to-end — it reasons and proposes, then the reviewer decides. This is architecturally meaningful for production environments where autonomous execution without oversight is a liability, not a feature. Combined with runbook learning, the system is designed to accumulate operational knowledge from real incidents rather than requiring an SRE team to maintain documentation separately from the work that generates it.
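The propose-then-approve loop can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's implementation: `Proposal`, `Decision`, `run_gated`, and the callback signatures are hypothetical names, and the rule that a proposal without evidence is never sent for review is a design choice made for this example.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Decision(Enum):
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Proposal:
    """Hypothetical remediation proposal produced by the diagnosis step."""
    hypothesis: str
    evidence: list[str]
    action: str


def run_gated(proposal: Proposal,
              ask_reviewer: Callable[[Proposal], Decision],
              execute: Callable[[str], None]) -> bool:
    """Execute the proposed action only after an explicit human approval.

    Returns True if the action ran, False otherwise.
    """
    if not proposal.evidence:
        # Assumed policy: nothing without evidence reaches the reviewer.
        return False
    if ask_reviewer(proposal) is not Decision.APPROVED:
        return False
    execute(proposal.action)
    return True
```

The key property is that `execute` is unreachable without a `Decision.APPROVED` value coming back from the reviewer callback, so a misdiagnosis costs a rejected proposal and not a production change.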

RunbookHermes fits SRE and platform engineering teams who want to reduce mean time to resolution without removing human judgment from the remediation step. The repo includes a TUI gateway, web interface, observability MCP toolserver, ACP adapter, and plugin/skills architecture — indicating a modular design that supports extension. What the repo does not yet show is a track record at scale: with five commits and no closed issues, the gap between the architectural intent and production-hardened behavior is unknown. Teams running heterogeneous observability stacks should audit the integrations directory carefully before committing to this as a production dependency.

The project ships with Docker support, a Nix environment, Homebrew packaging, and an example environment configuration, so the self-hosting path is documented. The observability MCP toolserver in the repo is the integration surface for connecting the agent to live telemetry — the vendor describes this as Hermes-native, meaning the agent framework is purpose-built around this tool rather than layered on top of a generic agent SDK.
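For teams auditing or extending the telemetry integrations, it may help to see what a connector ultimately has to request from the listed backends. The sketch below builds query URLs for the public Prometheus HTTP API (`/api/v1/query`) and the Loki HTTP API (`/loki/api/v1/query_range`). It does not reflect how the RunbookHermes toolserver is written; the function names and the example hostnames are assumptions, and no network call is made.

```python
from urllib.parse import urlencode


def prometheus_instant_query_url(base_url: str, promql: str, at_unix_s: int) -> str:
    """Build a Prometheus instant-query URL evaluated at a Unix timestamp."""
    params = urlencode({"query": promql, "time": at_unix_s})
    return f"{base_url.rstrip('/')}/api/v1/query?{params}"


def loki_range_query_url(base_url: str, logql: str,
                         start_unix_s: int, end_unix_s: int,
                         limit: int = 100) -> str:
    """Build a Loki range-query URL; start and end are sent as nanoseconds."""
    params = urlencode({
        "query": logql,
        "start": start_unix_s * 1_000_000_000,
        "end": end_unix_s * 1_000_000_000,
        "limit": limit,
    })
    return f"{base_url.rstrip('/')}/loki/api/v1/query_range?{params}"


# Hypothetical in-cluster hostnames, used only for illustration.
prom_url = prometheus_instant_query_url("http://prometheus:9090/", "up", 1_700_000_000)
loki_url = loki_range_query_url(
    "http://loki:3100", '{app="checkout"}', 1_700_000_000, 1_700_000_600
)
```

A stack built on other backends would need an equivalent request builder per backend, which is the maintenance cost the Cons section points at.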
