License: AGPL-3.0. Commercial use is permitted; derivatives must share the license.

Local-run terms: Honcho's open-source codebase is licensed under GNU AGPL v3.0, a strong copyleft license. If you offer a modified or derived version as a networked service, you must make the complete source code of that work available to its users under the AGPL as well.

Honcho

Freemium, Open Source, API, Self-Hosted

Summary

Most agent memory systems are glorified key-value stores — they retrieve what you wrote, not what it means, and your context window fills with facts the user mentioned once six weeks ago. Honcho is a reasoning layer that sits between your agent and its memory, built to infer patterns and surface the context that actually matters.

Every message written to Honcho triggers automatic reasoning via the vendor's Neuromancer model, which learns user psychology and behavioral patterns rather than just indexing text. The `context()` call returns a curated summary plus conversation history shaped to a token budget you set; the vendor claims 60–90% token reduction versus naive retrieval. Multi-participant sessions model each peer separately, so a group conversation doesn't collapse everyone's state into one blob. The limit is scope: Honcho does not run tasks, make decisions, or coordinate agents; it only informs them. Teams building full autonomous pipelines still wire Honcho into a separate orchestration layer.
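To make the token-budget idea concrete, here is a minimal sketch of budget-shaped context assembly. It is not Honcho's implementation or SDK: `pack_context`, `estimate_tokens`, and the word-count token estimate are assumptions made for illustration. The sketch keeps a summary first, then adds the most recent messages until the budget is spent.

```python
from typing import List


def estimate_tokens(text: str) -> int:
    """Crude token estimate (assumption): one token per whitespace-separated word."""
    return len(text.split())


def pack_context(summary: str, messages: List[str], budget: int) -> List[str]:
    """Return the summary plus as many of the newest messages as fit in `budget`.

    Hypothetical helper for illustration only; a real system would use the
    model's tokenizer and rank by relevance, not just recency.
    """
    used = estimate_tokens(summary)
    if used > budget:
        raise ValueError("summary alone exceeds the token budget")
    kept: List[str] = []
    # Walk from newest to oldest so recent turns win when space is short.
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    # Restore chronological order before returning.
    return [summary] + kept[::-1]


history = [
    "I prefer morning meetings",
    "My dog is called Biscuit",
    "Please book a table for Friday",
]
context = pack_context("User likes concise answers", history, budget=16)
print(context)
```

With a budget of 16 estimated tokens, the oldest message is dropped and the summary plus the two newest messages are returned in chronological order.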

Bottom line: Pick Honcho for a personal assistant that should get smarter with every conversation; plan a separate execution layer if your agent needs to do more than remember.

Pricing Plans

Usage-Based. Last verified 2 days ago.

Free Tier: Standard dreaming included with every workspace

HONCHO MEMORY - Ingestion

per month

Store + Neuromancer reasoning

$2.00/M tokens
context() with no limits
~200ms response time

HONCHO REASONING - Minimal

per month

Basic conclusions, instant

$0.001 per query
Single semantic search
One lookup

HONCHO REASONING - Low

per month

Efficient synthesis, instant

$0.01 per query
Conclusions with surrounding context
Default tier

HONCHO REASONING - Medium

per month

Steerable reasoning, fast

$0.05 per query
Multiple searches
Directed synthesis

HONCHO REASONING - High

per month

Deep synthesis, async

$0.10 per query
Multi-pass analysis
Patterns over time

HONCHO REASONING - Max

per month

Research-grade, async

$0.50 per query
Exhaustive full history search
Quantitative methods

STARTUPS (<$5M RAISED)

Custom

Subsidized startup program

$1,000 in credits
12 months subsidized pricing
Integration support

ENTERPRISE

Custom

Custom plans with dedicated support

Custom pricing
Forward-deployed engineers
Dedicated integration and maintenance support

View full pricing on honcho.dev →

Pricing may have changed since last verified. Check the official site for current plans.
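Because every line item above is usage-based, a rough monthly estimate is simple arithmetic. The sketch below uses only the prices listed in this section; the function name `estimate_monthly_cost` and the example volumes are hypothetical, and the listed prices may be out of date.

```python
from decimal import Decimal
from typing import Dict

# Prices as listed in this section (USD); verify against the official site.
INGESTION_PER_MILLION_TOKENS = Decimal("2.00")
QUERY_PRICE: Dict[str, Decimal] = {
    "minimal": Decimal("0.001"),
    "low": Decimal("0.01"),
    "medium": Decimal("0.05"),
    "high": Decimal("0.10"),
    "max": Decimal("0.50"),
}


def estimate_monthly_cost(ingested_tokens: int, queries: Dict[str, int]) -> Decimal:
    """Sum ingestion cost and per-query reasoning cost for one month."""
    ingestion = INGESTION_PER_MILLION_TOKENS * Decimal(ingested_tokens) / Decimal(1_000_000)
    reasoning = sum(
        (QUERY_PRICE[tier] * count for tier, count in queries.items()),
        Decimal("0"),
    )
    return ingestion + reasoning


# Example volumes are made up: 5M ingested tokens, 20,000 Low and 500 High queries.
cost = estimate_monthly_cost(5_000_000, {"low": 20_000, "high": 500})
print(f"${cost:.2f}")
```

For these example volumes the breakdown is $10.00 for ingestion, $200.00 for Low queries, and $50.00 for High queries, which shows how quickly per-query reasoning can dominate ingestion.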

Added on June 9, 2026

Pros

Reasoning-first memory via the Neuromancer model infers behavioral patterns rather than returning raw stored text, so agents stop re-asking questions the user already answered three sessions ago.
Token budget enforcement on `context()` means you get the 10K tokens that matter instead of dumping 100K of history into every prompt, which keeps per-call costs from compounding as conversation history grows.
Multi-peer session modeling keeps each participant's state separate, so a group conversation doesn't corrupt individual user context — something flat key-value stores cannot express at all.
AGPL-3.0 licensing with a self-hosted FastAPI deployment path means teams with data residency requirements can run the full stack on their own infrastructure rather than routing user data through a third-party cloud.
Provider-agnostic design means swapping the underlying LLM for a cheaper or on-premises model is a configuration change, not a migration — protecting the investment when model pricing shifts.

Cons

Honcho is memory infrastructure, not an execution engine — it has no task runner, no branching logic, and no agent coordination. Teams that start with Honcho and then need agents to act on remembered context still build a full orchestration layer on top, at which point Honcho is one dependency among several rather than a standalone solution.
AGPL-3.0 licensing blocks commercial products from embedding Honcho without open-sourcing their own code or negotiating a separate commercial license. Teams building proprietary SaaS that want to bundle memory infrastructure discover this constraint when legal reviews the dependency, and some switch to MIT-licensed alternatives or vendor-specific memory APIs instead.
The deeper `.chat()` reasoning tiers carry per-call cost that scales with usage — for high-volume applications making frequent on-demand reasoning calls, cost modeling must happen before production, not after traffic grows.
Neuromancer, the reasoning model that powers Honcho's memory, is a Plastic Labs proprietary model. Teams that need full auditability of every inference step in memory construction — regulated industries, for instance — cannot inspect or reproduce that reasoning without the vendor's cooperation.

About

Platforms: Python and TypeScript SDKs; integrations with Claude Code, OpenCode, Cursor, Hermes Agent, OpenClaw
API Available: Yes
Self-Hosted: Yes
Last Updated: 2026-06-09T05:26:26.124Z

Best For

Who it's for

Teams building personal AI assistants where relationship deepens with use
Applications requiring multi-participant sessions with cross-peer modeling
Projects prioritizing reasoning-first memory over vector similarity search
Developers wanting self-hostable infrastructure with open-source availability

What it does well

Building stateful AI agents that remember user preferences and context across sessions
Multi-agent and multi-peer conversations with separate modeling of each participant
Extracting and reasoning about user psychology and behavior patterns from conversations
Reducing token consumption in long-running LLM conversations with intelligent context management
Personalizing agent responses based on accumulated understanding rather than simple retrieval

Integrations

MCP, Claude Code, OpenCode, OpenClaw, Hermes, Cursor-compatible clients

Frequently Asked Questions

Is Honcho free? The hosted API is usage-based and paid. The pricing section lists standard dreaming as included with every workspace, and the AGPL-3.0 codebase can be self-hosted.
Is Honcho open source? Yes. Honcho is open source under AGPL-3.0.
Does Honcho have an API? Yes. Honcho exposes a developer API. See the official documentation at https://honcho.dev for details.
Can I self-host Honcho? Yes. Honcho supports self-hosting on your own infrastructure.
What platforms does Honcho support? Honcho offers Python and TypeScript SDKs, plus integrations with Claude Code, OpenCode, Cursor, Hermes Agent, and OpenClaw.

Persistent agent memory that degrades into irrelevant context dumps is the problem Honcho is designed around. The core workflow is two steps: write messages to Honcho and call `context()` when your agent needs state. Behind that call, the vendor’s Neuromancer reasoning model has already processed incoming messages, extracted behavioral hypotheses, and ranked what to return within your token budget — the vendor states this happens in approximately 200ms, fast enough to run on every turn. The API is model-agnostic, supporting OpenAI, Anthropic, and custom model backends.
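The two-step workflow described above (write messages, then read state before each model call) can be sketched with a stand-in memory object. Everything here is hypothetical: `InMemoryMemory`, its `write` method, and `run_turn` are illustrative names, not the Honcho SDK, and the stand-in performs no reasoning. The model is passed in as a plain callable to mirror the model-agnostic design.

```python
from typing import Callable, List


class InMemoryMemory:
    """Hypothetical stand-in for a memory service: storage only, no reasoning."""

    def __init__(self) -> None:
        self._messages: List[str] = []

    def write(self, role: str, text: str) -> None:
        self._messages.append(f"{role}: {text}")

    def context(self, max_messages: int = 10) -> str:
        # A real service would return curated, reasoned context here.
        return "\n".join(self._messages[-max_messages:])


def run_turn(memory: InMemoryMemory, user_text: str, model: Callable[[str], str]) -> str:
    """One agent turn: write the user message, read state, call the model."""
    memory.write("user", user_text)   # step 1: write
    prompt = memory.context()         # step 2: read state
    reply = model(prompt)
    memory.write("assistant", reply)  # keep the reply for later turns
    return reply


def echo_model(prompt: str) -> str:
    """Toy backend that reports how many lines of context it received."""
    return f"seen {len(prompt.splitlines())} line(s)"


memory = InMemoryMemory()
first = run_turn(memory, "hello", echo_model)
second = run_turn(memory, "remember me?", echo_model)
print(first, "|", second)
```

Swapping `echo_model` for a function that calls any LLM provider leaves `run_turn` unchanged, which is the point of keeping the memory layer separate from the model backend.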

The differentiating bet is reasoning over retrieval. Where vector-search memory systems return chunks that are semantically close to the query, Honcho’s docs describe a system that reasons toward conclusions — patterns across interactions, inferences about user psychology, hypotheses tested against new data. The vendor’s own benchmark page reports state-of-the-art scores on LoCoMo (89.9%), LongMem S (90.4%), and BEAM at context lengths up to 10M tokens, with published evals verifiable on GitHub. The `chat()` method provides on-demand deeper reasoning at tiered cost and latency when a single `context()` call isn’t enough.
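Since deeper reasoning trades cost and latency for depth, an application may want a small routing rule for which tier to request. The sketch below is application-side logic built from the tier names, prices, and latency labels in the pricing section; `choose_tier` is a hypothetical helper, and how a tier is actually selected through the SDK is not covered by this listing.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Tier:
    name: str
    price_per_query: float  # USD, as listed in the pricing section
    latency: str            # "instant", "fast", or "async"


# Ordered from shallowest to deepest reasoning.
TIERS: List[Tier] = [
    Tier("minimal", 0.001, "instant"),
    Tier("low", 0.01, "instant"),
    Tier("medium", 0.05, "fast"),
    Tier("high", 0.10, "async"),
    Tier("max", 0.50, "async"),
]


def choose_tier(needed: str, can_wait: bool) -> Tier:
    """Return the requested tier, or the deepest synchronous tier if the
    caller cannot wait for an async result."""
    names = [tier.name for tier in TIERS]
    if needed not in names:
        raise ValueError(f"unknown tier: {needed}")
    wanted = TIERS[names.index(needed)]
    if wanted.latency != "async" or can_wait:
        return wanted
    synchronous = [tier for tier in TIERS if tier.latency != "async"]
    return synchronous[-1]


print(choose_tier("high", can_wait=False).name)
print(choose_tier("high", can_wait=True).name)
```

An interactive chat turn that cannot block would be routed to Medium here, while a background job could request High or Max and accept the asynchronous response.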

Honcho fits cleanly into personal assistant applications, coaching tools, and any system where the relationship between user and agent is meant to deepen with use. Multi-peer session support models each participant separately, which the docs describe as enabling cross-peer reasoning without state collisions. The hard boundary: Honcho provides memory infrastructure, not execution. It does not run workflows, branch on conditions, or coordinate agents autonomously — teams that need those capabilities integrate Honcho as one component in a larger stack.
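The idea of modeling each participant separately can be illustrated with a small data structure. This is a sketch under stated assumptions, not Honcho's data model: the `Session` class and its methods are hypothetical, and it only shows why keying state by peer avoids the collisions a single shared log would cause.

```python
from collections import defaultdict
from typing import Dict, List


class Session:
    """Hypothetical multi-participant session that files messages per peer."""

    def __init__(self) -> None:
        self._by_peer: Dict[str, List[str]] = defaultdict(list)

    def add_message(self, peer_id: str, text: str) -> None:
        self._by_peer[peer_id].append(text)

    def peer_view(self, peer_id: str) -> List[str]:
        # Return a copy so callers cannot mutate another peer's state.
        return list(self._by_peer.get(peer_id, []))


session = Session()
session.add_message("alice", "I'm vegetarian")
session.add_message("bob", "I love steak")
session.add_message("alice", "Allergic to nuts")

print(session.peer_view("alice"))
print(session.peer_view("bob"))
```

In a flat shared store, "vegetarian" and "love steak" would sit in one blob with no owner; here each preference stays attached to the peer who stated it.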

The tool ships as AGPL-3.0 open source with self-hosted deployment via FastAPI, and a managed cloud API at api.honcho.dev for teams that don’t want to run infrastructure. Pre-built integrations exist for Claude Code (persistent memory plugin), OpenClaw (WhatsApp, Telegram, Discord, Slack gateways), and the NousResearch Hermes agent. A CLI and SDK packages for Python and TypeScript are available.
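Choosing between the managed API and a self-hosted deployment usually comes down to which base URL a client points at. The following sketch assumes a hypothetical environment variable, `MEMORY_BASE_URL`, and assumes the managed endpoint is served over HTTPS; neither detail is confirmed by this listing, so check the official documentation for the real configuration names.

```python
import os
from typing import Mapping, Optional

# Managed endpoint host from this listing; the https scheme is an assumption.
MANAGED_BASE_URL = "https://api.honcho.dev"


def resolve_base_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the self-hosted URL if configured, else the managed endpoint.

    `MEMORY_BASE_URL` is a hypothetical variable name used for illustration.
    """
    source = os.environ if env is None else env
    override = source.get("MEMORY_BASE_URL", "").strip()
    if override:
        # Drop a trailing slash so paths can be appended consistently.
        return override.rstrip("/")
    return MANAGED_BASE_URL


print(resolve_base_url({}))
print(resolve_base_url({"MEMORY_BASE_URL": "http://localhost:8000/"}))
```

Keeping the endpoint in configuration means the same application code can move from the managed cloud to self-hosted infrastructure without changes.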
