AIDiveForge

License: MIT (any use, including commercial)
Local-run terms: The MIT-licensed MCP server and agent plugins can be self-hosted and run locally. However, the core memory engine and managed API are proprietary; full self-hosting of the complete system requires an enterprise agreement. Users can run the open-source MCP server component against a Supermemory API key for local development.

Supermemory

Freemium, Open Source, API, Agentic

Summary

In every new conversation, your agent starts from scratch, with no memory of what the user prefers, what they corrected last time, or what they asked three sessions ago. Supermemory is the context infrastructure layer that fixes that.

Supermemory wraps memory, retrieval, user profiling, data connectors, and document extraction into one API so your agent doesn't reassemble context from scratch on every request. The retrieval layer claims sub-300ms latency using hybrid search with reranking, and the memory layer maintains a knowledge graph that merges contradictions and evolves facts over time rather than appending chunks blindly. Connectors to Slack, Notion, Drive, Gmail, GitHub, and S3 sync automatically — no ETL pipeline to maintain. The core memory engine is proprietary and hosted-only; self-hosting requires an enterprise agreement, so teams with strict data residency requirements hit a wall before they ship.

Bottom line: Pick this when you need persistent, evolving user profiles and cross-session memory without building the retrieval infrastructure yourself — but plan a different path if your compliance posture requires the memory engine on your own infrastructure.

Hosted & API Pricing

Only the MIT-licensed MCP server and agent plugins are free to self-host. These are the creator's hosted/API options.

Memory

via Supermemory
$0.01/1K SM tokens

Memory graph per user. Auto profiles and fact hierarchies so agents learn in real-time.

  • Plain text: $0.005
  • Rich content: $0.010
  • 2× cheaper than next-best, with better quality

SuperRAG

via Supermemory
$0.001 to $0.002/1K SM tokens (listed as $0.00, apparently rounded)

Multimodal Extraction -> Contextual Chunking -> Retrieval for agents. No embeddings or vectors required.

  • Text mode: $0.001
  • Rich mode: $0.002
  • Available as a filesystem

Search and Traversal

via Supermemory
$0.01/1K queries

Insanely cheap semantic search and graph traversal against your content.

  • Hybrid search: RAG+Memory in one call
  • Graph traversal across linked memories
  • Configurable filters and re-ranking
  • Sub-300ms p50
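
The listing does not say how the hybrid search combines its result sets. As a rough illustration only, one widely used way to merge a keyword ranking with a vector ranking is reciprocal rank fusion. The sketch below shows that generic technique, not Supermemory's implementation; the function name, the document ids, and the constant k = 60 are illustrative assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document ids into a single ranking."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes more for documents it ranks higher.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id so the order is stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a keyword index and a vector index.
keyword_hits = ["doc-a", "doc-b", "doc-c"]
vector_hits = ["doc-b", "doc-d", "doc-a"]
print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
```

Documents that appear in both lists (doc-a and doc-b here) rise above documents that appear in only one, which is the behavior a hybrid retriever is usually after. A separate re-ranking step would then reorder the fused candidates.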

Operations

via Supermemory
$0.10/1K operations

Additional operations for API calls

  • Re-ranking
  • Aggregation
  • Query rewriting
  • Other operations

Pricing may have changed since last verified. Check the official site for current plans.
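
To make the per-unit rates above concrete, the following sketch estimates a monthly usage bill from unit counts. The rates are copied from this listing, using the rich-content and rich-mode rates as upper bounds. The function name, the workload figures, and the assumption that the itemized rates are per 1K SM tokens are illustrative and not vendor-confirmed.

```python
from decimal import Decimal

# Rates as shown on this listing, in USD per 1,000 units.
# Assumption: itemized rates (e.g. "Rich content: $0.010") use the same
# per-1K-SM-token unit as the headline prices; the listing does not say so.
RATES_PER_1K = {
    "memory_tokens": Decimal("0.010"),    # Memory, rich content
    "superrag_tokens": Decimal("0.002"),  # SuperRAG, rich mode
    "search_queries": Decimal("0.01"),    # Search and Traversal
    "operations": Decimal("0.10"),        # Operations
}


def estimate_monthly_cost(usage: dict[str, int]) -> Decimal:
    """Return the estimated monthly usage cost in USD for the given unit counts."""
    total = Decimal("0")
    for name, units in usage.items():
        total += RATES_PER_1K[name] * Decimal(units) / Decimal(1000)
    return total


# Hypothetical workload, chosen only to illustrate the arithmetic.
example_usage = {
    "memory_tokens": 2_000_000,
    "superrag_tokens": 10_000_000,
    "search_queries": 500_000,
    "operations": 100_000,
}
cost = estimate_monthly_cost(example_usage)
print(f"Estimated usage: ${cost:.2f}/mo")
```

Under these assumed volumes the estimate comes to $55.00 per month, which exceeds the $5/mo of usage built into the free tier but fits within the roughly $130/mo included in the Max plan.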

Pricing Plans

Subscription · Last verified 2 days ago
Price
$0 - $399+/mo
Free Tier
$0/mo with $5/mo of usage built in

Free

Free

For builders tinkering, prototypes and sideprojects.

  • $5/mo of usage built in
  • Hermes Plugin
  • Supermemory MCP
  • Community support

Max

$100 per month

More headroom for developers who need it.

  • ~$130/mo of usage built in (6× Pro)
  • Unlimited storage
  • Unlimited users
  • Gmail connector
  • Granola connector
  • Auto top-up available
  • OpenClaw, Claude Code and other plugins
  • Priority support

Scale

$399 per month

For teams running production workloads.

  • ~$600/mo of usage built in
  • Unlimited storage
  • Unlimited users
  • Up to 10 teammates
  • All connectors (Gmail, GitHub, S3, Web Crawler + Pro)
  • Auto top-up + spend caps
  • Priority support
  • SOC 2
  • HIPAA BAA
  • Self-hosted option

Enterprise

Custom

For organizations with committed spend, custom deployments, and security requirements.

  • Custom pricing
  • Air-gapped self-hosting
  • Dedicated managed instance
  • SOC 2
  • HIPAA
  • GDPR
  • Custom contracts & DPA
  • Unlimited usage
  • Dedicated account manager
  • Forward Deployed engineer
  • 1:1 onboarding & integration
  • Uptime SLA
  • Priority Slack channel

Strengths

  • Knowledge graph memory that merges facts and resolves contradictions across sessions, which means your agent doesn't tell a user something they already corrected two conversations ago.
  • Vendor-claimed sub-300ms (p50) hybrid search with reranking baked into the retrieval layer, so you avoid building and tuning a separate retrieval pipeline to hit production latency targets.
  • Persistent user profiles that carry preference, behavior, and identity context across sessions, which means a support agent or personalized chatbot doesn't reset its understanding of the user on every ticket.
  • Real-time connectors to Slack, Notion, Drive, Gmail, GitHub, and S3 with automatic sync, so your agent's memory reflects live changes in the tools your users actually work in, with no manual import jobs to maintain.
  • Multi-format extraction for PDFs, web pages, images, and audio consolidated into one provider, which means you don't wire together separate parsing services before you can ingest mixed document types.

Limitations

  • The core memory engine is not self-hostable without an enterprise agreement. Teams with data residency requirements or strict policies against sending user memory to a third-party managed service cannot deploy this in production without negotiating a contract first, and most either wait on procurement or replace the memory layer with a self-managed vector store.
  • The knowledge graph and memory update logic are proprietary and closed; when retrieval behaves unexpectedly (returning stale facts or failing to surface a contradiction), there is no source code to inspect. Teams debugging production retrieval issues work from API responses and vendor support, not from the system itself.
  • The free tier includes only $5/mo of usage, so a team validating the tool at scale will exhaust it before they have enough production data to make a confident architecture decision, and cost exposure begins before the build is complete.
  • Agent frameworks that manage their own memory or context windows require explicit integration work to hand off to Supermemory rather than their native store; teams already deep in a framework with memory primitives (LangGraph, for example) often find the integration layer adds complexity that exceeds the benefit for their specific architecture and abandon Supermemory in favor of the framework's native memory tooling.

About

Platforms
Cloud-hosted (SaaS); MCP server; Browser plugins (Chrome); IDE integrations (Claude Code, Cursor, VS Code)
API Available
Yes
Self-Hosted
Partial (MCP server and plugins only; full self-hosting requires an enterprise agreement)
Last Updated
2026-06-09T06:08:35.038Z

Best For

Who it's for

  • Development teams building production AI agents
  • Organizations needing low-latency retrieval with personalization
  • Teams integrating multiple data sources (Notion, Drive, GitHub)
  • Applications requiring user profile continuity across conversations

What it does well

  • AI assistants with persistent user context and preference memory
  • Personalized chatbots and customer support agents that remember interaction history
  • Knowledge base RAG with user-specific context injection
  • Coding assistants with memory of developer patterns and codebase context
  • Multi-session AI agents that learn and adapt over time

Integrations

Slack, Notion, Google Drive, Gmail, GitHub, S3, custom sources; Claude Code, Cursor, OpenCode, Hermes, Codex; Vercel AI SDK, LangChain, Mastra

Frequently Asked Questions

Is Supermemory free?
Supermemory is freemium. A free tier at $0/mo includes $5/mo of usage, and paid plans run up to $399+/mo.
Is Supermemory open source?
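Partly. The MCP server and agent plugins are open source under the MIT license; the core memory engine and managed API are proprietary.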
Does Supermemory have an API?
Yes. Supermemory exposes a developer API. See the official documentation at https://supermemory.ai for details.
When was Supermemory released?
Supermemory was first released in 2024.
What platforms does Supermemory support?
Supermemory is available on: Cloud-hosted (SaaS); MCP server; Browser plugins (Chrome); IDE integrations (Claude Code, Cursor, VS Code).

Supermemory

Most RAG systems store chunks and return chunks. Each session begins cold, with no knowledge of who the user is, what they corrected last time, or which preferences have shifted. Supermemory replaces that append-only pattern with a knowledge graph that merges, contradicts, and forgets facts across sessions — the vendor describes this as ‘continual learning’ rather than retrieval. The core workflow is a single API that handles ingestion, retrieval, and memory updates together, with TypeScript and Python SDKs and an npx setup command the docs surface as the entry point.
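
The difference between an append-only store and one that merges contradictions can be shown with a toy example. The memory engine is proprietary, so this is not how Supermemory works internally; it is a minimal sketch of the general idea, and every name in it (Fact, MemoryStore, observe, recall) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Fact:
    subject: str    # e.g. a user id
    attribute: str  # e.g. "preferred_language"
    value: str
    session: int    # session in which the fact was observed


@dataclass
class MemoryStore:
    # Latest fact per (subject, attribute); superseded facts go to history.
    current: dict = field(default_factory=dict)
    history: list = field(default_factory=list)

    def observe(self, fact: Fact) -> None:
        key = (fact.subject, fact.attribute)
        previous = self.current.get(key)
        if previous is not None:
            if fact.session < previous.session:
                # Stale observation: keep it for audit, do not overwrite.
                self.history.append(fact)
                return
            if previous.value != fact.value:
                # Contradiction: the newer observation supersedes the older.
                self.history.append(previous)
        self.current[key] = fact

    def recall(self, subject: str, attribute: str) -> Optional[str]:
        fact = self.current.get((subject, attribute))
        return fact.value if fact else None


store = MemoryStore()
store.observe(Fact("user-1", "preferred_language", "Python", session=1))
store.observe(Fact("user-1", "preferred_language", "TypeScript", session=3))
print(store.recall("user-1", "preferred_language"))  # TypeScript
```

An append-only store would return both values and leave the agent to guess which one still holds. Here the correction from session 3 replaces the session 1 value, while the older fact stays in history for inspection.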

The differentiating feature is the Profiles primitive: preference, behavior, and identity context that persists across sessions so the agent knows the person, not just the current conversation thread. Paired with the Filesystem primitive — a POSIX-mountable memory layer where grep becomes semantic search and any file is indexed automatically — agents built on Supermemory can access context the way a developer accesses a filesystem rather than constructing retrieval queries by hand.

Supermemory fits teams building production agents that serve returning users: personalized support bots, coding assistants that learn a developer’s patterns, or RAG applications that need user-specific context injected at retrieval time. It does not fit organizations that cannot send user memory data to a managed third-party service. The core memory engine is proprietary and cloud-hosted; the open-source components are the MCP server and plugins, released under MIT. Self-hosting the full engine requires an enterprise agreement — there is no self-hosted community path. Teams that hit that wall typically shift to a combination of a self-hosted vector database and a custom memory management layer, which means rebuilding what Supermemory abstracts.

The Connectors layer syncs Slack, Notion, Drive, Gmail, GitHub, and S3 in real time — the vendor states changes in connected tools are reflected in agent memory without manual imports. The Extractors layer handles PDFs, web pages, images, and audio, with chunking described as meaning-preserving across document boundaries rather than character-count splits. The vendor reports SOC 2 Type II certification and claims throughput exceeding 100 billion tokens per month across the platform.
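
To illustrate why character-count splits are a weak baseline, the sketch below compares a fixed-width splitter with one that packs whole sentences into each chunk. Sentence packing is a simple stand-in chosen for illustration; the vendor does not document its chunking method here, and the function names and sample text are assumptions.

```python
import re


def split_by_chars(text: str, size: int) -> list[str]:
    """Fixed character-count split; may cut a sentence in half."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def split_by_sentences(text: str, max_chars: int) -> list[str]:
    """Pack whole sentences into chunks of at most max_chars where possible."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow, so close the chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


text = "Refunds take five days. Contact support for exceptions. Plans renew monthly."
print(split_by_chars(text, 30))
print(split_by_sentences(text, 60))
```

The fixed-width splitter cuts the text mid-sentence, so a retrieved chunk can contain half of a refund policy. The sentence-aware version returns two chunks that each end on a sentence boundary. A single sentence longer than max_chars is kept whole rather than split.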