14 catalogued.
Fastest setup + largest integration ecosystem; dual session/user scope.
Self-host: moderate · Freemium · Apache-2.0
Best for: Fastest drop-in memory with the biggest integration ecosystem · Token-cost-sensitive production agents (single-pass extraction, sub-7k tokens/call) · AWS Agent SDK users and SOC 2 / HIPAA workloads

Production polish: sub-300ms, SOC 2 / HIPAA, connectors, context fencing.
Managed only · Freemium · OSS*
Best for: Polished managed memory API with SOC 2 / HIPAA compliance · Coding-agent memory via MCP (Claude Code / OpenCode plugins) · One API over mixed data (files, email, PDFs, chat)

Hybrid BM25 + vector + rerank: recovers exact-token matches that semantic search misses.
Managed only · Paid · OSS*
Best for: Agents needing lossless recall — full chronology, no semantic-search step · Codes / IDs / error strings and preference-recall-heavy apps
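To illustrate why a hybrid pipeline can surface exact-token matches, here is a minimal sketch that merges a lexical ranking with a vector ranking. It uses reciprocal rank fusion, a common fusion technique swapped in purely for illustration; the source does not describe this product's actual fusion or rerank step. The function name `rrf_fuse`, the constant `k=60`, and the document IDs are all hypothetical.

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: each list adds 1 / (k + rank) to a document's score."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical results: the lexical (BM25-style) list finds an exact error code
# that the vector list never returns.
lexical = ["err-E4012", "doc-a"]
semantic = ["doc-a", "doc-c", "doc-b"]

fused = rrf_fuse([lexical, semantic])
print(fused)
```

In this toy case the exact-token hit `err-E4012` stays near the top of the fused list even though the vector ranking missed it entirely.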

Memory in plain SQL: no vector DB, fully inspectable, portable.
Self-host: trivial · Free + paid · Apache-2.0
Best for: Skip the vector DB — inspectable SQL
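As an illustration of the general idea only, memory kept in plain SQL can be written and recalled with ordinary queries. The sketch below uses Python's built-in `sqlite3`; the `memories` table, its columns, and the helper functions are hypothetical and are not this product's schema or API.

```python
import sqlite3

def open_store(path=":memory:"):
    """Create a hypothetical `memories` table (not the product's real schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS memories ("
        "id INTEGER PRIMARY KEY, user_id TEXT NOT NULL, "
        "content TEXT NOT NULL, created_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn

def remember(conn, user_id, content):
    """Store one memory row for a user."""
    conn.execute(
        "INSERT INTO memories (user_id, content) VALUES (?, ?)",
        (user_id, content),
    )
    conn.commit()

def recall(conn, user_id, keyword):
    """Keyword lookup with LIKE; no embeddings or vector index involved."""
    rows = conn.execute(
        "SELECT content FROM memories "
        "WHERE user_id = ? AND content LIKE ? ORDER BY id",
        (user_id, f"%{keyword}%"),
    )
    return [row[0] for row in rows]

conn = open_store()
remember(conn, "alice", "Prefers dark mode")
remember(conn, "alice", "Allergic to peanuts")
print(recall(conn, "alice", "dark"))
```

Because the data is a regular table, it can be inspected, exported, or migrated with any SQL client.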

A self-hostable 'memory operating system' that packages long-term memory into MemCube units and manages their lifecycle (store / retrieve / update / schedule) outside the model.
Self-host: moderate · Free / OSS · Apache-2.0
Best for: Teams wanting a self-hosted memory layer with hybrid retrieval and skill reuse

An OS-inspired memory layer for personalized AI agents that organizes user memory into short-, mid-, and long-term tiers and migrates entries between them. Published as an EMNLP 2025 Oral.
Self-host: moderate · Free / OSS · Apache-2.0
Best for: Personalized conversational agents needing tiered long-term user memory
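The tiering idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the class `TieredMemory`, the capacity-based eviction from the short-term tier, and the hit-count promotion rule are all hypothetical choices.

```python
from collections import deque

class TieredMemory:
    """Toy short/mid/long-term store with assumed migration rules."""

    def __init__(self, short_capacity=3, promote_after=2):
        self.short = deque()   # most recent entries
        self.mid = {}          # entry -> number of later accesses
        self.long = set()      # entries promoted after repeated use
        self.short_capacity = short_capacity
        self.promote_after = promote_after

    def add(self, text):
        """New entries land in short-term; overflow migrates to mid-term."""
        self.short.append(text)
        while len(self.short) > self.short_capacity:
            evicted = self.short.popleft()
            if evicted not in self.long:
                self.mid.setdefault(evicted, 0)

    def touch(self, text):
        """Record an access; frequently used mid-term entries move to long-term."""
        if text in self.mid:
            self.mid[text] += 1
            if self.mid[text] >= self.promote_after:
                del self.mid[text]
                self.long.add(text)

mem = TieredMemory(short_capacity=2, promote_after=2)
for note in ["likes tea", "works in Oslo", "asked about flights"]:
    mem.add(note)
mem.touch("likes tea")
mem.touch("likes tea")
print(list(mem.short), mem.mid, mem.long)
```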

A modular multi-agent memory system that augments any LLM. Specialized agents manage six memory types (Core, Episodic, Semantic, Procedural, Resource, Knowledge Vault) under a coordinator that orchestrates updates and retrieval. Ships a desktop app that builds a personal memory base from on-screen activity.
Self-host: moderate · Free / OSS · Apache-2.0
Best for: Personal assistants needing multimodal, screen-aware long-term memory

TencentDB Agent Memory
Tencent
Fully-local long-term memory for AI agents built on two pillars: layered long-term memory (a semantic pyramid L0 Conversation → L1 Atom → L2 Scenario → L3 Persona) and symbolic short-term memory that offloads verbose tool logs to files while keeping a compact Mermaid 'canvas' in context. Distributed as a TypeScript/npm package; integrates with OpenClaw and the Hermes gateway.
Self-host: moderate · Free / OSS · MIT
Best for: Long-horizon agent tasks needing token-efficient, fully-local memory with traceable layered recall

A lifelong memory stack for LLM agents built on 'semantically lossless compression': it stores dense, high-information memory so an agent recalls more while spending far fewer tokens. Ships as one `simplemem` Python package that auto-routes across three pillars: SimpleMem (text efficiency core), Omni-SimpleMem (multimodal: text/image/audio/video), and EvolveMem (self-evolving retrieval). Also offered as a cloud-hosted and self-hostable MCP server. Backed by arXiv papers (2601.02553, 2604.01007, 2605.13941).
Self-host: moderate · Free + paid · MIT
Best for: Token-budget-constrained agents needing dense lifelong memory with intent-aware retrieval, optionally across modalities

An agent memory management layer positioned as a high-performance drop-in replacement for Mem0 (`import telemem as mem0`), optimized for multi-turn dialogue, character modeling, long-term storage, and semantic retrieval. Pipeline: character-aware summarization → semantic-clustering deduplication → efficient storage → precise retrieval. Extends to multimodal video memory (frame extraction → captioning → vector DB) with ReAct-style multi-step video QA. Backed by a tech report (arXiv 2601.06037).
Self-host: moderate · Free / OSS · Apache-2.0
Best for: Teams wanting a local, Mem0-compatible memory layer with strong per-character isolation and optional video memory

A self-hosted MCP server (plus REST API) that adds persistent, personalized long-term memory to any MCP-compatible assistant (Claude Code, ChatGPT, Cursor, Open WebUI, and more). A single unified LLM call performs fact extraction, metadata classification, deduplication, and contradiction resolution at once. Two-tier design: fast searchable summaries in a vector store, plus a detailed artifact store retrieved on demand.
Self-host: moderate · Free / OSS · Apache-2.0
Best for: Self-hosters wanting a private, MCP-native memory server with automatic fact extraction, dedup, and contradiction handling

archon-memory-core
Divergence Router
An in-process, local-first Python memory library (`pip install archon-memory-core`) whose thesis is that memory should get better the longer it is used. Built on ChromaDB + Ollama, it pairs ranked top-1 retrieval with supersede-aware nightly consolidation, type-aware salience, an entity graph, active forgetting, and full replay/observability. Positions itself as a memory policy library (not an agent runtime), with LangChain and LlamaIndex adapters.
Self-host: moderate · Free / OSS · Apache-2.0
Best for: Agents that accumulate contradictory facts over long horizons and want a local, consolidating memory library with built-in forgetting
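A rough sketch of what supersede-aware consolidation could look like: when two facts describe the same entity and attribute, the newer one stays active and the older one is marked superseded rather than returned at retrieval time. This is an assumption-laden illustration, not archon-memory-core's API; the function `consolidate` and the fact fields `entity`, `attribute`, `value`, and `ts` are hypothetical.

```python
def consolidate(facts):
    """Keep the newest fact per (entity, attribute); return (active, superseded)."""
    latest = {}
    # Process oldest first so later timestamps overwrite earlier ones.
    for fact in sorted(facts, key=lambda f: f["ts"]):
        latest[(fact["entity"], fact["attribute"])] = fact
    active_ids = {id(f) for f in latest.values()}
    active = list(latest.values())
    superseded = [f for f in facts if id(f) not in active_ids]
    return active, superseded

facts = [
    {"entity": "alice", "attribute": "city", "value": "Berlin", "ts": 1},
    {"entity": "alice", "attribute": "editor", "value": "vim", "ts": 1},
    {"entity": "alice", "attribute": "city", "value": "Lisbon", "ts": 2},
]

active, superseded = consolidate(facts)
print(sorted(f["value"] for f in active))
print([f["value"] for f in superseded])
```

Keeping the superseded list, instead of deleting outright, leaves room for the replay and observability the card mentions.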

PowerMem
OceanBase / ob-labs
Persistent, self-evolving AI memory plugin for coding agents and applications. Combines LLM-driven memory extraction with a two-layer Experience + Skill distillation system: raw interactions are first compressed into Experience memories, then recurring patterns are further abstracted into reusable Skill entries. Ebbinghaus-style time-decay keeps memory collections pruned and relevant over time. Exposes a unified backend via Python SDK, HTTP REST server, MCP server, and CLI.
Self-host: moderate · Free / OSS · Apache-2.0
Best for: AI coding agents and multi-agent systems that need both factual recall and reusable procedural workflows distilled from past sessions · Teams wanting a production-ready memory backend that spans multiple agent clients (Claude Code, Codex, OpenCode, Cline) via a shared server
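Ebbinghaus-style decay is commonly modeled as exponential retention, R = exp(-t / S), where t is the time since last access and S is a memory-strength constant. The sketch below applies that textbook form to pruning; it is not PowerMem's code, and the function names, the record fields `last_access` and `strength`, and the 0.3 threshold are hypothetical.

```python
import math

def retention(age_days, strength_days):
    """Exponential forgetting curve: 1.0 when fresh, decaying toward 0."""
    return math.exp(-age_days / strength_days)

def prune(memories, now_days, threshold=0.3):
    """Keep only memories whose estimated retention is still above the threshold."""
    return [
        m for m in memories
        if retention(now_days - m["last_access"], m["strength"]) >= threshold
    ]

memories = [
    {"text": "project uses pytest", "last_access": 9, "strength": 5},
    {"text": "temporary build path", "last_access": 0, "strength": 5},
]

kept = prune(memories, now_days=10)
print([m["text"] for m in kept])
```

Raising `strength` on each successful recall would make frequently used memories decay more slowly, which is one plausible way to keep a collection pruned yet relevant.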

Self-organising long-term memory substrate for agentic LLM workflows, grounded in Event Segmentation Theory (EST) and Predictive Processing (PP). Ingests multi-turn conversations, segments them into topically coherent episodes via LLM-powered boundary detection, distils durable semantic knowledge from each episode, and exposes a unified search surface for downstream reasoning. Designed as a minimalist production-ready core: PostgreSQL for structured metadata, Qdrant for vector similarity.
Self-host: moderate · Free / OSS · MIT
Best for: Agentic LLM workflows needing structured long-term memory with semantically coherent episodes and a unified search surface across episodic and semantic stores