BHGBrain: Persistent Memory for AI Agents

Every AI agent starts each session with amnesia. Yesterday’s architecture decision, last week’s debugging breakthrough, the naming convention your team agreed on three sprints ago — gone. You re-explain it, or the agent gets it wrong.

BHGBrain fixes this. It’s an open-source MCP server that gives your AI agents a persistent, searchable, shared memory that survives across sessions, tools, and teams.

How It Works

BHGBrain sits between your AI agents and a durable storage layer. Any MCP-compatible client — Claude Code, Codex, OpenClaw, Gemini — connects to BHGBrain and gains access to a shared knowledge base.

AI Agents (Claude / Codex / OpenClaw / Gemini)
  → MCP transport (stdio or HTTP)
    → BHGBrain server
      → Qdrant (semantic vector search)
      → SQLite (metadata, fulltext index, audit log, archive)

When an agent stores a memory, BHGBrain runs it through an intelligent pipeline:

  1. Normalization — Input is cleaned and standardized before hashing, improving deduplication accuracy across paraphrased or reformatted content
  2. Deduplication — Compares against existing memories using SHA-256 content hashing and cosine similarity (threshold: 0.92), with tier-adjusted thresholds for precision control
  3. Decision — Determines whether to add new knowledge, update existing entries, or discard duplicates
  4. Retention assignment — Assigns the memory to the appropriate retention tier (T0–T3) based on type, importance, and caller-specified preference
  5. Storage — Embeds, indexes, and persists accepted memories with importance scores that influence future search ranking
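The dedup-and-decide core of the steps above can be sketched as follows. This is a minimal illustration of the described behavior, not BHGBrain's actual internals; the helper names and the single-candidate comparison are assumptions.

```typescript
// Sketch of the store pipeline: normalize, hash, compare, decide.
// Hypothetical helper names — BHGBrain's real implementation may differ.
import { createHash } from "node:crypto";

type Decision = "add" | "update" | "discard";

// Step 1: normalization — lowercase and collapse whitespace so
// reformatted copies of the same content hash identically.
function normalize(text: string): string {
  return text.toLowerCase().replace(/\s+/g, " ").trim();
}

// Step 2a: exact dedup via SHA-256 over the normalized content.
function contentHash(text: string): string {
  return createHash("sha256").update(normalize(text)).digest("hex");
}

// Step 2b: near-dup check via cosine similarity of embedding vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Step 3: decision against one existing candidate memory.
function decide(
  newHash: string,
  newVec: number[],
  existing: { hash: string; vec: number[] },
  threshold = 0.92, // the default similarity threshold described above
): Decision {
  if (newHash === existing.hash) return "discard";                // exact duplicate
  if (cosine(newVec, existing.vec) >= threshold) return "update"; // near-duplicate
  return "add";                                                   // genuinely new
}
```

In practice the comparison runs against the nearest neighbors returned by the vector index rather than a single candidate, but the decision logic is the same shape.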

When an agent needs context, BHGBrain delivers it through hybrid RRF search — combining semantic similarity (70%) and fulltext matching (30%) via Reciprocal Rank Fusion, with configurable weights. Agents get the relevant memories, not the entire knowledge base.
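Reciprocal Rank Fusion itself is simple: each result list contributes `weight / (k + rank)` per document, and the fused list is sorted by the summed scores. A minimal sketch, assuming the conventional constant k = 60 (BHGBrain's actual k value is not stated here):

```typescript
// Minimal RRF sketch: fuse two ranked lists of document IDs.
// k = 60 is the common default from the RRF literature — an assumption here.
function rrfFuse(
  semantic: string[],   // IDs ranked by vector similarity
  fulltext: string[],   // IDs ranked by fulltext match
  wSemantic = 0.7,
  wFulltext = 0.3,
  k = 60,
): string[] {
  const scores = new Map<string, number>();
  const accumulate = (ranked: string[], weight: number) => {
    ranked.forEach((id, i) => {
      // rank is 1-based, so position i contributes weight / (k + i + 1)
      scores.set(id, (scores.get(id) ?? 0) + weight / (k + i + 1));
    });
  };
  accumulate(semantic, wSemantic);
  accumulate(fulltext, wFulltext);
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```

A document that appears in both lists accumulates score from each, which is why results matching both semantically and lexically tend to rank first.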

Key Capabilities

  • Tiered retention (T0–T3) — Memories live as long as they matter. T0 (foundation) never expires. T1 (institutional) lasts one year. T2 (operational) lasts 90 days. T3 (ephemeral) lasts 30 days. Each tier has configurable capacity budgets (T1: 100K, T2: 200K, T3: 200K entries).
  • Sliding window TTL — Every access resets the expiry clock. A memory that keeps getting used stays alive automatically.
  • Auto-promotion — Memories accessed 5+ times automatically promote to the next higher retention tier. High-value knowledge self-selects for permanence.
  • Pre-expiry warnings — Memories are flagged 7 days before expiration, giving agents and operators time to act before knowledge is lost.
  • Archive-before-delete — Expired memories are written to an archive table before removal. Nothing is permanently purged without a recoverable record.
  • Hybrid RRF search — Reciprocal Rank Fusion combines semantic (70%) and fulltext (30%) results into a single ranked list. Weights are configurable per query.
  • Semantic deduplication — Cosine similarity at 0.92 threshold catches near-duplicates. SHA-256 checksums catch exact ones. Content normalization runs before hashing so paraphrased inputs deduplicate correctly.
  • Importance scoring — Each memory carries a 0–1 importance score that directly influences search result ranking. High-importance memories surface first.
  • Categories / persistent policy slots — Named policy categories (e.g., architecture-decisions, coding-standards, security-policies) provide persistent slots for institutional knowledge that should always be available, independent of TTL.
  • Shared memory across agents — Claude Code learns your API conventions; Codex picks them up automatically in the next session. One memory, every agent, zero drift.
  • Memory classification — Memories are automatically typed as episodic (events), semantic (facts), or procedural (workflows).
  • Namespace isolation — Separate projects, teams, or clients without cross-contamination. Global namespaces for cross-cutting standards.
  • Collections — Group related memories within namespaces (e.g., api-design, infrastructure, security).
  • Context injection — A special MCP resource delivers a budgeted context block at session start, so agents begin with relevant knowledge without manual prompting.
  • Full CLI — List, search, manage categories, run garbage collection, create backups — all from the command line.
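The interaction between sliding-window TTL and auto-promotion is worth making concrete. A sketch under stated assumptions: tier TTLs and the 5-access rule come from the list above, while the field names, the promote-on-exactly-the-fifth-access trigger, and promotion from T1 into permanent T0 are illustrative guesses.

```typescript
// Sketch of sliding-window TTL plus auto-promotion on access.
// Field names and the exact promotion trigger are assumptions.
type Tier = "T0" | "T1" | "T2" | "T3";

const TTL_DAYS: Record<Tier, number | null> = {
  T0: null, // foundation: never expires
  T1: 365,  // institutional: one year
  T2: 90,   // operational: 90 days
  T3: 30,   // ephemeral: 30 days
};

const NEXT_TIER: Record<Tier, Tier> = { T3: "T2", T2: "T1", T1: "T0", T0: "T0" };
const DAY_MS = 24 * 60 * 60 * 1000;

interface Memory {
  tier: Tier;
  accessCount: number;
  expiresAt: number | null; // epoch ms, or null for permanent
}

// Called on every read of a memory.
function touch(m: Memory, now: number): Memory {
  const accessCount = m.accessCount + 1;
  // Auto-promotion: the 5th access moves the memory one tier up.
  const tier = accessCount === 5 ? NEXT_TIER[m.tier] : m.tier;
  const ttl = TTL_DAYS[tier];
  // Sliding window: every access resets the expiry clock.
  return { tier, accessCount, expiresAt: ttl === null ? null : now + ttl * DAY_MS };
}
```

The combined effect is the self-selection the list describes: a memory that keeps getting read never expires in practice, and once it crosses the access threshold it also survives longer between reads.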

Enterprise-Ready by Default

BHGBrain isn’t a prototype. It’s built for production use from day one.

  • Authentication — Bearer token required for non-loopback HTTP. Fail-closed: the server refuses to start without credentials on external bindings.
  • Audit logging — Every write and delete is logged with timestamp, namespace, client ID, and operation type.
  • Secret scanning — Memories are checked for credential patterns before storage; likely secrets are rejected.
  • Rate limiting — 100 requests/minute/client by default.
  • Graceful degradation — If Qdrant goes down, reads fall back to SQLite fulltext. If embeddings are unavailable, the server enters degraded mode instead of crashing.
  • Backup and restore — Full SQLite + Qdrant snapshots with integrity verification.
  • Capacity budgets — Per-tier entry limits (T1: 100K, T2: 200K, T3: 200K) prevent unbounded growth and keep storage predictable.
  • Pre-expiry warnings — Memories are flagged 7 days before TTL expiration for review or re-promotion.
  • Archive-before-delete — Expired entries are written to an archive table before removal. No silent data loss.
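The graceful-degradation row can be sketched as a simple read-path fallback. This is an illustration of the described behavior only; the function names are assumptions, and the real calls would be async against Qdrant and SQLite.

```typescript
// Sketch of the read-path fallback: try vector search, fall back to
// fulltext when the vector store is unreachable. Names are hypothetical.
interface SearchResult { id: string; text: string; }

function searchWithFallback(
  query: string,
  vectorSearch: (q: string) => SearchResult[],   // Qdrant-backed
  fulltextSearch: (q: string) => SearchResult[], // SQLite FTS-backed
): { results: SearchResult[]; degraded: boolean } {
  try {
    return { results: vectorSearch(query), degraded: false };
  } catch {
    // Qdrant unreachable: serve fulltext-only results instead of failing.
    return { results: fulltextSearch(query), degraded: true };
  }
}
```

Flagging the response as degraded matters as much as the fallback itself: callers can surface reduced recall to the operator rather than silently returning weaker results.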

Who It’s For

  • Teams running multi-agent workflows — When Claude Code, Codex, and OpenClaw all need to share the same project knowledge without drift.
  • Enterprise IT departments — Organizations that need audit trails, authentication, and self-hosted infrastructure for AI memory.
  • Consultants and agencies — Namespace isolation keeps client knowledge separate while sharing internal standards across engagements.
  • Solo developers — Anyone whose AI memory needs have outgrown a MEMORY.md file.

Get Started in 5 Minutes

1. Start Qdrant

docker run -d --name qdrant -p 6333:6333 qdrant/qdrant

2. Install BHGBrain

git clone https://github.com/Big-Hat-Group-Inc/BHGBrain.git
cd BHGBrain && npm install && npm run build

3. Set your API key and run

export OPENAI_API_KEY=sk-...
node dist/index.js

4. Connect your agent

Add BHGBrain to your MCP client config — Claude Desktop, OpenClaw, or any MCP-compatible tool. Your agents can now remember and recall across sessions.
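For Claude Desktop, a stdio entry in claude_desktop_config.json looks roughly like this; the server name, path, and env keys are examples — check the BHGBrain documentation for the exact configuration it expects:

```json
{
  "mcpServers": {
    "bhgbrain": {
      "command": "node",
      "args": ["/path/to/BHGBrain/dist/index.js"],
      "env": { "OPENAI_API_KEY": "sk-..." }
    }
  }
}
```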

For detailed setup, configuration options, and the bootstrap interview prompt, see the full documentation on GitHub.

Multilingual Documentation

BHGBrain ships with full documentation in five languages:

  • English — github.com/Big-Hat-Group-Inc/BHGBrain
  • 中文 (Mandarin) — README.zh-CN.md
  • Deutsch (German) — README.de.md
  • Français (French) — README.fr.md
  • Español (Spanish) — README.es.md

📦 GitHub: github.com/Big-Hat-Group-Inc/BHGBrain

📖 Deep dive: BHGBrain: Give Your AI Agents a Shared, Persistent Memory

Kevin Kaminski is a 17x Microsoft MVP with 25 years of enterprise IT experience specializing in Windows 365, Intune, Azure infrastructure, and AI agent deployment. He leads Big Hat Group, delivering consulting, training, and managed services for organizations modernizing their endpoint and cloud operations.

Learn More About Big Hat Group →

Ready to Get Started?

Book a discovery call to discuss your AI agent infrastructure needs.

Book a Discovery Call