BHGBrain: Persistent Memory for AI Agents

Every AI agent starts each session with amnesia. Yesterday’s architecture decision, last week’s debugging breakthrough, the naming convention your team agreed on three sprints ago — gone. You re-explain it, or the agent gets it wrong.

BHGBrain fixes this. It’s an open-source MCP server that gives your AI agents a persistent, searchable, shared memory that survives across sessions, tools, and teams.

How It Works

BHGBrain sits between your AI agents and a durable storage layer. Any MCP-compatible client — Claude Code, Codex, OpenClaw, Gemini — connects to BHGBrain and gains access to a shared knowledge base.

AI Agents (Claude / Codex / OpenClaw / Gemini)
  → MCP transport (stdio or HTTP)
    → BHGBrain server
      → Qdrant (semantic vector search)
      → SQLite (metadata, fulltext index, audit log)

When an agent stores a memory, BHGBrain runs it through an intelligent pipeline:

  1. Extraction — Breaks input into atomic memory candidates with inferred types, tags, and importance
  2. Deduplication — Compares against existing memories using content hashing and cosine similarity
  3. Decision — Determines whether to add new knowledge, update existing entries, or discard duplicates
  4. Storage — Embeds, indexes, and persists accepted memories
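The deduplication and decision steps above can be sketched in a few lines. This is an illustrative model only, assuming exact duplicates are caught by a content hash and near-duplicates by cosine similarity between embeddings; the function names and the 0.9 threshold are assumptions, not BHGBrain's actual API.

```typescript
// Illustrative sketch of steps 2-3 (dedup + decision).
// Names and threshold are hypothetical, not BHGBrain's real schema.
import { createHash } from "node:crypto";

const SIMILARITY_THRESHOLD = 0.9; // assumed cutoff for "near-duplicate"

interface Memory {
  text: string;
  embedding: number[];
}

// Normalize, then hash: catches exact (and trivially re-worded) repeats.
function contentHash(text: string): string {
  return createHash("sha256").update(text.trim().toLowerCase()).digest("hex");
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

type Decision = "discard" | "update" | "add";

function decide(candidate: Memory, existing: Memory[]): Decision {
  const hash = contentHash(candidate.text);
  for (const memory of existing) {
    if (contentHash(memory.text) === hash) return "discard"; // exact duplicate
    if (cosineSimilarity(candidate.embedding, memory.embedding) >= SIMILARITY_THRESHOLD) {
      return "update"; // near-duplicate: refresh the existing entry instead
    }
  }
  return "add"; // genuinely new knowledge
}
```

The key design point is the two-tier check: hashing is cheap and catches verbatim repeats before any vector math runs, so the similarity comparison only pays its cost on genuinely new text.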

When an agent needs context, BHGBrain delivers it through hybrid search — combining semantic similarity (70%) and fulltext matching (30%) via Reciprocal Rank Fusion. Agents get only the relevant memories, not the entire knowledge base.
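Weighted Reciprocal Rank Fusion is simple to sketch: each result list contributes a score of weight / (k + rank) per item, and items appearing in both lists accumulate both contributions. The snippet below is an illustration of the general technique with the 70/30 weighting described above, not BHGBrain's implementation; k = 60 is the common RRF default.

```typescript
// Illustrative weighted RRF merging a semantic and a fulltext ranking.
const RRF_K = 60; // standard RRF smoothing constant

function fuse(
  semantic: string[], // memory IDs, best match first
  fulltext: string[],
  weights = { semantic: 0.7, fulltext: 0.3 },
): string[] {
  const scores = new Map<string, number>();
  const accumulate = (ids: string[], weight: number) => {
    ids.forEach((id, rank) => {
      // rank + 1 makes the top result rank 1, per the usual formulation
      scores.set(id, (scores.get(id) ?? 0) + weight / (RRF_K + rank + 1));
    });
  };
  accumulate(semantic, weights.semantic);
  accumulate(fulltext, weights.fulltext);
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```

Because scores accumulate across lists, a memory that matches both by meaning and by exact term rises above one that matches only a single signal — which is the point of hybrid search.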

Key Capabilities

  • Shared memory across agents — Claude Code learns your API conventions; Codex picks them up automatically in the next session. One memory, every agent, zero drift.
  • Hybrid search — Semantic search finds memories by meaning. Fulltext search finds exact terms. Hybrid combines both for the best of each.
  • Smart deduplication — Tell it the same thing twice and it merges instead of duplicating. Update a fact and the old version is replaced.
  • Memory classification — Memories are automatically typed as episodic (events), semantic (facts), or procedural (workflows).
  • Namespace isolation — Separate projects, teams, or clients without cross-contamination. Global namespaces for cross-cutting standards.
  • Collections — Group related memories within namespaces (e.g., api-design, infrastructure, security).
  • Context injection — A special MCP resource delivers a budgeted context block at session start, so agents begin with relevant knowledge without manual prompting.
  • Full CLI — List, search, manage categories, run garbage collection, create backups — all from the command line.
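To make the capabilities above concrete, here is one hypothetical shape a stored memory could take, pulling together classification, namespaces, collections, tags, and importance. Every field name here is illustrative — this is not BHGBrain's actual schema.

```typescript
// Hypothetical memory record combining the concepts from the list above.
// Field names are illustrative, not BHGBrain's real data model.
type MemoryType = "episodic" | "semantic" | "procedural";

interface MemoryRecord {
  id: string;
  type: MemoryType;   // events, facts, or workflows
  namespace: string;  // e.g. a per-client namespace, or "global"
  collection?: string; // e.g. "api-design", "infrastructure", "security"
  tags: string[];
  importance: number; // 0..1, usable when budgeting injected context
  content: string;
  createdAt: string;  // ISO 8601 timestamp
}

const example: MemoryRecord = {
  id: "mem_001",
  type: "semantic",
  namespace: "global",
  collection: "api-design",
  tags: ["naming", "rest"],
  importance: 0.8,
  content: "REST endpoints use kebab-case plural nouns.",
  createdAt: new Date().toISOString(),
};
```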

Enterprise-Ready by Default

BHGBrain isn’t a prototype. It’s built for production use from day one.

  • Authentication — Bearer token required for non-loopback HTTP. Fail-closed: the server refuses to start without credentials on external bindings.
  • Audit logging — Every write and delete logged with timestamp, namespace, client ID, and operation type.
  • Secret scanning — Memories checked for credential patterns before storage. Likely secrets are rejected.
  • Rate limiting — 100 requests/minute/client by default.
  • Graceful degradation — If Qdrant goes down, reads fall back to SQLite fulltext. If embeddings are unavailable, the server enters degraded mode instead of crashing.
  • Backup and restore — Full SQLite + Qdrant snapshots with integrity verification.
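Per-client rate limiting of the kind described above is often done with a sliding-window counter. The sketch below shows that general technique for a 100 requests/minute budget; it is a minimal illustration, and BHGBrain's actual implementation may use a different mechanism.

```typescript
// Minimal sliding-window rate limiter: 100 requests/minute per client.
// Illustrative only — not BHGBrain's implementation.
const LIMIT = 100;
const WINDOW_MS = 60_000;

const requestLog = new Map<string, number[]>(); // clientId -> timestamps

function allowRequest(clientId: string, now = Date.now()): boolean {
  // Keep only timestamps inside the trailing one-minute window.
  const recent = (requestLog.get(clientId) ?? []).filter(
    (t) => now - t < WINDOW_MS,
  );
  if (recent.length >= LIMIT) {
    requestLog.set(clientId, recent);
    return false; // over the per-minute budget
  }
  recent.push(now);
  requestLog.set(clientId, recent);
  return true;
}
```

The sliding window avoids the burst-at-boundary problem of fixed windows: a client cannot fire 100 requests at 0:59 and 100 more at 1:01, because the trailing minute always counts both bursts.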

Who It’s For

  • Teams running multi-agent workflows — When Claude Code, Codex, and OpenClaw all need to share the same project knowledge without drift.
  • Enterprise IT departments — Organizations that need audit trails, authentication, and self-hosted infrastructure for AI memory.
  • Consultants and agencies — Namespace isolation keeps client knowledge separate while sharing internal standards across engagements.
  • Solo developers — Anyone whose AI memory needs have outgrown a MEMORY.md file.

Get Started in 5 Minutes

1. Start Qdrant

docker run -d --name qdrant -p 6333:6333 qdrant/qdrant

2. Install BHGBrain

git clone https://github.com/Big-Hat-Group-Inc/BHGBrain.git
cd BHGBrain && npm install && npm run build

3. Set your API key and run

export OPENAI_API_KEY=sk-...
node dist/index.js

4. Connect your agent

Add BHGBrain to your MCP client config — Claude Desktop, OpenClaw, or any MCP-compatible tool. Your agents can now remember and recall across sessions.
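For Claude Desktop, an MCP server entry in `claude_desktop_config.json` generally looks like the fragment below. The install path, server name, and environment variables here are placeholders — check the repo documentation for BHGBrain's actual configuration options.

```json
{
  "mcpServers": {
    "bhgbrain": {
      "command": "node",
      "args": ["/path/to/BHGBrain/dist/index.js"],
      "env": {
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}
```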

For detailed setup, configuration options, and the bootstrap interview prompt, see the full documentation on GitHub.

📦 GitHub: github.com/Big-Hat-Group-Inc/BHGBrain

📖 Deep dive: BHGBrain: Give Your AI Agents a Shared, Persistent Memory

Kevin Kaminski is a 17x Microsoft MVP with 25 years of enterprise IT experience specializing in Windows 365, Intune, Azure infrastructure, and AI agent deployment. He leads Big Hat Group, delivering consulting, training, and managed services for organizations modernizing their endpoint and cloud operations.

Learn More About Big Hat Group →

Ready to Get Started?

Book a discovery call to discuss your AI agent infrastructure needs.

Book a Discovery Call