Every AI coding agent you use — Claude Code, Codex, Copilot, Gemini — starts each session with amnesia. Yesterday’s debugging breakthrough, that architecture decision from last week, the coding standard your team agreed on three sprints ago — all gone. You either re-explain everything or hope the agent infers it from whatever files happen to be open.

This is the single biggest friction point in agentic workflows today. Not model quality. Not tool integration. Memory.

BHGBrain is our answer: an open-source MCP server that gives your AI agents a persistent, searchable, shared second brain.

The Problem: Every Agent Is a Goldfish

If you’ve used AI coding agents for any real project, you know the pattern:

  1. Morning session: You explain your architecture, constraints, and conventions. The agent does great work.
  2. Afternoon session: New context window. The agent suggests the exact pattern you told it to avoid four hours ago.
  3. Next day: A different agent (maybe Codex for a background task) has zero awareness of what Claude Code learned yesterday.

The workarounds are fragile:

  • MEMORY.md files work until they hit 500 lines and start consuming 15% of your context window on every message.
  • System prompts are static and can’t capture evolving knowledge.
  • Session transcripts are agent-specific and unsearchable.
  • Manually re-explaining is what we were trying to avoid by using agents in the first place.

The fundamental issue: each agent has its own ephemeral context, and none of them talk to each other.

What BHGBrain Does

BHGBrain is an MCP server — meaning any MCP-compatible AI client can connect to it as a tool. It exposes a simple set of operations:

  • remember: Store a memory with automatic type classification, deduplication, and tagging
  • recall: Semantic search — find relevant memories by meaning, not keywords
  • search: Hybrid search combining vector similarity and fulltext matching
  • forget: Delete a memory (with audit trail)
  • tag: Add or remove tags from memories
  • category: Manage persistent policy categories (architecture, coding standards, etc.)
  • collections: Organize memories into named collections
  • backup: Create and restore full backups
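Because BHGBrain speaks MCP, each of these tools is invoked as a standard JSON-RPC 2.0 `tools/call` request. A sketch of what a `remember` call might look like on the wire — the argument names (`content`, `namespace`, `tags`) are illustrative assumptions, not BHGBrain's exact schema:

```typescript
// Sketch of an MCP tools/call request for the `remember` tool.
// MCP uses JSON-RPC 2.0; the argument shape below is an assumption,
// not BHGBrain's documented schema.
const rememberRequest = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "remember",
    arguments: {
      content:
        "All API routes use kebab-case and return RFC 7807 problem details.",
      namespace: "project-alpha",          // assumed parameter
      tags: ["api-design", "conventions"], // assumed parameter
    },
  },
};

console.log(JSON.stringify(rememberRequest, null, 2));
```

Any MCP client library builds this envelope for you; the point is that there is no proprietary protocol to integrate against.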

Under the hood, every memory gets:

  • Vector embeddings stored in Qdrant for semantic search
  • Metadata and fulltext index in SQLite for fast filtering and keyword search
  • Automatic deduplication — if you tell it the same thing twice, it merges instead of duplicating
  • Type classification — memories are categorized as episodic (events), semantic (facts), or procedural (workflows)

The Write Pipeline

When an agent calls remember, BHGBrain doesn’t just dump text into a database. It runs a multi-phase pipeline:

  1. Extraction — An LLM breaks raw input into atomic memory candidates with inferred types, tags, and importance scores.
  2. Decision — Each candidate is compared against existing memories in the same namespace. The system decides: ADD (new knowledge), UPDATE (refines existing), DELETE (invalidates old info), or NOOP (already known).
  3. Storage — Accepted memories get embedded, indexed, and persisted.

If the extraction model is unavailable, BHGBrain falls back to deterministic deduplication using content hashes and cosine similarity thresholds. It never silently drops a memory.
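The deterministic fallback can be sketched roughly like this — a minimal sketch, assuming a SHA-256 content hash for exact repeats and a 0.9 cosine threshold for near-duplicates (both the threshold and the function names are my assumptions, not BHGBrain's actual source):

```typescript
// Sketch of a deterministic dedup fallback: content hashes catch verbatim
// repeats, a cosine-similarity threshold catches near-duplicates.
// The 0.9 threshold and all names here are illustrative assumptions.
import { createHash } from "node:crypto";

type Decision = "ADD" | "UPDATE" | "NOOP";

interface StoredMemory {
  contentHash: string;
  embedding: number[];
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function decide(
  content: string,
  embedding: number[],
  existing: StoredMemory[],
): Decision {
  const hash = createHash("sha256").update(content).digest("hex");
  if (existing.some((m) => m.contentHash === hash)) return "NOOP"; // exact repeat
  if (existing.some((m) => cosine(embedding, m.embedding) > 0.9)) {
    return "UPDATE"; // near-duplicate: merge into the existing memory
  }
  return "ADD"; // genuinely new knowledge
}
```

The key property is that every path terminates in an explicit decision — nothing is dropped on the floor when the LLM is unreachable.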

The Read Path

When an agent needs context, it calls recall or search:

  • Semantic search finds memories by meaning (“how does authentication work in this project?”)
  • Fulltext search finds exact terms (“OIDC federated credentials”)
  • Hybrid search combines both using Reciprocal Rank Fusion (70% semantic, 30% fulltext by default)
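Reciprocal Rank Fusion scores each memory by its rank in both result lists: score = Σ weight / (k + rank). A minimal sketch of the fusion step, using the stated 70/30 default split and the conventional k = 60 constant (BHGBrain's actual constant and internals may differ):

```typescript
// Weighted Reciprocal Rank Fusion sketch: merge two ranked ID lists.
// k = 60 is the conventional RRF constant from the original paper;
// the 0.7/0.3 split mirrors the stated default. Internals are assumed.
function rrfFuse(semantic: string[], fulltext: string[], k = 60): string[] {
  const scores = new Map<string, number>();
  const accumulate = (ids: string[], weight: number) => {
    ids.forEach((id, rank) => {
      // rank is 0-based, so the top hit contributes weight / (k + 1)
      scores.set(id, (scores.get(id) ?? 0) + weight / (k + rank + 1));
    });
  };
  accumulate(semantic, 0.7);
  accumulate(fulltext, 0.3);
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```

A memory that appears in both lists — even at modest ranks — outscores one that tops only a single list, which is exactly the behavior you want from hybrid retrieval.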

There’s also memory://inject — a special MCP resource that delivers a budgeted context block at session start, so agents begin with relevant knowledge without manual prompting.

Why a Shared Brain Changes Everything

The real power isn’t that one agent can remember things. It’s that all your agents share the same memory.

Scenario: Multi-Agent Development

You’re running three agents on a project:

  • Claude Code for interactive development in your IDE
  • Codex for background tasks (test generation, refactoring)
  • OpenClaw for operations and deployment automation

Without shared memory, each agent operates in isolation. Claude Code learns your naming conventions; Codex generates tests that violate them. OpenClaw deploys a config that contradicts an architecture decision Claude Code captured yesterday.

With BHGBrain, when Claude Code stores “All API routes use kebab-case and return RFC 7807 problem details”, Codex picks it up in its next recall and generates compliant tests. OpenClaw’s deployment scripts align with the same conventions. One memory, three agents, zero drift.

Scenario: Onboarding and Knowledge Transfer

New team member starts. Instead of reading 40 pages of wiki docs (half outdated), they connect their AI agent to the team’s BHGBrain instance. The agent immediately has access to:

  • Architecture decisions and their rationale
  • Coding standards with concrete examples
  • Known pitfalls and workarounds
  • Project-specific terminology

The bootstrap prompt included with BHGBrain walks through a structured 10-section interview covering identity, responsibilities, goals, tools, entity maps, and operating rules — building a comprehensive work profile in about 30 minutes.

Scenario: Cross-Repository Continuity

Working across multiple repos (common in microservices, monorepo splits, or multi-org consulting)? BHGBrain’s namespace and collection system keeps knowledge organized:

  • Namespace project-alpha holds memories specific to that project
  • Namespace global holds cross-cutting standards
  • Collections within namespaces group related memories (e.g., api-design, infrastructure, security)

Agents query the right scope automatically. No cross-contamination, no lost context.
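In practice, scoping is just an argument on the query. A sketch of two `recall` calls — one project-scoped, one against the global namespace — where the argument names are my assumptions about the tool schema, not its documented shape:

```typescript
// Illustrative recall invocations scoped by namespace/collection.
// Argument names are assumptions, not BHGBrain's documented schema.
const projectQuery = {
  name: "recall",
  arguments: {
    query: "retry policy for payment webhooks",
    namespace: "project-alpha", // project-specific scope
    collection: "api-design",   // optional narrower grouping
  },
};

const globalQuery = {
  name: "recall",
  arguments: {
    query: "logging standards",
    namespace: "global",        // cross-cutting team standards
  },
};

console.log(projectQuery.arguments.namespace, globalQuery.arguments.namespace);
```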

Architecture: Simple and Self-Hosted

BHGBrain runs on your machine or your infrastructure. There’s no cloud dependency beyond the embedding API (and even that is optional with local models).

MCP Clients (Claude / Codex / OpenClaw / etc.)
  → MCP transport (HTTP or stdio)
  → BHGBrain server
      → Write pipeline (extraction + dedup + decision)
      → Qdrant (vector search)
      → SQLite (metadata, fulltext, categories, audit log)

Requirements:

  • Node.js 20+
  • Qdrant (Docker one-liner: docker run -d --name qdrant -p 6333:6333 qdrant/qdrant)
  • OpenAI API key (for embeddings) — or use Ollama with nomic-embed-text for fully local operation

Install and run:

git clone https://github.com/Big-Hat-Group-Inc/BHGBrain.git
cd BHGBrain
npm install && npm run build
export OPENAI_API_KEY=sk-...
export BHGBRAIN_TOKEN=$(node -e "console.log(require('crypto').randomBytes(32).toString('hex'))")
node dist/index.js

That’s it. Your agents can now connect via stdio (local) or HTTP (networked).

Enterprise-Ready by Default

BHGBrain isn’t a toy. It’s built with production concerns from day one:

  • Authentication: Bearer token required for non-loopback HTTP connections. Fail-closed — if the token env var isn’t set and you bind to a non-loopback address, the server refuses to start.
  • Rate limiting: 100 requests/minute/client by default.
  • Audit logging: Every write and delete is logged with timestamp, namespace, client ID, and operation type.
  • Secret scanning: Memories are checked for credential patterns before storage. Likely secrets are rejected.
  • Backup and restore: Full SQLite + Qdrant snapshots with integrity verification.
  • Graceful degradation: If Qdrant goes down, reads fall back to SQLite fulltext. If the embedding API is unavailable, the server enters degraded mode instead of crashing.
  • Structured logging: JSON logs with automatic token and content redaction.
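The fail-closed authentication rule above can be sketched as a startup check — a minimal sketch with illustrative names, not BHGBrain's actual source:

```typescript
// Sketch of a fail-closed bind check: binding to a non-loopback address
// without an auth token refuses to start. Names are illustrative.
function isLoopback(host: string): boolean {
  return host === "127.0.0.1" || host === "::1" || host === "localhost";
}

function checkAuthConfig(bindHost: string, token: string | undefined): void {
  if (!isLoopback(bindHost) && !token) {
    // Fail closed: network-exposed bind + no token = do not start.
    throw new Error(
      `Refusing to bind to ${bindHost} without BHGBRAIN_TOKEN set`,
    );
  }
}

checkAuthConfig("127.0.0.1", undefined); // fine: loopback needs no token
checkAuthConfig("0.0.0.0", "s3cret");    // fine: exposed but authenticated
```

Failing closed inverts the usual default: misconfiguration produces a loud startup error rather than a silently open server.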

Getting Started in 5 Minutes

1. Start Qdrant

docker run -d --name qdrant -p 6333:6333 qdrant/qdrant

2. Install BHGBrain

git clone https://github.com/Big-Hat-Group-Inc/BHGBrain.git
cd BHGBrain && npm install && npm run build

3. Configure Your Agent

For Claude Desktop, add to claude_desktop_config.json:

{
  "mcpServers": {
    "bhgbrain": {
      "command": "node",
      "args": ["/path/to/BHGBrain/dist/index.js"],
      "env": { "OPENAI_API_KEY": "sk-..." }
    }
  }
}

For HTTP clients (OpenClaw, mcporter, remote agents):

{
  "mcpServers": {
    "bhgbrain": {
      "transport": "http",
      "url": "http://127.0.0.1:3721",
      "headers": { "Authorization": "Bearer YOUR_TOKEN" }
    }
  }
}

4. Start Remembering

Ask your agent to remember something:

“Remember: our API uses kebab-case routes, returns RFC 7807 problem details, and all endpoints require Bearer token authentication.”

BHGBrain stores it, classifies it as semantic, tags it appropriately, and makes it available to every connected agent via recall.

5. Bootstrap Your Brain

Run the included bootstrap interview to build a comprehensive work profile:

# Paste the contents of BootstrapPrompt.txt into a fresh AI conversation
# The agent will interview you across 10 sections and produce a structured profile

CLI for Power Users

BHGBrain includes a full CLI for direct management:

bhgbrain search "authentication patterns" --mode hybrid
bhgbrain list --limit 20
bhgbrain category set "Coding Standards" --file ./standards.md
bhgbrain stats
bhgbrain gc --consolidate    # Merge similar memories, flag stale ones
bhgbrain backup create

When to Use BHGBrain vs. MEMORY.md

  • Scale: MEMORY.md hits context bloat around ~100 memories; BHGBrain handles 500,000+
  • Search: MEMORY.md loads the full file every session; BHGBrain retrieves a relevant subset semantically
  • Multi-agent: MEMORY.md is per-agent, per-workspace; BHGBrain is shared across all MCP clients
  • Deduplication: manual vs. automatic (hash + cosine similarity)
  • Types: flat text vs. episodic, semantic, and procedural classification
  • Audit: none vs. a full audit log
  • Backup: Git vs. dedicated backup/restore with integrity checks

MEMORY.md is fine for a single agent on a single project with a handful of notes. BHGBrain is for teams, multi-agent workflows, and anyone whose AI memory needs have outgrown a text file.

What’s Next

BHGBrain v1 focuses on getting the core right: reliable storage, smart deduplication, hybrid search, and multi-client access. The roadmap includes:

  • Multi-user RBAC — team-level access control
  • Encryption at rest — for regulated environments
  • Cloud sync — optional synchronization across machines
  • Working memory TTL — short-lived scratch memories that auto-expire

Try It

BHGBrain is open source under the MIT license.

GitHub: github.com/Big-Hat-Group-Inc/BHGBrain

If your AI agents keep forgetting what you told them yesterday, give them a brain that lasts.


Kevin Kaminski is the founder of Big Hat Group, where we build tools and consulting practices around enterprise AI, Windows 365, and cloud infrastructure. BHGBrain grew out of our own frustration with agent amnesia across the multi-agent workflows we run daily.