Claude Integration

Memory System

How Cognova extracts, stores, scores, and injects memories so Claude retains context across sessions.

Why Memory Matters

Claude is stateless. Every session starts with a blank context window. Without memory, Claude would re-discover the same facts, re-ask the same questions, and forget every decision you have made together.

Cognova solves this with a persistent memory layer that automatically captures important information from conversations and injects it back at the start of each session. The result is an agent that genuinely knows you over time.

Memory Types

Each memory is categorized by type, which determines how it is displayed and prioritized.

| Type | Icon | Use Case | Example |
| --- | --- | --- | --- |
| decision | [D] | Choices about architecture, tools, or approach | "Using Caddy instead of Nginx for reverse proxy" |
| fact | [F] | Important information about the environment | "Deploy script requires sudo on Linux" |
| solution | [S] | How a problem was solved | "Fixed CORS by adding origin to Nitro route rules" |
| pattern | [P] | Recurring conventions | "Server imports use ~~/server/ alias, not relative paths" |
| preference | [*] | User preferences for style, tools, workflows | "User prefers pnpm over npm or yarn" |
| summary | [~] | Session summaries and high-level context | "Completed vault sync feature with file watcher" |
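These categories can be modeled as a discriminated union. A minimal sketch (the field and constant names here are illustrative, not Cognova's actual schema):

```typescript
// Hypothetical shape of a memory record; names are illustrative,
// not Cognova's actual schema.
type MemoryType =
  | "decision"
  | "fact"
  | "solution"
  | "pattern"
  | "preference"
  | "summary";

interface Memory {
  type: MemoryType;
  content: string;   // kept concise (max ~100 characters per the extraction rules)
  relevance: number; // 0.0 - 1.0
}

// Display icon per type, matching the table above.
const ICONS: Record<MemoryType, string> = {
  decision: "[D]",
  fact: "[F]",
  solution: "[S]",
  pattern: "[P]",
  preference: "[*]",
  summary: "[~]",
};

function label(m: Memory): string {
  return `${ICONS[m.type]} ${m.content}`;
}
```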

How Memories Are Created

Memories enter the system through three paths.

1. Automatic Extraction (Hooks)

The stop-extract and pre-compact hooks send conversation transcripts to the memory extraction API. The server runs a separate Claude instance with a specialized prompt that identifies memories worth preserving.

The extraction prompt instructs Claude to:

  • Extract only genuinely useful information
  • Skip routine acknowledgments, greetings, and debugging steps
  • Keep content concise (max 100 characters)
  • Assign a relevance score from 0.0 to 1.0
  • Return an empty array if nothing is worth extracting

The extractor processes the last 20 messages of the transcript (up to 8,000 characters) and returns structured JSON:

```json
[
  { "type": "decision", "content": "Using Drizzle ORM instead of Prisma", "relevance": 0.9 },
  { "type": "preference", "content": "No semicolons in TypeScript", "relevance": 0.8 }
]
```
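Because this JSON comes back from a model, a consumer would typically validate it before persisting anything. A sketch under the assumption that the extractor returns the shape shown above (the validation rules mirror the extraction prompt's constraints):

```typescript
interface ExtractedMemory {
  type: string;
  content: string;
  relevance: number;
}

const VALID_TYPES = new Set([
  "decision", "fact", "solution", "pattern", "preference", "summary",
]);

// Keep only well-formed entries: known type, non-empty content
// within the 100-character limit, relevance in [0, 1].
function validateExtraction(raw: string): ExtractedMemory[] {
  const parsed = JSON.parse(raw);
  if (!Array.isArray(parsed)) return [];
  return parsed.filter(
    (m): m is ExtractedMemory =>
      typeof m === "object" && m !== null &&
      VALID_TYPES.has(m.type) &&
      typeof m.content === "string" &&
      m.content.length > 0 && m.content.length <= 100 &&
      typeof m.relevance === "number" &&
      m.relevance >= 0 && m.relevance <= 1
  );
}
```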

2. Manual Storage (Skill)

Claude stores memories explicitly using the /memory store command when the user shares something important:

```
/memory store --type preference "User is a fullstack developer focused on TypeScript"
/memory store --type decision "All database migrations use Drizzle Kit"
/memory store --type fact "Production server is Ubuntu 24.04 on Hetzner"
```

Claude is instructed to store memories immediately when the user reveals preferences, makes decisions, solves problems, or shares facts about their environment.

3. Onboarding

On the first session (when no memories exist), Claude runs an onboarding flow -- asking about the user's role, tech stack, goals, and preferences. Each response is stored as a separate memory and a ## User Profile section is appended to CLAUDE.md as a fallback.

How Memories Are Injected

At session start, the session-start hook calls GET /api/memory/context, which:

  1. Queries the memory_chunks table ordered by relevance score (descending), then by creation date (descending)
  2. Returns the top N memories (default: 5, max: 20)
  3. Increments the access_count and updates last_accessed_at for each returned memory
  4. Groups memories by type and formats them as markdown
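The four steps above can be sketched in miniature. Field names, section headings, and the in-memory sort stand in for the real database query; the grouping and markdown formatting mirror the example output that follows:

```typescript
interface MemoryRow {
  type: string;
  content: string;
  relevanceScore: number;
  createdAt: Date;
}

// Section headings per type; assumed to match the injected-context format.
const HEADINGS: Record<string, string> = {
  preference: "Preferences",
  decision: "Decisions",
  fact: "Key Facts",
  solution: "Solutions",
  pattern: "Patterns",
  summary: "Summaries",
};

// Sort by relevance (desc), then creation date (desc), take the
// top N (capped at 20), then group by type and format as markdown.
function formatContext(rows: MemoryRow[], limit = 5): string {
  const top = [...rows]
    .sort(
      (a, b) =>
        b.relevanceScore - a.relevanceScore ||
        b.createdAt.getTime() - a.createdAt.getTime()
    )
    .slice(0, Math.min(limit, 20));

  const groups = new Map<string, string[]>();
  for (const m of top) {
    const list = groups.get(m.type) ?? [];
    list.push(m.content);
    groups.set(m.type, list);
  }

  const sections = [...groups.entries()].map(
    ([type, items]) =>
      `### ${HEADINGS[type] ?? type}\n` + items.map((c) => `- ${c}`).join("\n")
  );
  return `${top.length} memories loaded:\n\n` + sections.join("\n\n");
}
```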

The formatted output is printed to stdout, which Claude receives as injected context:

```markdown
5 memories loaded:

### Preferences
- User prefers Tailwind CSS with the default color palette
- No semicolons in TypeScript

### Decisions
- Using PostgreSQL for all persistent data
- Caddy for reverse proxy with automatic HTTPS

### Key Facts
- Production runs on Ubuntu 24.04 at Hetzner
```

Relevance and Scoring

Every memory has a relevanceScore between 0.0 and 1.0:

| Score | Meaning |
| --- | --- |
| 0.9 - 1.0 | Critical -- core user preferences, major architectural decisions |
| 0.7 - 0.8 | Important -- solutions to significant problems, recurring patterns |
| 0.5 - 0.6 | Moderate -- useful context, environment facts |
| 0.1 - 0.4 | Minor -- implementation details, one-off notes |

Manually stored memories default to 0.9 relevance. Automatically extracted memories have their relevance set by the extraction model based on significance.

The context injection endpoint sorts by relevance first, then recency, so a highly relevant memory from weeks ago takes priority over a low-relevance memory from yesterday.
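That two-key ordering can be expressed as a single comparator. A sketch (field names assumed, matching the scoring table above):

```typescript
interface Scored {
  relevanceScore: number;
  createdAt: Date;
}

// Relevance first (descending), recency as the tiebreaker.
function injectionOrder<T extends Scored>(memories: T[]): T[] {
  return [...memories].sort(
    (a, b) =>
      b.relevanceScore - a.relevanceScore ||
      b.createdAt.getTime() - a.createdAt.getTime()
  );
}

const oldButCritical = { relevanceScore: 0.9, createdAt: new Date("2025-01-01") };
const freshButMinor = { relevanceScore: 0.3, createdAt: new Date("2025-03-01") };
// oldButCritical sorts ahead of freshButMinor despite being weeks older.
```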

Access Tracking

Every time a memory is retrieved (through search or context injection), the system updates:

  • access_count -- incremented by 1
  • last_accessed_at -- set to the current timestamp

This data supports future decay and expiration logic. Frequently accessed memories are demonstrably useful; memories that are never accessed may be candidates for eventual cleanup.

The access count updates happen within the same database query as retrieval and are wrapped in a try/catch so a failure to update counts never blocks the memory response.
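The best-effort pattern looks roughly like this. The `db` interface, SQL, and column names are a sketch, not Cognova's actual data layer:

```typescript
// Sketch: bump access stats without letting a failure block the
// memory response. The db interface and SQL here are illustrative.
async function touchMemories(
  db: { execute(sql: string, params: unknown[]): Promise<void> },
  ids: number[]
): Promise<void> {
  if (ids.length === 0) return;
  try {
    await db.execute(
      `UPDATE memory_chunks
         SET access_count = access_count + 1,
             last_accessed_at = NOW()
       WHERE id = ANY($1)`,
      [ids]
    );
  } catch {
    // Tracking is best-effort: a failed update never blocks retrieval.
  }
}
```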

Memory Lifecycle

```
Creation                Access                 Aging
   |                      |                      |
   v                      v                      v
Extracted from      Retrieved by search    access_count tracks
conversation or     or context injection   usage frequency;
stored manually     -> count incremented   last_accessed_at
                    -> timestamp updated   enables future decay
```

Memories are currently permanent once created. The access tracking infrastructure is in place to support future features like:

  • Decay -- gradually reducing relevance scores for memories that are never accessed
  • Expiration -- archiving or removing stale memories after a configurable period
  • Consolidation -- merging overlapping memories into summaries
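None of these features exist yet, but the tracked fields are enough to support them. As one hypothetical shape for decay (the half-life constant and the access-count adjustment are invented for illustration):

```typescript
// Hypothetical decay: not implemented in Cognova today, but one
// possible use of the tracked fields.
interface Tracked {
  relevanceScore: number;
  lastAccessedAt: Date;
  accessCount: number;
}

// Halve the score for every 90 days since last access, but let
// frequently accessed memories decay more slowly.
function decayedScore(m: Tracked, now: Date): number {
  const days =
    (now.getTime() - m.lastAccessedAt.getTime()) / (1000 * 60 * 60 * 24);
  const halfLife = 90 * (1 + Math.log1p(m.accessCount));
  return m.relevanceScore * Math.pow(0.5, days / halfLife);
}
```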

Searching Memories

The /memory skill provides several search modes:

| Command | What It Finds |
| --- | --- |
| `search <query>` | Full-text search across content and source excerpts |
| `about <topic>` | Everything known about a specific topic |
| `decisions` | All decision-type memories |
| `recent [N]` | The N most recent memories |
| `context` | Preview of what would be injected at session start |

All search commands support filtering by --type and --project flags.

Use /memory context to see exactly what Claude will receive at the start of its next session. This is useful for verifying that important context is being captured and prioritized correctly.

API Endpoints

The memory system is backed by four API endpoints. For request/response schemas and authentication details, see the API reference.

| Method | Endpoint | Purpose |
| --- | --- | --- |
| GET | /api/memory/search | Query memories with filters |
| GET | /api/memory/context | Get formatted context for session injection |
| POST | /api/memory/store | Store a single memory chunk |
| POST | /api/memory/extract | Extract memories from a transcript |
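As a minimal client sketch for the context endpoint: the `limit` query parameter, bearer-token header, and plain-text response are assumptions here; consult the API reference for the real schema.

```typescript
// URL construction is separated out so it can be checked without a server.
// The `limit` parameter name is an assumption, capped at the documented max of 20.
function contextUrl(baseUrl: string, limit = 5): string {
  const capped = Math.min(Math.max(limit, 1), 20);
  return `${baseUrl}/api/memory/context?limit=${capped}`;
}

// Fetch the formatted context block; auth scheme is an assumption.
async function fetchMemoryContext(
  baseUrl: string,
  token: string,
  limit = 5
): Promise<string> {
  const res = await fetch(contextUrl(baseUrl, limit), {
    headers: { Authorization: `Bearer ${token}` },
  });
  if (!res.ok) throw new Error(`memory context request failed: ${res.status}`);
  return res.text();
}
```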