Memory System
Why Memory Matters
Claude is stateless. Every session starts with a blank context window. Without memory, Claude would re-discover the same facts, re-ask the same questions, and forget every decision you have made together.
Cognova solves this with a persistent memory layer that automatically captures important information from conversations and injects it back at the start of each session. The result is an agent that retains what it has learned about you and your projects from one session to the next.
Memory Types
Each memory is categorized by type, which determines how it is displayed and prioritized.
| Type | Icon | Use Case | Example |
|---|---|---|---|
| decision | [D] | Choices about architecture, tools, or approach | "Using Caddy instead of Nginx for reverse proxy" |
| fact | [F] | Important information about the environment | "Deploy script requires sudo on Linux" |
| solution | [S] | How a problem was solved | "Fixed CORS by adding origin to Nitro route rules" |
| pattern | [P] | Recurring conventions | "Server imports use ~~/server/ alias, not relative paths" |
| preference | [*] | User preferences for style, tools, workflows | "User prefers pnpm over npm or yarn" |
| summary | [~] | Session summaries and high-level context | "Completed vault sync feature with file watcher" |
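As a rough illustration, the table above can be mirrored in a small TypeScript union and lookup map. The type names and icons come from the table; the helper names `isMemoryType` and `labelMemory` are hypothetical and are not part of Cognova's documented API.

```typescript
// The six memory types listed in the table above.
type MemoryType = "decision" | "fact" | "solution" | "pattern" | "preference" | "summary";

// Display icon for each type, copied from the table.
const MEMORY_ICONS: Record<MemoryType, string> = {
  decision: "[D]",
  fact: "[F]",
  solution: "[S]",
  pattern: "[P]",
  preference: "[*]",
  summary: "[~]",
};

// Hypothetical helper: narrow an untrusted string to a known memory type.
function isMemoryType(value: string): value is MemoryType {
  return Object.prototype.hasOwnProperty.call(MEMORY_ICONS, value);
}

// Hypothetical helper: prefix a memory's content with its icon.
function labelMemory(type: MemoryType, content: string): string {
  return `${MEMORY_ICONS[type]} ${content}`;
}
```

For example, `labelMemory("decision", "Using Caddy instead of Nginx for reverse proxy")` yields a line that starts with `[D]`.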
How Memories Are Created
Memories enter the system through three paths.
1. Automatic Extraction (Hooks)
The stop-extract and pre-compact hooks send conversation transcripts to the memory extraction API. The server runs a separate Claude instance with a specialized prompt that identifies memories worth preserving.
The extraction prompt instructs Claude to:
- Extract only genuinely useful information
- Skip routine acknowledgments, greetings, and debugging steps
- Keep content concise (max 100 characters)
- Assign a relevance score from 0.0 to 1.0
- Return an empty array if nothing is worth extracting
The extractor processes the last 20 messages of the transcript (up to 8,000 characters) and returns structured JSON:
[
{ "type": "decision", "content": "Using Drizzle ORM instead of Prisma", "relevance": 0.9 },
{ "type": "preference", "content": "No semicolons in TypeScript", "relevance": 0.8 }
]
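The limits described above can be sketched as two small functions: one that trims the transcript before it is sent, and one that validates the JSON the extractor returns. This is a minimal sketch and not Cognova's implementation. It assumes messages are plain strings joined with newlines, and that the most recent text is kept when the 8,000-character cap is exceeded. The names `buildExtractionInput` and `parseExtractedMemories` are hypothetical.

```typescript
type MemoryType = "decision" | "fact" | "solution" | "pattern" | "preference" | "summary";

interface ExtractedMemory {
  type: MemoryType;
  content: string;
  relevance: number;
}

const MEMORY_TYPES: string[] = ["decision", "fact", "solution", "pattern", "preference", "summary"];
const MAX_MESSAGES = 20; // "last 20 messages"
const MAX_CHARS = 8000; // "up to 8,000 characters"
const MAX_CONTENT_LENGTH = 100; // "max 100 characters"

// Assumption: keep the tail of the transcript when it exceeds the character cap.
function buildExtractionInput(messages: string[]): string {
  const joined = messages.slice(-MAX_MESSAGES).join("\n");
  return joined.length > MAX_CHARS ? joined.slice(-MAX_CHARS) : joined;
}

// Parse the extractor's reply and drop entries that break the stated constraints.
// Malformed JSON or a non-array reply is treated as "nothing extracted".
function parseExtractedMemories(raw: string): ExtractedMemory[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return [];
  }
  if (!Array.isArray(parsed)) return [];

  const result: ExtractedMemory[] = [];
  for (const item of parsed) {
    if (typeof item !== "object" || item === null) continue;
    const { type, content, relevance } = item as Record<string, unknown>;
    if (typeof type !== "string" || MEMORY_TYPES.indexOf(type) === -1) continue;
    if (typeof content !== "string" || content.length === 0) continue;
    if (content.length > MAX_CONTENT_LENGTH) continue;
    if (typeof relevance !== "number" || !(relevance >= 0 && relevance <= 1)) continue;
    result.push({ type: type as MemoryType, content, relevance });
  }
  return result;
}
```

Dropping invalid entries instead of rejecting the whole reply is a design choice made for this sketch; a stricter server could equally refuse the entire response.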
2. Manual Storage (Skill)
Claude stores memories explicitly using the /memory store command when the user shares something important:
/memory store --type preference "User is a fullstack developer focused on TypeScript"
/memory store --type decision "All database migrations use Drizzle Kit"
/memory store --type fact "Production server is Ubuntu 24.04 on Hetzner"
Claude is instructed to store memories immediately when the user reveals preferences, makes decisions, solves problems, or shares facts about their environment.
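To make the command shape concrete, here is a hedged sketch of how a `/memory store` invocation could map to a structured record. The parser, its regular expression, and the `StoreRequest` field names are assumptions for illustration; the skill's real argument handling is not described here. The 0.9 default mirrors the documented default relevance for manually stored memories.

```typescript
type MemoryType = "decision" | "fact" | "solution" | "pattern" | "preference" | "summary";

// Hypothetical record shape; the real request schema lives in the API reference.
interface StoreRequest {
  type: MemoryType;
  content: string;
  relevance: number;
}

const MEMORY_TYPES: string[] = ["decision", "fact", "solution", "pattern", "preference", "summary"];
const MANUAL_DEFAULT_RELEVANCE = 0.9; // documented default for manual memories

// Hypothetical parser for: /memory store --type <type> "<content>"
// Returns null when the command does not match or the type is unknown.
function parseStoreCommand(command: string): StoreRequest | null {
  const match = /^\/memory\s+store\s+--type\s+(\w+)\s+"([^"]+)"$/.exec(command.trim());
  if (match === null) return null;
  const type = match[1];
  const content = match[2];
  if (type === undefined || content === undefined) return null;
  if (MEMORY_TYPES.indexOf(type) === -1) return null;
  return { type: type as MemoryType, content, relevance: MANUAL_DEFAULT_RELEVANCE };
}
```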
3. Onboarding
On the first session (when no memories exist), Claude runs an onboarding flow that asks about the user's role, tech stack, goals, and preferences. Each response is stored as a separate memory, and a "## User Profile" section is appended to CLAUDE.md as a fallback.
How Memories Are Injected
At session start, the session-start hook calls GET /api/memory/context, which:
- Queries the memory_chunks table ordered by relevance score (descending), then by creation date (descending)
- Returns the top N memories (default: 5, max: 20)
- Increments the access_count and updates last_accessed_at for each returned memory
- Groups memories by type and formats them as markdown
The formatted output is printed to stdout, which Claude receives as injected context:
5 memories loaded:
### Preferences
- User prefers Tailwind CSS with the default color palette
- No semicolons in TypeScript
### Decisions
- Using PostgreSQL for all persistent data
- Caddy for reverse proxy with automatic HTTPS
### Key Facts
- Production runs on Ubuntu 24.04 at Hetzner
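The grouping and formatting step that produces output like the sample above might look roughly like the following. The section titles "Preferences", "Decisions", and "Key Facts" and their relative order are taken from the sample; the titles and positions of the other three types are guesses, and `formatContext` is a hypothetical name.

```typescript
type MemoryType = "decision" | "fact" | "solution" | "pattern" | "preference" | "summary";

interface Memory {
  type: MemoryType;
  content: string;
}

const SECTION_TITLES: Record<MemoryType, string> = {
  preference: "Preferences", // seen in the sample output
  decision: "Decisions", // seen in the sample output
  fact: "Key Facts", // seen in the sample output
  solution: "Solutions", // assumption
  pattern: "Patterns", // assumption
  summary: "Summaries", // assumption
};

// Assumed section order; only the first three positions are observable in the sample.
const SECTION_ORDER: MemoryType[] = ["preference", "decision", "fact", "solution", "pattern", "summary"];

// Group memories by type and render them as markdown, skipping empty sections.
// Memories keep their incoming order within each section.
function formatContext(memories: Memory[]): string {
  const lines: string[] = [`${memories.length} memories loaded:`];
  for (const type of SECTION_ORDER) {
    const group = memories.filter((m) => m.type === type);
    if (group.length === 0) continue;
    lines.push(`### ${SECTION_TITLES[type]}`);
    for (const m of group) lines.push(`- ${m.content}`);
  }
  return lines.join("\n");
}
```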
Relevance and Scoring
Every memory has a relevanceScore between 0.0 and 1.0:
| Score | Meaning |
|---|---|
| 0.9 - 1.0 | Critical -- core user preferences, major architectural decisions |
| 0.7 - 0.8 | Important -- solutions to significant problems, recurring patterns |
| 0.5 - 0.6 | Moderate -- useful context, environment facts |
| 0.1 - 0.4 | Minor -- implementation details, one-off notes |
Manually stored memories default to 0.9 relevance. Automatically extracted memories have their relevance set by the extraction model based on significance.
The context injection endpoint sorts by relevance first, then recency. This means a highly relevant memory from weeks ago will be injected over a low-relevance memory from yesterday.
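The ordering rule can be illustrated with an in-memory comparator. The real endpoint performs this ordering in the database; the sketch below only demonstrates the rule and the documented limits (default 5, max 20). The `StoredMemory` shape, the function names, and the fallback to the default for missing or invalid limits are assumptions.

```typescript
// Hypothetical in-memory shape; relevanceScore is the documented field name.
interface StoredMemory {
  content: string;
  relevanceScore: number;
  createdAt: Date;
}

const DEFAULT_LIMIT = 5;
const MAX_LIMIT = 20;

// Assumption: a missing or invalid limit falls back to the default.
function clampLimit(requested?: number): number {
  if (requested === undefined || !(requested >= 1)) return DEFAULT_LIMIT;
  return Math.min(Math.floor(requested), MAX_LIMIT);
}

// Sort by relevance (descending), break ties by creation date (newest first),
// then keep the top N. The input array is not mutated.
function selectForContext(memories: StoredMemory[], requested?: number): StoredMemory[] {
  return memories
    .slice()
    .sort(
      (a, b) =>
        b.relevanceScore - a.relevanceScore ||
        b.createdAt.getTime() - a.createdAt.getTime()
    )
    .slice(0, clampLimit(requested));
}
```

With this rule, a memory scored 0.9 from several weeks ago sorts ahead of a memory scored 0.3 from yesterday, and recency only matters between memories with equal scores.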
Access Tracking
Every time a memory is retrieved (through search or context injection), the system updates:
- access_count -- incremented by 1
- last_accessed_at -- set to the current timestamp
This data supports future decay and expiration logic. Frequently accessed memories are demonstrably useful; memories that are never accessed may be candidates for eventual cleanup.
The access count updates happen within the same database query as retrieval and are wrapped in a try/catch so a failure to update counts never blocks the memory response.
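The "never block the response" behavior can be sketched as a best-effort update around the retrieval call. This is a simplified, synchronous illustration: the `MemoryStore` interface, its method names, and the `onError` callback are hypothetical, and it treats the update as a separate step from the fetch, which may differ from how the server actually issues its queries.

```typescript
interface TrackedMemory {
  id: number;
  content: string;
  accessCount: number;
  lastAccessedAt: Date | null;
}

// Hypothetical storage interface standing in for the database layer.
interface MemoryStore {
  fetchTop(limit: number): TrackedMemory[];
  recordAccess(ids: number[], at: Date): void;
}

// Fetch memories, then try to record the access. A tracking failure is
// reported through onError but never prevents the memories from being returned.
function retrieveWithTracking(
  store: MemoryStore,
  limit: number,
  now: Date,
  onError: (error: unknown) => void = () => {}
): TrackedMemory[] {
  const memories = store.fetchTop(limit);
  try {
    store.recordAccess(memories.map((m) => m.id), now);
  } catch (error) {
    onError(error);
  }
  return memories;
}
```

The trade-off is that access counts can undercount when updates fail, which is acceptable if the counts only feed approximate signals such as future decay.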
Memory Lifecycle
| Stage | What happens |
|---|---|
| Creation | Extracted from conversation or stored manually |
| Access | Retrieved by search or context injection; count incremented and timestamp updated |
| Aging | access_count tracks usage frequency; last_accessed_at enables future decay |
Memories are currently permanent once created. The access tracking infrastructure is in place to support future features like:
- Decay -- gradually reducing relevance scores for memories that are never accessed
- Expiration -- archiving or removing stale memories after a configurable period
- Consolidation -- merging overlapping memories into summaries
Searching Memories
The /memory skill provides several search modes:
| Command | What It Finds |
|---|---|
| search <query> | Full-text search across content and source excerpts |
| about <topic> | Everything known about a specific topic |
| decisions | All decision-type memories |
| recent [N] | The N most recent memories |
| context | Preview of what would be injected at session start |
All search commands support filtering by --type and --project flags.
Use /memory context to see exactly what Claude will receive at the start of its next session. This is useful for verifying that important context is being captured and prioritized correctly.
API Endpoints
The memory system is backed by four API endpoints. For request/response schemas and authentication details, see the API reference.
| Method | Endpoint | Purpose |
|---|---|---|
| GET | /api/memory/search | Query memories with filters |
| GET | /api/memory/context | Get formatted context for session injection |
| POST | /api/memory/store | Store a single memory chunk |
| POST | /api/memory/extract | Extract memories from a transcript |
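For orientation, the endpoint table can be captured as a typed constant, with a small helper that builds a search URL. The methods and paths come from the table. The query parameter names `q`, `type`, and `project` are assumptions modeled on the skill's --type and --project flags, and the base URL in the usage note is a placeholder; the API reference holds the authoritative schema.

```typescript
// Methods and paths copied from the endpoint table.
const MEMORY_ENDPOINTS = {
  search: { method: "GET", path: "/api/memory/search" },
  context: { method: "GET", path: "/api/memory/context" },
  store: { method: "POST", path: "/api/memory/store" },
  extract: { method: "POST", path: "/api/memory/extract" },
} as const;

// Hypothetical filter shape; parameter names are not confirmed by the source.
interface SearchFilters {
  query: string;
  type?: string;
  project?: string;
}

// Build a search URL with URL-encoded query parameters.
function buildSearchUrl(baseUrl: string, filters: SearchFilters): string {
  const params: string[] = [`q=${encodeURIComponent(filters.query)}`];
  if (filters.type !== undefined) params.push(`type=${encodeURIComponent(filters.type)}`);
  if (filters.project !== undefined) params.push(`project=${encodeURIComponent(filters.project)}`);
  return `${baseUrl}${MEMORY_ENDPOINTS.search.path}?${params.join("&")}`;
}
```

Calling `buildSearchUrl("http://localhost:3000", { query: "reverse proxy", type: "decision" })` produces a URL under /api/memory/search with both values encoded in the query string.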