Claude Code Session Management: Complete Architecture & Implementation

Executive Summary

Claude Code stores sessions in ~/.claude/projects/ as conversation.jsonl files. Currently, no tool adequately tracks forks, visualizes lineage, filters by directory, or exports to Markdown. This document outlines the gap, current tool landscape (2025-2026), a 9-dimensional evaluation framework, and a 4-week implementation roadmap.

Outcome: Build claude-session-manager package enabling fork tracing, visualization, directory filtering, and Markdown export.

Part 1: What Claude Code Actually Stores

File Structure

``` ~/.claude/projects/ ├── -home-user-project-src/ │ ├── conversation.jsonl │ ├── metadata.json (if exists) │ └── tools-log.jsonl (execution history) ├── -home-user-ai-agents/ │ └── conversation.jsonl └── [etc...] ```

Directory Name Encoding

Format: Hyphenated path starting with -
Example: -home-user-project-src = /home/user/project/src
Parsing: Split on -, prepend /, rejoin with /
Problem: Encoding breaks if directory moves or deletes

What's in conversation.jsonl

Each line is a JSON object representing one turn: ```json { "role": "user|assistant", "content": "...", "thinking": "...", "tokens": { "input": 1234, "output": 567 }, "timestamp": "2026-01-10T21:00:00Z", "tools_used": ["file_write", "search_web"], "model": "claude-3.5-sonnet" } ```

Key metadata: Timestamps, token counts, tool invocations, role separation.

Part 2: The Extraction Problem

Current Gaps

No Fork Tracing: Sessions created by editing+resuming are not linked. You don't know which session spawned which.
No Directory Filtering: Can't quickly find "all sessions launched from /home/user/project"
No Markdown Export: Conversation data trapped in JSONL, not suitable for documentation or git
No Visualization: Session relationships exist only in your head
No Cross-Session Search: Full-text search doesn't span related sessions

Why This Matters

Without this capability, you lose:

Decision traceability: Which experimental branch led to this solution?
Context inheritance: When forking a session, what implicit knowledge is lost?
Blame/history: Where did this bug originate? Which session introduced it?
Reusability: Am I repeating the same exploration in a sibling session?
Documentation: Sessions as living narrative aren't shareable

Part 3: Current Tool Ecosystem (Jan 2026)

Tools That Exist

claude-sessions (GitHub: Cryjaxxx/claude-sessions)

What: Slash commands for manual session documentation
Gap: Manual, no automated extraction or fork detection

awesome-claude-code (GitHub: hesreallyhim/awesome-claude-code)

Tools listed:
- claude-code-tools: Full-text search (Rust TUI), tmux integration
- Safe hooks for dangerous commands
Gap: Search-focused, no fork tracing or visualization

ccswitch (Reddit, July 2025)

What: Go CLI for git worktree session switching
Gap: Tracks git, not Claude metadata

LangGraph / CrewAI / Multi-Agent Frameworks (AWS, n8n, etc.)

What: Agent orchestration with session replay, topology management
Gap: Designed for agent-to-agent, not for Claude Code data extraction
Relevance: Future integration target for MCP-based handoffs

Tools Explicitly NOT Suitable

LangSmith: Only exports LangChain traces, not Claude Code
Haystack: Generic LLM metadata extraction, not Claude-specific
GitHub Issue Agents (MAGIS): Multi-agent framework for GitHub issues, not session tracking

Conclusion

No existing tool fully addresses your needs. This is a build opportunity, not a "find and integrate" problem.

Part 4: 9-Dimensional Session Management Survey

Use this framework to evaluate any session system (existing or planned).

Dimension 1: Architecture & Coordination

Score 0-5

Does the system treat forks as first-class objects?
Is session state tracked as DAG (Directed Acyclic Graph)?
Can multi-agent systems reference shared sessions?
Storage: Human-readable (JSON) or opaque (binary)?
Offline extraction support?

Dimension 2: Memory & State Management

Score 0-5

Separation of tactical (current turn) vs. strategic (cross-session) memory?
Token counts persisted per message?
Context window exhaustion tracking?
CLAUDE.md file parsing and indexing?
Role separation (system/user/assistant) tracked?

Dimension 3: Session Lifecycle Operations

Score 0-5

Create, resume, fork, pause, merge operations supported?
Checkpoint/snapshot capability?
Atomic state transitions?
Rollback/undo support?

Dimension 4: Discovery & Retrieval

Score 0-5

List sessions with filtering (date, directory, status)?
Full-text search across conversation history?
Fork lineage tracing (who forked from whom)?
Fast search on 1000+ sessions?

Dimension 5: Export & Integration

Score 0-5

Markdown export with metadata?
JSON/JSONL export?
Reimport/resume from export?
Git integration (commit messages, branches)?
MCP (Model Context Protocol) compatibility?

Dimension 6: Performance & Scalability

Score 0-5

List 1000+ sessions in <1s?
Search 100K+ messages in <5s?
Database indexing strategy?
Memory usage for large datasets?

Dimension 7: Reliability & Durability

Score 0-5

ACID compliance (atomic writes)?
Crash recovery?
Backup/restore?
Data corruption detection?
Versioning/rollback?

Dimension 8: Security & Privacy

Score 0-5

File permission enforcement?
Encryption at rest?
Access control (who can read/modify)?
Audit trails?
Sensitive data redaction?

Dimension 9: Developer Experience

Score 0-5

CLI simplicity (single command vs. menus)?
IDE integration (VS Code, Cursor)?
Error messages clarity?
Documentation quality?
Learning curve?

Scoring Guidance

0-1: Not present or severely broken 1-2: Partially present, major gaps 2-3: Present but limited 3-4: Well-implemented, minor gaps 4-5: Excellent, production-ready

Viability Score: Average of all 9 dimensions.

0-2: Not viable, fundamental gaps
2-3: Viable with significant work
3-4: Viable with polish
4-5: Production-ready

Part 5: Fork Detection Algorithm

The Core Problem

Claude Code doesn't natively track forks. A fork occurs when:

You open a previous session in Claude Code
You edit an earlier message in the conversation
You continue from that edited point (creating a new branch)

The fork data exists in conversation.jsonl but isn't labeled as such.

Detection Strategy

Step 1: Parse each session's conversation.jsonl Step 2: Scan for resume markers in assistant messages:

"resuming from previous session"
"continuing conversation"
"resuming context from"
"picking up where we left off"
"continuing from earlier"

Step 3: When detected, extract referenced session ID Step 4: Construct parent-child relationships Step 5: Build DAG (Directed Acyclic Graph) of all forks

Detection Code Pseudocode

```python def detect_forks(session_dir): """Scan conversation for fork markers.""" forks = [] with open(f"{session_dir}/conversation.jsonl") as f: for i, line in enumerate(f): msg = json.loads(line) if msg['role'] == 'assistant': content = msg.get('content', '').lower() if any(marker in content for marker in RESUME_MARKERS): # Extract parent session reference parent = extract_session_id(content) forks.append({ 'line_number': i, 'parent_session': parent, 'timestamp': msg['timestamp'] }) return forks

RESUME_MARKERS = [ "resuming from", "continuing conversation", "picking up where", "resuming context", ] ```

Why Fork Detection Matters

Without it, you lose:

Decision trees (which approach was tested where?)
Redundancy avoidance (am I exploring the same path twice?)
Blame tracing (which session introduced this bug?)
Intelligent archival (keep winners, prune failures)

Part 6: Implementation Roadmap

Phase 1: Foundation (Week 1 - 6 hours)

Goal: Functional extraction pipeline

Deliverables:

SessionMetadataExtractor class
- Parse ~/.claude/projects/*/conversation.jsonl
- Decode directory names to absolute paths
- Extract timestamps, message counts, token usage, tools invoked
- Identify resume markers
SQLiteSessionIndex class
- Schema: sessions, messages, forks, tags
- Indexes: (session_id, created_at), (directory), (full_text)
- Enable fast filtering and search
SessionMarkdownExporter class
- Convert conversation.jsonl → human-readable .md
- Include metadata header (duration, tokens, forks, tools)
- Preserve code blocks and thinking tags

Checkpoint: Can extract any session, index it, export to Markdown.

Phase 2: Visualization & Search (Weeks 2-3 - 12 hours)

Goal: Interactive discovery tools

Deliverables:

SessionTreeVisualizer class
- Graphviz DOT output showing fork relationships
- Color-coded by status (active, completed, archived)
- Render to SVG/PNG
CLI Tool: claude-session command ``` claude-session list [--dir=/path] [--days=7] claude-session search "pattern" claude-session export <id> [--format=md|json] claude-session tree <id> # Show fork hierarchy claude-session reindex # Rebuild SQLite index ```
Web Dashboard (optional)
- Gantt chart (timeline view)
- Network graph (fork relationships)
- Search interface with filters

Checkpoint: Can discover, filter, and visualize all sessions.

Phase 3: Production Package (Weeks 3-4 - 24 hours)

Goal: Installable, tested tool

Deliverables:

Python Package: claude-session-manager
- Setup.py with entry points
- CLI command registration
- Library imports for integration
MCP Integration
- Expose sessions as MCP resources
- Allow Claude to query its own metadata
- Enable agent-to-agent handoff
Testing & Documentation
- Unit tests for parser, indexer, exporter
- Integration tests for CLI
- Tutorial: "Extract Your First Session"
- API documentation

Checkpoint: pip install claude-session-manager works. Sessions are queryable by agents.

Part 7: Architectural Decisions

Why SQLite?

System	Pros	Cons	Verdict
SQLite	Zero setup, file-based, fast full-text, ACID	Not distributed	✅ Use for local Claude Code
PostgreSQL	Powerful, scales well	Requires server	❌ Overkill
Vector DB	Semantic search	Expensive, needs embeddings	❌ Not needed yet
Git	Version control, diff, blame	Doesn't store metadata	⚠️ Complement, don't replace

Decision: SQLite with full-text search (FTS5 extension) for keyword matching.

Why Graphviz?

Text-based: Version-control friendly
Wide support: Mermaid, PlantUML, online viewers
Embeddable: Include in docs, export as SVG/PNG
Simple: No JavaScript framework overhead

Session State Machine

``` NEW ↓ ACTIVE → RESUMED/FORKED ↓ ↓ COMPLETED ↓ ↓ ↓ ARCHIVED ←─┘ ```

Each transition logged with timestamp. Enables:

"Show me all completed sessions"
"Which sessions are in progress?"
"Session lifecycle metrics"

Part 8: Known Limitations & Mitigations

Limitation	Impact	Mitigation
Directory encoding fragility	Sessions orphaned if dir moves	Store absolute path at creation, track renames
Fork detection heuristic	Resume markers not guaranteed	Combine markers + timestamp analysis + manual linking UI
Thinking token opacity	Valuable context lost	Require Claude Code to export thinking in metadata
Cross-session context loss	Sibling sessions invisible	Implement context inheritance: copy CLAUDE.md, list siblings
Token count reliability	Usage tracking inaccurate	Cross-validate with Claude API logs if available

Part 9: Open Questions for Your Architecture

Session mutability: Are sessions immutable (like git commits) or mutable (like drafts)?
Export ownership: Does user keep copy, or system keeps canonical version?
Tagging granularity: Per-session, per-message, or per-code-block?
Multi-agent handoff: Do agents need explicit session handoff protocol?
Session templates: Reusable starting contexts for common tasks?
Context inheritance: When forking, should child inherit parent's CLAUDE.md?
Merge strategy: Can two sibling sessions be merged back?
Retention policy: Archive old sessions? Delete on disk?
Collaboration: Multiple users sharing session metadata?

Next Steps

Week 1: Start with Phase 1. Get extraction working. Prove data is extractable.
Weeks 2-3: Add visualization. Show fork hierarchies to your team.
Weeks 3-4: Package and document. Make it installable and shareable.

Success metric: "I can trace any Claude Code session back to its origin, see which sessions forked from it, and export the full conversation with one command."