← Back to Articles & Artefacts
artefactssouth

Claude Code Session Management: Complete Architecture & Implementation

IAIP Research
jg-260110-osm-claude-code-a9af9be8-df86-4285-a367

Claude Code Session Management: Complete Architecture & Implementation

Executive Summary

Claude Code stores sessions in ~/.claude/projects/ as conversation.jsonl files. Currently, no tool adequately tracks forks, visualizes lineage, filters by directory, or exports to Markdown. This document outlines the gap, current tool landscape (2025-2026), a 9-dimensional evaluation framework, and a 4-week implementation roadmap.

Outcome: Build claude-session-manager package enabling fork tracing, visualization, directory filtering, and Markdown export.


Part 1: What Claude Code Actually Stores

File Structure

``` ~/.claude/projects/ ā”œā”€ā”€ -home-user-project-src/ │ ā”œā”€ā”€ conversation.jsonl │ ā”œā”€ā”€ metadata.json (if exists) │ └── tools-log.jsonl (execution history) ā”œā”€ā”€ -home-user-ai-agents/ │ └── conversation.jsonl └── [etc...] ```

Directory Name Encoding

  • Format: Hyphenated path starting with -
  • Example: -home-user-project-src = /home/user/project/src
  • Parsing: Split on -, prepend /, rejoin with /
  • Problem: Encoding breaks if directory moves or deletes

What's in conversation.jsonl

Each line is a JSON object representing one turn: ```json { "role": "user|assistant", "content": "...", "thinking": "...", "tokens": { "input": 1234, "output": 567 }, "timestamp": "2026-01-10T21:00:00Z", "tools_used": ["file_write", "search_web"], "model": "claude-3.5-sonnet" } ```

Key metadata: Timestamps, token counts, tool invocations, role separation.


Part 2: The Extraction Problem

Current Gaps

  1. No Fork Tracing: Sessions created by editing+resuming are not linked. You don't know which session spawned which.
  2. No Directory Filtering: Can't quickly find "all sessions launched from /home/user/project"
  3. No Markdown Export: Conversation data trapped in JSONL, not suitable for documentation or git
  4. No Visualization: Session relationships exist only in your head
  5. No Cross-Session Search: Full-text search doesn't span related sessions

Why This Matters

Without this capability, you lose:

  • Decision traceability: Which experimental branch led to this solution?
  • Context inheritance: When forking a session, what implicit knowledge is lost?
  • Blame/history: Where did this bug originate? Which session introduced it?
  • Reusability: Am I repeating the same exploration in a sibling session?
  • Documentation: Sessions as living narrative aren't shareable

Part 3: Current Tool Ecosystem (Jan 2026)

Tools That Exist

claude-sessions (GitHub: Cryjaxxx/claude-sessions)

  • What: Slash commands for manual session documentation
  • Gap: Manual, no automated extraction or fork detection

awesome-claude-code (GitHub: hesreallyhim/awesome-claude-code)

  • Tools listed:
    • claude-code-tools: Full-text search (Rust TUI), tmux integration
    • Safe hooks for dangerous commands
  • Gap: Search-focused, no fork tracing or visualization

ccswitch (Reddit, July 2025)

  • What: Go CLI for git worktree session switching
  • Gap: Tracks git, not Claude metadata

LangGraph / CrewAI / Multi-Agent Frameworks (AWS, n8n, etc.)

  • What: Agent orchestration with session replay, topology management
  • Gap: Designed for agent-to-agent, not for Claude Code data extraction
  • Relevance: Future integration target for MCP-based handoffs

Tools Explicitly NOT Suitable

  • LangSmith: Only exports LangChain traces, not Claude Code
  • Haystack: Generic LLM metadata extraction, not Claude-specific
  • GitHub Issue Agents (MAGIS): Multi-agent framework for GitHub issues, not session tracking

Conclusion

No existing tool fully addresses your needs. This is a build opportunity, not a "find and integrate" problem.


Part 4: 9-Dimensional Session Management Survey

Use this framework to evaluate any session system (existing or planned).

Dimension 1: Architecture & Coordination

Score 0-5

  • Does the system treat forks as first-class objects?
  • Is session state tracked as DAG (Directed Acyclic Graph)?
  • Can multi-agent systems reference shared sessions?
  • Storage: Human-readable (JSON) or opaque (binary)?
  • Offline extraction support?

Dimension 2: Memory & State Management

Score 0-5

  • Separation of tactical (current turn) vs. strategic (cross-session) memory?
  • Token counts persisted per message?
  • Context window exhaustion tracking?
  • CLAUDE.md file parsing and indexing?
  • Role separation (system/user/assistant) tracked?

Dimension 3: Session Lifecycle Operations

Score 0-5

  • Create, resume, fork, pause, merge operations supported?
  • Checkpoint/snapshot capability?
  • Atomic state transitions?
  • Rollback/undo support?

Dimension 4: Discovery & Retrieval

Score 0-5

  • List sessions with filtering (date, directory, status)?
  • Full-text search across conversation history?
  • Fork lineage tracing (who forked from whom)?
  • Fast search on 1000+ sessions?

Dimension 5: Export & Integration

Score 0-5

  • Markdown export with metadata?
  • JSON/JSONL export?
  • Reimport/resume from export?
  • Git integration (commit messages, branches)?
  • MCP (Model Context Protocol) compatibility?

Dimension 6: Performance & Scalability

Score 0-5

  • List 1000+ sessions in <1s?
  • Search 100K+ messages in <5s?
  • Database indexing strategy?
  • Memory usage for large datasets?

Dimension 7: Reliability & Durability

Score 0-5

  • ACID compliance (atomic writes)?
  • Crash recovery?
  • Backup/restore?
  • Data corruption detection?
  • Versioning/rollback?

Dimension 8: Security & Privacy

Score 0-5

  • File permission enforcement?
  • Encryption at rest?
  • Access control (who can read/modify)?
  • Audit trails?
  • Sensitive data redaction?

Dimension 9: Developer Experience

Score 0-5

  • CLI simplicity (single command vs. menus)?
  • IDE integration (VS Code, Cursor)?
  • Error messages clarity?
  • Documentation quality?
  • Learning curve?

Scoring Guidance

0-1: Not present or severely broken 1-2: Partially present, major gaps 2-3: Present but limited 3-4: Well-implemented, minor gaps 4-5: Excellent, production-ready

Viability Score: Average of all 9 dimensions.

  • 0-2: Not viable, fundamental gaps
  • 2-3: Viable with significant work
  • 3-4: Viable with polish
  • 4-5: Production-ready

Part 5: Fork Detection Algorithm

The Core Problem

Claude Code doesn't natively track forks. A fork occurs when:

  1. You open a previous session in Claude Code
  2. You edit an earlier message in the conversation
  3. You continue from that edited point (creating a new branch)

The fork data exists in conversation.jsonl but isn't labeled as such.

Detection Strategy

Step 1: Parse each session's conversation.jsonl Step 2: Scan for resume markers in assistant messages:

  • "resuming from previous session"
  • "continuing conversation"
  • "resuming context from"
  • "picking up where we left off"
  • "continuing from earlier"

Step 3: When detected, extract referenced session ID Step 4: Construct parent-child relationships Step 5: Build DAG (Directed Acyclic Graph) of all forks

Detection Code Pseudocode

```python def detect_forks(session_dir): """Scan conversation for fork markers.""" forks = [] with open(f"{session_dir}/conversation.jsonl") as f: for i, line in enumerate(f): msg = json.loads(line) if msg['role'] == 'assistant': content = msg.get('content', '').lower() if any(marker in content for marker in RESUME_MARKERS): # Extract parent session reference parent = extract_session_id(content) forks.append({ 'line_number': i, 'parent_session': parent, 'timestamp': msg['timestamp'] }) return forks

RESUME_MARKERS = [ "resuming from", "continuing conversation", "picking up where", "resuming context", ] ```

Why Fork Detection Matters

Without it, you lose:

  • Decision trees (which approach was tested where?)
  • Redundancy avoidance (am I exploring the same path twice?)
  • Blame tracing (which session introduced this bug?)
  • Intelligent archival (keep winners, prune failures)

Part 6: Implementation Roadmap

Phase 1: Foundation (Week 1 - 6 hours)

Goal: Functional extraction pipeline

Deliverables:

  1. SessionMetadataExtractor class

    • Parse ~/.claude/projects/*/conversation.jsonl
    • Decode directory names to absolute paths
    • Extract timestamps, message counts, token usage, tools invoked
    • Identify resume markers
  2. SQLiteSessionIndex class

    • Schema: sessions, messages, forks, tags
    • Indexes: (session_id, created_at), (directory), (full_text)
    • Enable fast filtering and search
  3. SessionMarkdownExporter class

    • Convert conversation.jsonl → human-readable .md
    • Include metadata header (duration, tokens, forks, tools)
    • Preserve code blocks and thinking tags

Checkpoint: Can extract any session, index it, export to Markdown.

Phase 2: Visualization & Search (Weeks 2-3 - 12 hours)

Goal: Interactive discovery tools

Deliverables:

  1. SessionTreeVisualizer class

    • Graphviz DOT output showing fork relationships
    • Color-coded by status (active, completed, archived)
    • Render to SVG/PNG
  2. CLI Tool: claude-session command ``` claude-session list [--dir=/path] [--days=7] claude-session search "pattern" claude-session export <id> [--format=md|json] claude-session tree <id> # Show fork hierarchy claude-session reindex # Rebuild SQLite index ```

  3. Web Dashboard (optional)

    • Gantt chart (timeline view)
    • Network graph (fork relationships)
    • Search interface with filters

Checkpoint: Can discover, filter, and visualize all sessions.

Phase 3: Production Package (Weeks 3-4 - 24 hours)

Goal: Installable, tested tool

Deliverables:

  1. Python Package: claude-session-manager

    • Setup.py with entry points
    • CLI command registration
    • Library imports for integration
  2. MCP Integration

    • Expose sessions as MCP resources
    • Allow Claude to query its own metadata
    • Enable agent-to-agent handoff
  3. Testing & Documentation

    • Unit tests for parser, indexer, exporter
    • Integration tests for CLI
    • Tutorial: "Extract Your First Session"
    • API documentation

Checkpoint: pip install claude-session-manager works. Sessions are queryable by agents.


Part 7: Architectural Decisions

Why SQLite?

SystemProsConsVerdict
SQLiteZero setup, file-based, fast full-text, ACIDNot distributedāœ… Use for local Claude Code
PostgreSQLPowerful, scales wellRequires serverāŒ Overkill
Vector DBSemantic searchExpensive, needs embeddingsāŒ Not needed yet
GitVersion control, diff, blameDoesn't store metadataāš ļø Complement, don't replace

Decision: SQLite with full-text search (FTS5 extension) for keyword matching.

Why Graphviz?

  • Text-based: Version-control friendly
  • Wide support: Mermaid, PlantUML, online viewers
  • Embeddable: Include in docs, export as SVG/PNG
  • Simple: No JavaScript framework overhead

Session State Machine

``` NEW ↓ ACTIVE → RESUMED/FORKED ↓ ↓ COMPLETED ↓ ↓ ↓ ARCHIVED ā†ā”€ā”˜ ```

Each transition logged with timestamp. Enables:

  • "Show me all completed sessions"
  • "Which sessions are in progress?"
  • "Session lifecycle metrics"

Part 8: Known Limitations & Mitigations

LimitationImpactMitigation
Directory encoding fragilitySessions orphaned if dir movesStore absolute path at creation, track renames
Fork detection heuristicResume markers not guaranteedCombine markers + timestamp analysis + manual linking UI
Thinking token opacityValuable context lostRequire Claude Code to export thinking in metadata
Cross-session context lossSibling sessions invisibleImplement context inheritance: copy CLAUDE.md, list siblings
Token count reliabilityUsage tracking inaccurateCross-validate with Claude API logs if available

Part 9: Open Questions for Your Architecture

  1. Session mutability: Are sessions immutable (like git commits) or mutable (like drafts)?
  2. Export ownership: Does user keep copy, or system keeps canonical version?
  3. Tagging granularity: Per-session, per-message, or per-code-block?
  4. Multi-agent handoff: Do agents need explicit session handoff protocol?
  5. Session templates: Reusable starting contexts for common tasks?
  6. Context inheritance: When forking, should child inherit parent's CLAUDE.md?
  7. Merge strategy: Can two sibling sessions be merged back?
  8. Retention policy: Archive old sessions? Delete on disk?
  9. Collaboration: Multiple users sharing session metadata?

Next Steps

  1. Week 1: Start with Phase 1. Get extraction working. Prove data is extractable.
  2. Weeks 2-3: Add visualization. Show fork hierarchies to your team.
  3. Weeks 3-4: Package and document. Make it installable and shareable.

Success metric: "I can trace any Claude Code session back to its origin, see which sessions forked from it, and export the full conversation with one command."