@tomfuertes
Last active March 8, 2026
Claude Code /convos skill - extract session history into queryable markdown files

# /convos - Claude Code Conversation Extractor

Extract Claude Code session history into readable, queryable markdown files.

## What it does

Reads `~/.claude/projects/<project>/*.jsonl` session files and produces structured markdown in `convos/` with:

  • Your prompts (full text)
  • Assistant responses (text only, tool calls compressed to one-liners)
  • Insight blocks collected into a summary section
  • Metadata table (date, branch, duration, commits)
  • Index file with topic clustering

Thinking blocks, system reminders, and progress noise are stripped.

## Usage

```
/convos                              # current project
/convos /path/to/project             # specific project
/convos --since=2026-01-01           # date filter
/convos --session=<uuid>             # single session
/convos --reindex                    # regenerate existing files
```

## Setup

Drop `convos.md` into `.claude/commands/` (per-repo) or `~/.claude/commands/` (global). Add `convos/` to `.gitignore`; these are local extracts, not version-controlled.
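The per-repo variant can be scripted; a minimal sketch, assuming you saved the gist file as `convos.md` in the current directory:

```sh
# Per-repo install: copy the command file and gitignore the output dir.
mkdir -p .claude/commands
cp convos.md .claude/commands/convos.md
# Append convos/ to .gitignore only if it's not already listed.
grep -qxF 'convos/' .gitignore 2>/dev/null || echo 'convos/' >> .gitignore
```

The `grep -qxF` guard keeps the step idempotent, so rerunning it never duplicates the `.gitignore` entry.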

## Output

```
convos/
  INDEX.md
  2026-02-10-daft-punk-sound-hooks.md
  2026-02-15-devcontainer-firewall-setup.md
  2026-03-01-claude-md-restructure.md
```
```yaml
---
name: convos
description: Extract past Claude Code conversations into readable markdown files in convos/. Highlights user prompts and assistant insights. Use for interview prep, project history, and knowledge mining.
args:
  - name: args
    description: "[project-path] [--since=YYYY-MM-DD] [--session=UUID] [--reindex]"
    required: false
---
```

# Convos - Conversation Extractor

Extract Claude Code session history into queryable markdown files. Optimized for surfacing high-signal content (your prompts, insights, decisions, architecture reasoning) while compressing low-signal content (tool I/O, file reads, bash output).

## Overview

Conversations live at `~/.claude/projects/<mangled-path>/*.jsonl`. Each JSONL line has:

  • type: user | assistant | progress | system | file-history-snapshot | queue-operation
  • timestamp: ISO 8601
  • message.content: string (user) or array of blocks (assistant)
  • gitBranch, sessionId, version
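To get a feel for a session's composition before extracting, a quick jq probe can tally the line types (`session.jsonl` is a placeholder path):

```sh
# Count how many lines of each type a session contains,
# most frequent first. Lines without a type tally as "unknown".
jq -r '.type // "unknown"' session.jsonl | sort | uniq -c | sort -rn
```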

## Step 0: Resolve Target Project

Determine which project to extract from:

  1. If $ARGUMENTS contains a project path, mangle it: replace / with -, prepend -
    • e.g. /Users/tom/sandbox/git-repos/foo -> -Users-tom-sandbox-git-repos-foo
  2. If no path given, use the current repo's mangled path
  3. Verify ~/.claude/projects/<mangled>/ exists and has .jsonl files
```sh
PROJECT_DIR="$HOME/.claude/projects/<mangled>"
ls "$PROJECT_DIR"/*.jsonl 2>/dev/null | wc -l
```

Set `OUTPUT_DIR` to `convos/` in the current working directory (the repo you're in).
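The mangling rule above can be sketched as a one-liner; since absolute paths start with `/`, translating every `/` to `-` also produces the required leading dash:

```sh
# Mangle an absolute project path into the directory name described above.
mangle() { printf '%s\n' "$1" | tr '/' '-'; }
mangle /Users/tom/sandbox/git-repos/foo   # -> -Users-tom-sandbox-git-repos-foo
```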

## Step 1: Enumerate Sessions

```sh
# List all sessions with line counts and date range
for f in "$PROJECT_DIR"/*.jsonl; do
  ID=$(basename "$f" .jsonl)
  LINES=$(wc -l < "$f")
  FIRST_TS=$(jq -r 'select(.timestamp) | .timestamp' "$f" | head -1)
  echo "$FIRST_TS $LINES $ID"
done | sort
```

Filter by `--since=` if provided. If `--session=` is provided, process only that session.

Skip sessions with <10 lines (hook-only noise).

If --reindex not set, skip sessions that already have a markdown file in convos/.
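Combining the filters above, a sketch in bash (assumes `PROJECT_DIR` and an optional `SINCE` date are set; plain lexicographic `<` comparison works because the dates are ISO 8601):

```sh
SINCE="${SINCE:-}"                      # e.g. 2026-01-01, empty = no filter
for f in "$PROJECT_DIR"/*.jsonl; do
  [ "$(wc -l < "$f")" -lt 10 ] && continue           # hook-only noise
  FIRST_TS=$(jq -r 'select(.timestamp) | .timestamp' "$f" | head -1)
  if [ -n "$SINCE" ] && [ "${FIRST_TS%%T*}" \< "$SINCE" ]; then continue; fi
  echo "$f"                                          # session to process
done
```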

## Step 2: Extract Each Session

For each session JSONL, extract into a structured markdown file. This is the core logic - it runs per session and requires judgment, not just jq piping.

### 2a: Metadata Header

```sh
# Extract from JSONL
jq -r 'select(.timestamp) | .timestamp' "$SESSION" | head -1  # start
jq -r 'select(.timestamp) | .timestamp' "$SESSION" | tail -1  # end
jq -r 'select(.gitBranch) | .gitBranch' "$SESSION" | head -1  # branch
jq -r '.version // empty' "$SESSION" | head -1                 # CC version
```
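The Duration field can be derived from the first and last timestamps; a sketch assuming GNU `date` (on macOS, `date -d` is not available, so coreutils' `gdate` would be needed instead):

```sh
START=$(jq -r 'select(.timestamp) | .timestamp' "$SESSION" | head -1)
END=$(jq -r 'select(.timestamp) | .timestamp' "$SESSION" | tail -1)
# Whole minutes between the two ISO 8601 timestamps.
echo "~$(( ($(date -d "$END" +%s) - $(date -d "$START" +%s)) / 60 ))m"
```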

### 2b: Content Extraction (jq patterns)

User prompts (ALWAYS include, full text):

```sh
jq -r 'select(.type == "user") |
  "---\n### " + .timestamp + " [USER]\n\n" +
  (if .message.content | type == "string" then .message.content
   elif .message.content | type == "array" then
     [.message.content[] | select(.type == "text") | .text] | join("\n")
   else "" end)' "$SESSION"
```

Assistant text blocks (include, but filter):

```sh
jq -r 'select(.type == "assistant") |
  .message.content[] | select(.type == "text") | .text' "$SESSION"
```

Tool use blocks (COMPRESS - just tool name + file path or command summary):

```sh
jq -r 'select(.type == "assistant") |
  .message.content[] | select(.type == "tool_use") |
  "  > " + .name + ": " +
  (if .name == "Read" then .input.file_path
   elif .name == "Bash" then (.input.description // (.input.command | .[0:80]))
   elif .name == "Edit" then .input.file_path
   elif .name == "Write" then .input.file_path
   elif .name == "Glob" then .input.pattern
   elif .name == "Grep" then .input.pattern
   else (.input | tostring | .[0:80])
   end)' "$SESSION"
```

Thinking blocks (OMIT by default - these are internal reasoning, not user-facing insights).

Progress/system (OMIT entirely - hook execution noise).

### 2c: Insight Extraction (HIGH PRIORITY)

Insights are the most valuable content. Extract them separately for the summary section:

```sh
jq -r 'select(.type == "assistant") |
  .message.content[] | select(.type == "text") | .text' "$SESSION" |
  awk '/★ Insight/{found=1} found{print} /^─+$/{if(found){print ""; found=0}}'
```

### 2d: Commit Detection

Sessions often end with commits. Extract any git SHAs mentioned:

```sh
jq -r 'select(.type == "assistant") |
  .message.content[] | select(.type == "text") | .text' "$SESSION" |
  grep -oE '[0-9a-f]{7,40}' | sort -u
```

Cross-reference with git log --oneline to find actual commits from this session.
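One way to do the cross-reference: keep only the candidates that resolve to real commits in the repo (`candidate-shas.txt` is a hypothetical file holding the grep output above):

```sh
# Print a one-line summary for each candidate that is an actual commit;
# hex strings that don't resolve are silently dropped.
while read -r SHA; do
  git cat-file -e "$SHA^{commit}" 2>/dev/null && git log -1 --oneline "$SHA"
done < candidate-shas.txt
```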

## Step 3: Assemble Markdown

### File Naming Convention

`convos/YYYY-MM-DD-<slug>.md`

Where <slug> is derived from the session content:

  • Read the first user message
  • Generate a 3-5 word kebab-case slug capturing the topic
  • If multiple sessions on same day + same topic, append -2, -3

Examples:

  • 2026-02-10-daft-punk-sound-hooks.md
  • 2026-02-15-devcontainer-firewall-setup.md
  • 2026-03-01-claude-md-restructure.md
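The agent normally writes the slug itself, but the rule is mechanical enough to sketch as a hypothetical helper:

```sh
# Lowercase, collapse non-alphanumeric runs to dashes, keep at most 5 words.
slugify() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]' \
    | tr -cs 'a-z0-9' '-' | sed 's/^-//; s/-$//' | cut -d- -f1-5
}
slugify "Restructure CLAUDE.md for clarity"   # -> restructure-claude-md-for-clarity
```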

### Markdown Structure

```markdown
# <Title derived from first user message>

| Field | Value |
|-------|-------|
| Date | YYYY-MM-DD |
| Session | `<uuid>` |
| Branch | `<branch>` |
| Duration | ~Xm |
| Commits | `abc1234`, `def5678` |

## Insights

> Collected from Insight blocks throughout the session.

<all insight blocks, in order>

## Conversation

### HH:MM [You]

<user prompt text>

### HH:MM [Claude]

<assistant text, with tool calls compressed to one-line summaries>

  > Read: path/to/file.ts
  > Bash: Run tests
  > Edit: path/to/file.ts

<next assistant text block>

### HH:MM [You]

...

## Files Touched

- `path/to/file.ts` (edited)
- `path/to/new-file.sh` (created)
```

Key formatting rules:

  • Strip <system-reminder> tags from user messages
  • Strip HTML from user messages where it's clearly paste noise (keep if it's the actual content like HTML templates)
  • Tool summaries are indented blockquotes (one line each)
  • Consecutive tool calls group together without blank lines between them
  • User messages preserve original formatting (code blocks, lists, etc.)
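Stripping the reminder tags needs a multi-line-aware substitution, since the tags often span lines; a sketch using perl (`message.txt` is a placeholder for one extracted user message):

```sh
# Remove <system-reminder>...</system-reminder> spans, even across newlines.
# -0777 slurps the whole file; /s lets . match newlines; .*? stays non-greedy.
perl -0777 -pe 's/<system-reminder>.*?<\/system-reminder>//gs' message.txt
```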

## Step 4: Index File

After processing all sessions, generate convos/INDEX.md:

```markdown
# Conversation Index - <project-name>

Generated: YYYY-MM-DD | Sessions: N | Date range: YYYY-MM-DD to YYYY-MM-DD

## By Date

| Date | Title | Duration | Insights | Commits |
|------|-------|----------|----------|---------|
| 2026-03-07 | [Title](./2026-03-07-slug.md) | ~15m | 3 | `abc1234` |
| ... | ... | ... | ... | ... |

## Topics

<group sessions by detected topic clusters - e.g. "DevContainer", "Hooks", "Sound System", "CLAUDE.md">
```

## Processing Strategy

  • <20 sessions: Process sequentially in this agent
  • 20-50 sessions: Spawn 3-4 sonnet subagents, each handling a batch of sessions
  • 50+ sessions: Spawn subagents in waves of 5, wait for completion, then next wave

Each subagent gets: the JSONL path, the output dir, and the naming convention. The lead agent handles the INDEX.md assembly.

## Guardrails

  • Output to convos/ in CWD only. Never write to ~/.claude/.
  • convos/ should be gitignored (add to .gitignore if not already present). These are local extracts, not version-controlled artifacts.
  • If a session JSONL is >5000 lines, warn the user and ask before processing (it will be slow).
  • Truncate individual user messages at 2000 chars in the output (they can be very long pastes).
  • Never include thinking block content in output.
  • Strip system-reminder tags completely.
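The 2000-char truncation rule, sketched as a hypothetical bash helper (`${msg:0:2000}` is a bash-only substring expansion):

```sh
# Truncate one message body at 2000 characters, marking the cut.
truncate_msg() {
  local msg=$1
  if [ "${#msg}" -gt 2000 ]; then
    printf '%s… [truncated]\n' "${msg:0:2000}"
  else
    printf '%s\n' "$msg"
  fi
}
```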