| name | description | args |
|---|---|---|
| convos | Extract past Claude Code conversations into readable markdown files in `convos/`. Highlights user prompts and assistant insights. Use for interview prep, project history, and knowledge mining. | |
Extract Claude Code session history into queryable markdown files. Optimized for surfacing high-signal content (your prompts, insights, decisions, architecture reasoning) while compressing low-signal content (tool I/O, file reads, bash output).
Conversations live at `~/.claude/projects/<mangled-path>/*.jsonl`. Each JSONL line has:
- `type`: `user` | `assistant` | `progress` | `system` | `file-history-snapshot` | `queue-operation`
- `timestamp`: ISO 8601
- `message.content`: string (user) or array of blocks (assistant)
- `gitBranch`, `sessionId`, `version`
Determine which project to extract from:
- If `$ARGUMENTS` contains a project path, mangle it: replace `/` with `-` and prepend `-`, e.g. `/Users/tom/sandbox/git-repos/foo` -> `-Users-tom-sandbox-git-repos-foo`
- If no path is given, use the current repo's mangled path
- Verify that `~/.claude/projects/<mangled>/` exists and has `.jsonl` files
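The mangling rule can be sketched in shell (a minimal sketch following the replace-and-prepend rule stated above; because an absolute path starts with `/`, a single `tr` produces the leading `-` too):

```shell
# Mangle an absolute project path into Claude's directory name.
# The leading "/" becomes the leading "-", so no separate prepend is needed.
PROJECT_PATH="/Users/tom/sandbox/git-repos/foo"
MANGLED=$(printf '%s' "$PROJECT_PATH" | tr '/' '-')
echo "$MANGLED"   # -Users-tom-sandbox-git-repos-foo
```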
```shell
PROJECT_DIR="$HOME/.claude/projects/<mangled>"
ls "$PROJECT_DIR"/*.jsonl 2>/dev/null | wc -l
```

Set `OUTPUT_DIR` to `convos/` in the current working directory (the repo you're in).
```shell
# List all sessions with line counts and date range
for f in "$PROJECT_DIR"/*.jsonl; do
  ID=$(basename "$f" .jsonl)
  LINES=$(wc -l < "$f")
  FIRST_TS=$(jq -r 'select(.timestamp) | .timestamp' "$f" | head -1)
  echo "$FIRST_TS $LINES $ID"
done | sort
```

Filter by `--since=` if provided. If `--session=` is provided, process only that one.
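Because the timestamps are ISO 8601, the `--since=` filter can be a plain string comparison on the sorted listing (a sketch; `sessions.txt` is a hypothetical file holding the loop's output):

```shell
# Keep sessions whose first timestamp is on or after the cutoff.
# ISO 8601 strings sort lexicographically, so >= works directly on strings.
SINCE="2026-02-01"
awk -v since="$SINCE" '$1 >= since' sessions.txt
```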
Skip sessions with <10 lines (hook-only noise).
If `--reindex` is not set, skip sessions that already have a markdown file in `convos/`.
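One way to detect already-extracted sessions (a sketch, assuming each output file records its session UUID in the metadata table; `already_extracted` is a hypothetical helper):

```shell
# Returns success if any existing convos/*.md file mentions this session UUID.
already_extracted() {
  grep -rq "$1" convos/ 2>/dev/null
}

already_extracted "$SESSION_ID" || echo "needs extraction"
```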
For each session JSONL, extract into a structured markdown file. This is the core logic - it runs per session and requires judgment, not just jq piping.
```shell
# Extract session metadata from JSONL
jq -r 'select(.timestamp) | .timestamp' "$SESSION" | head -1   # start
jq -r 'select(.timestamp) | .timestamp' "$SESSION" | tail -1   # end
jq -r 'select(.gitBranch) | .gitBranch' "$SESSION" | head -1   # branch
jq -r '.version // empty' "$SESSION" | head -1                 # CC version
```

User prompts (ALWAYS include, full text):
```shell
jq -r 'select(.type == "user") |
  "---\n### " + .timestamp + " [USER]\n\n" +
  (if .message.content | type == "string" then .message.content
   elif .message.content | type == "array" then
     [.message.content[] | select(.type == "text") | .text] | join("\n")
   else "" end)' "$SESSION"
```

Assistant text blocks (include, but filter):
```shell
jq -r 'select(.type == "assistant") |
  .message.content[] | select(.type == "text") | .text' "$SESSION"
```

Tool use blocks (COMPRESS - just tool name + file path or command summary):
```shell
jq -r 'select(.type == "assistant") |
  .message.content[] | select(.type == "tool_use") |
  " > " + .name + ": " +
  (if .name == "Read" then .input.file_path
   elif .name == "Bash" then (.input.description // (.input.command | .[0:80]))
   elif .name == "Edit" then .input.file_path
   elif .name == "Write" then .input.file_path
   elif .name == "Glob" then .input.pattern
   elif .name == "Grep" then .input.pattern
   else (.input | tostring | .[0:80])
   end)' "$SESSION"
```

Thinking blocks (OMIT by default - these are internal reasoning, not user-facing insights).

Progress/system (OMIT entirely - hook execution noise).
Insights are the most valuable content. Extract them separately for the summary section:
```shell
jq -r 'select(.type == "assistant") |
  .message.content[] | select(.type == "text") | .text' "$SESSION" |
awk '/★ Insight/{found=1} found{print} /^`─+`$/{if(found) {print ""; found=0}}'
```

Sessions often end with commits. Extract any git SHAs mentioned:
```shell
jq -r 'select(.type == "assistant") |
  .message.content[] | select(.type == "text") | .text' "$SESSION" |
grep -oE '[0-9a-f]{7,40}' | sort -u
```

Cross-reference with `git log --oneline` to find actual commits from this session.
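The cross-reference can be done per candidate SHA (a sketch; `candidate_shas.txt` is a hypothetical file holding the `sort -u` output, and `git cat-file -e` checks that the object exists as a commit):

```shell
# Keep only candidates that are real commits in this repo's history.
while read -r SHA; do
  git cat-file -e "$SHA^{commit}" 2>/dev/null && echo "$SHA"
done < candidate_shas.txt
```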
`convos/YYYY-MM-DD-<slug>.md`

Where `<slug>` is derived from the session content:
- Read the first user message
- Generate a 3-5 word kebab-case slug capturing the topic
- If multiple sessions fall on the same day with the same topic, append `-2`, `-3`, ...

Examples:
- `2026-02-10-daft-punk-sound-hooks.md`
- `2026-02-15-devcontainer-firewall-setup.md`
- `2026-03-01-claude-md-restructure.md`
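Picking the topic takes judgment, but the mechanical cleanup can be sketched as follows (`slugify` is a hypothetical helper: lowercase, non-alphanumerics to dashes, first five words):

```shell
# Turn a short topic phrase into a kebab-case slug.
slugify() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]' \
    | tr -cs 'a-z0-9' '-' | sed 's/^-//; s/-$//' | cut -d- -f1-5
}
slugify "Set up DevContainer firewall rules"   # set-up-devcontainer-firewall-rules
```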
```markdown
# <Title derived from first user message>

| Field | Value |
|-------|-------|
| Date | YYYY-MM-DD |
| Session | `<uuid>` |
| Branch | `<branch>` |
| Duration | ~Xm |
| Commits | `abc1234`, `def5678` |

## Insights

> Collected from Insight blocks throughout the session.

<all insight blocks, in order>

## Conversation

### HH:MM [You]

<user prompt text>

### HH:MM [Claude]

<assistant text, with tool calls compressed to one-line summaries>

> Read: path/to/file.ts
> Bash: Run tests
> Edit: path/to/file.ts

<next assistant text block>

### HH:MM [You]

...

## Files Touched

- `path/to/file.ts` (edited)
- `path/to/new-file.sh` (created)
```

Formatting notes:
- Strip `<system-reminder>` tags from user messages
- Strip HTML from user messages where it's clearly paste noise (keep it if it's the actual content, such as HTML templates)
- Tool summaries are indented blockquotes (one line each)
- Consecutive tool calls group together without blank lines between them
- User messages preserve original formatting (code blocks, lists, etc.)
After processing all sessions, generate `convos/INDEX.md`:

```markdown
# Conversation Index - <project-name>

Generated: YYYY-MM-DD | Sessions: N | Date range: YYYY-MM-DD to YYYY-MM-DD

## By Date

| Date | Title | Duration | Insights | Commits |
|------|-------|----------|----------|---------|
| 2026-03-07 | [Title](./2026-03-07-slug.md) | ~15m | 3 | `abc1234` |
| ... | ... | ... | ... | ... |

## Topics

<group sessions by detected topic clusters, e.g. "DevContainer", "Hooks", "Sound System", "CLAUDE.md">
```

Batching:
- <20 sessions: process sequentially in this agent
- 20-50 sessions: spawn 3-4 sonnet subagents, each handling a batch of sessions
- 50+ sessions: spawn subagents in waves of 5, wait for each wave to complete, then launch the next

Each subagent gets the JSONL path, the output dir, and the naming convention. The lead agent handles the INDEX.md assembly.
- Output to `convos/` in CWD only. Never write to `~/.claude/`.
- `convos/` should be gitignored (add it to `.gitignore` if not already present). These are local extracts, not version-controlled artifacts.
- If a session JSONL is >5000 lines, warn the user and ask before processing (it will be slow).
- Truncate individual user messages at 2000 chars in the output (they can be very long pastes).
- Never include thinking block content in output.
- Strip `<system-reminder>` tags completely.
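The gitignore step can be sketched as an idempotent one-liner (exact fixed-line match, so `convos/` is only appended once):

```shell
# Append convos/ to .gitignore only if that exact line is not already there.
grep -qxF 'convos/' .gitignore 2>/dev/null || echo 'convos/' >> .gitignore
```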