Generated: 2026-01-04
Session Focus: Timeline Curation System with Preference-Based Classification
Previous handoff: merged - preserved API context, added curation workflow
- Read `PREFERENCES.md` - contains all classification rules and user profile
- Run `git status` to verify current state
- Test cookie auth: `uv run python tests/twitter_client/foryou.py 1`
- Verify bookmark folder access: see `bookmark_to_folder.py`
Build a personal Twitter curation assistant that:
- Fetches "For You" and "Following" timelines
- Classifies tweets as relevant/skip using learned preferences
- Rewrites hype-wrapped content to neutral/concise style
- Bookmarks selected tweets to "Shitout" folder
- Eventually: Telegram integration, slash command `/digest`
- Fetch "For You" timeline via twitter-api-client
- Fetch bookmarks and analyze for preferences
- Create PREFERENCES.md with classification rules
- Build thread fetcher (tweets_details endpoint)
- Iterative testing (5 rounds) to calibrate preferences
- Bookmark to folder functionality (Shitout folder)
- Create `/digest` slash command
- Telegram bot integration
- Level 2: Fetch linked content (arxiv, articles, other tweets)
- Active file: `tests/twitter_client/bookmark_to_folder.py`
- Active task: Successfully bookmarking tweets to Shitout folder
- Last test: Test 5 - 12 surfaced, 1 rewritten, 17 skipped (user approved)
- Blocker: None - system is working
| File | Action | Purpose |
|---|---|---|
| `tests/twitter_client/thread.py` | created | Fetch complete thread via tweets_details |
| `tests/twitter_client/bookmarks.py` | created | Fetch bookmarks via cookies, save to data/ |
| `tests/twitter_client/bookmark_to_folder.py` | created | Bookmark tweets to Shitout folder |
| `PREFERENCES.md` | created | Classification rules, user profile, skip categories |
| `data/bookmarks.json` | created | Cached 100 bookmarks with metadata |
| `data/digest_2026-01-03_test*.md` | created | 5 test digest files with user feedback |
```
/Users/unclecode/devs/tmp/shitout/
├── PREFERENCES.md                       # Classification rules (CRITICAL)
├── HANDOFF.md                           # This file
├── data/
│   ├── bookmarks.json                   # 100 bookmarks cached
│   └── digest_2026-01-03_test[1-5].md   # Test iterations with feedback
└── tests/twitter_client/
    ├── foryou.py                        # For You timeline
    ├── following.py                     # Following timeline
    ├── thread.py                        # Full thread fetcher
    ├── bookmarks.py                     # Fetch bookmarks
    └── bookmark_to_folder.py            # Add to Shitout folder
```
- Two-stage classification:
  - Stage 1: Is this relevant to the user? (binary)
  - Stage 2: Does it need rewriting? (if relevant)
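A minimal sketch of the two-stage flow above. The rule tables (`TRUSTED`, `AGGREGATORS`, `SKIP_TERMS`) and the `Verdict` type are illustrative stand-ins, not existing code; the real rules live in `PREFERENCES.md`:

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    relevant: bool
    needs_rewrite: bool = False


def classify(tweet_text: str, author: str) -> Verdict:
    """Two-stage classification: relevance first, then rewrite check."""
    TRUSTED = {"tom_doerr", "paulg"}          # surface as-is
    AGGREGATORS = {"akshay_pachaar"}          # good content, bad language
    SKIP_TERMS = ("gaussian splat", "dspy")   # sample skip triggers

    text = tweet_text.lower()
    # Stage 1: is this relevant to the user at all?
    if any(term in text for term in SKIP_TERMS):
        return Verdict(relevant=False)
    # Stage 2: relevant - does it need a rewrite before surfacing?
    if author in AGGREGATORS:
        return Verdict(relevant=True, needs_rewrite=True)
    return Verdict(relevant=author in TRUSTED)
```

The real classifier would filter by the user's expertise profile rather than keyword lists; this only illustrates the two-stage shape.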
- User profile is critical:
  - Expert-level: CS, ML, web dev, data extraction, Crawl4AI author
  - Skip anything below expertise level
  - Skip reinventing the wheel (makes the user angry)
- Source categories:
  - Trusted curators (surface as-is): @tom_doerr, @paulg
  - Tech aggregators (rewrite, don't skip): @akshay_pachaar
  - Major AI companies (surface new releases): Google, OpenAI, Anthropic, etc.
  - Local LLM content: relevant to business
- Shitout folder ID: `2007677808093606181`
- Symptom: Only got the main tweet, not the thread continuations
- Root cause: `tweets_by_id` is a single-tweet fetch
- Solution: Use the `tweets_details` endpoint - it includes `threaded_conversation_with_injections_v2`
- Symptom: `bookmark_collection_id` has a coerced Null value
- Root cause: Used `folder_id` instead of `bookmark_collection_id`
- Solution: `variables = {"tweet_id": str(id), "bookmark_collection_id": folder_id}`
- Symptom: Test 1 had 28% accuracy, surfaced irrelevant content
- Root cause: Classified on content quality, not relevance to user
- Solution: Added user expertise profile to PREFERENCES.md, filter by relevance first
- @akshay_pachaar: Content is GOOD, language is BAD → rewrite, don't skip
- @googleaidevs: NOT "product marketing" → surface new releases
- @RhysSullivan (Playwright): Skip - user is Crawl4AI author, knows Playwright deeply
- @leerob "AI writes my code. Now what?": Vague BS → skip
- DSPy content: User explicitly said skip
- Tried: Classifying by topic categories (AI, OSS, etc.)
  Failed because: The user doesn't care about topics; they care about quality and novelty relative to their expertise
- Tried: Surfacing "builder made something" generically
  Failed because: remotosh (a terminal streamer) is reinventing Termius/SSH - it made the user angry
- Tried: Skipping all product marketing from major companies
  Failed because: The user wants to know what Google/OpenAI/etc. are shipping
```python
from twitter.constants import Operation

def bookmark_to_folder(account, tweet_id: int, folder_id: str) -> dict:
    variables = {
        "tweet_id": str(tweet_id),
        "bookmark_collection_id": folder_id
    }
    return account.gql('POST', Operation.bookmarkTweetToFolder, variables)
```

```python
# Use tweets_details, not tweets_by_id
raw = scraper.tweets_details([int(tweet_id)])

# Navigate to thread entries
instructions = (item.get('data', {})
                .get('threaded_conversation_with_injections_v2', {})
                .get('instructions', []))

# Filter to author's tweets only (self-thread)
tweets = [t for t in tweets if t['user']['username'] == author_username]
```

```python
from twitter.constants import Operation

result = account.gql('GET', Operation.BookmarkFoldersSlice, {})
# Returns items with id, name, media
```

```python
# tweets_by_ids returns array in tweetResult
raw = scraper.tweets_by_ids(tweet_ids)
for item in raw:
    results = item.get('data', {}).get('tweetResult', [])
    for r in results:
        legacy = r.get('result', {}).get('legacy', {})
        text = legacy.get('full_text', '')
```

- Read `PREFERENCES.md` to understand classification rules
- Run test: `uv run python tests/twitter_client/foryou.py 2`
- Apply preferences manually or build automation
- Create digest markdown with surfaced tweets
- Bookmark selected to Shitout: `uv run python tests/twitter_client/bookmark_to_folder.py <ids>`
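The "create digest markdown" step above could be sketched as a small writer following the `data/digest_*.md` naming pattern. The `(tweet_id, author, text)` tuple shape is an assumption for illustration, not the real client's objects:

```python
from datetime import date


def write_digest(surfaced, rewritten, out_dir="data"):
    """Render surfaced and rewritten tweets to a dated digest markdown file.

    `surfaced` and `rewritten` are lists of (tweet_id, author, text)
    tuples - an assumed shape, not the actual twitter-api-client types.
    """
    today = date.today().isoformat()
    lines = [f"# Digest {today}", ""]
    lines.append(f"## Surfaced ({len(surfaced)})")
    for tid, author, text in surfaced:
        lines.append(f"- @{author}: {text} (https://x.com/i/status/{tid})")
    lines.append("")
    lines.append(f"## Rewritten ({len(rewritten)})")
    for tid, author, text in rewritten:
        lines.append(f"- @{author}: {text} (https://x.com/i/status/{tid})")
    path = f"{out_dir}/digest_{today}.md"
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")
    return path
```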
```bash
# Fetch timeline
uv run python tests/twitter_client/foryou.py 3

# Manually classify based on PREFERENCES.md rules
# Create digest file in data/

# Bookmark selected tweets
uv run python tests/twitter_client/bookmark_to_folder.py 123456789 987654321
```

- Python: 3.11 (via uv)
- Virtual env: `.venv/`
- Package manager: `uv`
- Test command: `uv run python tests/twitter_client/<script>.py`
- Telegram bot token: in `~/devs/tmp/poly/.env` (TELEGRAM_BOT_TOKEN, TELEGRAM_CHAT_ID)
| Rank | Account | Count | Pattern |
|---|---|---|---|
| 1 | @tom_doerr | 10 | OSS repo curator |
| 2 | @claudeai | 4 | Claude/Anthropic official |
| 3 | @akshay_pachaar | 3 | Tech aggregator (rewrite) |
| 4 | @GithubProjects | 3 | OSS project shares |
| 5 | @alexalbert__ | 3 | Anthropic team |
| 6 | @CodePen | 3 | Code demos |
| 7 | @omarsar0 | 2 | ML resources |
| 8 | @UnslothAI | 2 | Fine-tuning tools |
77 unique accounts - user has broad interests, no single dominant source.
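The ranking above can be regenerated from the cached bookmarks with a simple counter. This sketch assumes each entry in `bookmarks.json` carries the author's handle under `user.username`; adjust the key path to the file's actual schema:

```python
import json
from collections import Counter


def top_sources(path="data/bookmarks.json", n=8):
    """Rank bookmark authors by frequency (assumed user.username schema)."""
    with open(path) as f:
        bookmarks = json.load(f)
    counts = Counter(b["user"]["username"] for b in bookmarks)
    return counts.most_common(n)
```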
User: @unclecode - CEO/CTO, Crawl4AI author
Expert in: CS, ML, web dev, data extraction, fine-tuning, infra
SURFACE:
- Quality OSS repos (tom_doerr, GithubProjects style)
- Claude/Anthropic content (@claudeai, @alexalbert__)
- Major AI company new releases (Google, OpenAI, etc.)
- Local LLM / fine-tuning content (@UnslothAI)
- ML resources (@omarsar0)
- paulg startup insights
REWRITE (don't skip):
- Tech aggregators (akshay_pachaar) - good content, bad language
SKIP:
- Below expertise level
- Reinventing the wheel
- Vague predictions/BS questions
- Video gen, 3D graphics, Gaussian splats
- DSPy, nanoGPT variants
- UI/UX libraries
- Insider jokes
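If the SURFACE / REWRITE / SKIP lists need to be read programmatically, a loose parser could pull them out of `PREFERENCES.md`. This assumes the file keeps the `LABEL:` plus `- item` layout shown above; the function name is hypothetical:

```python
def load_preference_sections(path="PREFERENCES.md"):
    """Parse uppercase 'LABEL:' sections and their '- item' bullets.

    Assumes the SURFACE:/REWRITE (don't skip):/SKIP: layout shown in
    this handoff; returns {label: [items]}.
    """
    sections, current = {}, None
    with open(path) as f:
        for raw in f:
            line = raw.strip()
            # A section header is an all-caps word ending in ':'
            if line.endswith(":") and line[:-1].split(" (")[0].isupper():
                current = line[:-1].split(" (")[0]
                sections[current] = []
            elif line.startswith("- ") and current:
                sections[current].append(line[2:])
    return sections
```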
- URL: https://gist.github.com/unclecode/7225a3aa77e0bc66887f45ffb2a38db5
- Content: Test 5 digest
- Action: User will ask to delete after review