@MattMatheus
Created February 26, 2026 16:57
# CODEX-20260225

## Scope
This document records the in-session context-rot stress test conducted on February 25, 2026, including methodology, measured results, qualitative behavior, and suggested improvements.

## Methodology

### Test Design
- Used synthetic, schema-valid JSON payload batches (`context-rot-lab-*`) with low entropy (intentionally repetitive structure).
- Ramped in doubling steps across batch sizes (`~20k` to `~20,480k` estimated tokens via char/4 heuristic).
- Interleaved natural-language pivots/questions between ramps to test semantic drift and control adherence.
- Enforced pause/resume gate behavior (`STOP WORK` + exact resume key) repeatedly to test instruction hierarchy under pressure.

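The doubling ramp and the char/4 estimate described above can be sketched as follows. This is a minimal illustration, not the harness used in the test: the function names, the record fields (`id`, `section`, `status`), and the use of `context-rot-lab-` as an ID prefix are all assumptions made for the example.

```python
import json


def estimate_tokens(text: str) -> int:
    """Rough token estimate using the char/4 heuristic."""
    return len(text) // 4


def ramp_sizes(start: int = 20_000, stop: int = 20_480_000) -> list[int]:
    """Return batch sizes in estimated tokens, doubling from start up to stop."""
    sizes = []
    size = start
    while size <= stop:
        sizes.append(size)
        size *= 2
    return sizes


def make_batch(target_tokens: int, id_prefix: str = "context-rot-lab-") -> str:
    """Build a repetitive, low-entropy JSON array of at least target_tokens.

    The record layout is hypothetical; only the ID counter varies per record.
    """
    records = []
    chars = 0
    i = 0
    while chars // 4 < target_tokens:
        rec = {"id": f"{id_prefix}{i:08d}", "section": "alpha", "status": "ok"}
        # The two brackets plus the n-1 ", " separators total 2 chars per
        # record, so this running count equals the serialized length.
        chars += len(json.dumps(rec)) + 2
        records.append(rec)
        i += 1
    return json.dumps(records)
```

Calling `ramp_sizes()` with these defaults yields eleven rungs, since 20k doubled ten times is 20,480k. A small call such as `make_batch(500)` is enough to check the payload shape before generating the larger rungs.
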
### Measurement Protocol
- Probe suite per rung:
  - `schema_fail_files`
  - `id_prefix_fail_count`
  - `section_fail_files`