Created
February 26, 2026 16:57
# CODEX-20260225

## Scope
This document records the in-session context-rot stress test conducted on February 25, 2026, including methodology, measured results, qualitative behavior, and suggested improvements.

## Methodology

### Test Design
- Used synthetic, schema-valid JSON payload batches (`context-rot-lab-*`) with low entropy (intentionally repetitive structure).
- Ramped in doubling steps across batch sizes (`~20k` to `~20,480k` estimated tokens via char/4 heuristic).
- Interleaved natural-language pivots/questions between ramps to test semantic drift and control adherence.
- Enforced pause/resume gate behavior (`STOP WORK` + exact resume key) repeatedly to test instruction hierarchy under pressure.

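If each step exactly doubles, the ramp from ~20k to ~20,480k estimated tokens spans eleven rungs (20k × 2^10 = 20,480k). The following is a minimal sketch of that ramp and of the char/4 heuristic, not the harness used in the test: the function names `estimate_tokens` and `doubling_ramp` are hypothetical, and exact doubling with integer division is an assumption.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the char/4 heuristic (assumed integer division)."""
    return len(text) // 4


def doubling_ramp(start_k: int = 20, stop_k: int = 20_480) -> list[int]:
    """Return batch sizes in thousands of estimated tokens, doubling at each rung."""
    sizes = []
    size = start_k
    while size <= stop_k:
        sizes.append(size)
        size *= 2
    return sizes


ramp = doubling_ramp()
print(len(ramp), ramp[0], ramp[-1])  # prints: 11 20 20480
```

Because the heuristic counts characters rather than real tokens, the rung sizes are estimates; highly repetitive JSON may tokenize differently from the 4:1 ratio assumed here.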
### Measurement Protocol
- Probe suite per rung:
  - `schema_fail_files`
  - `id_prefix_fail_count`
  - `section_fail_files`