# Protocol Zero: The Safety Check Missing From 99% of System Prompts

## The **“I Know This” Trap**: why your AI serves *stale truths* (and how to fix it)
> Short version: powerful LLMs are great at regurgitating what they “know” — which makes them dangerously confident when their training data is out of date. Force them to audit their own assumptions.
---
## Opening scene
I had a chat with **Gemini 3.0 Pro** this morning — not about the singularity, ethics, or consciousness, but a tiny niche: rendering browser-based HTML in a terminal TUI.
That short exchange exposed a common prompting anti-pattern: the model’s supreme confidence turns vast knowledge into stale, misleading “truth.”
---
## The problem (tl;dr)
- LLMs often skip live verification when their **internal confidence** is high.
- Result: they return **“stale truths”** — answers that were correct at training time but are now obsolete.
- This is especially dangerous for time-sensitive domains: finance, law, medicine, security, product & dependency status, etc.
---
## The scenario: a “perfect” wrong answer
I asked my **Deep Research Assistant** (three roles: *Archivist*, *Weaver*, *Sage*):
> *“What tools exist to render browser-based HTML in a terminal TUI, handling JavaScript via Node.js?”*
The LLM returned a polished report and confidently recommended **Carbonyl** as “best in class” and **Ink** as the go-to library. Looks great — until you check.
- Carbonyl was presented from the model’s internal weights (high confidence).
- The model *did not* run web searches — it assumed it “knew” the answer.
- The result: a credible but **outdated** recommendation — a classic *stale truth*.
---
## Why this happens: *Optimisation vs Rigour*
When asked to “search if needed,” the model runs a quick internal cost/benefit calculation:
1. **Retrieval cost:** invoking web tools costs time, tokens, and complexity.
2. **Internal confidence:** are there strong, high-probability tokens in its weights that match the query?
If internal confidence beats perceived retrieval cost, the model shortcuts: **no web check → high-confidence stale answer**.
I call this the **Confidence Trap**: the more expert the model *seems* on a topic, the less likely it is to double-check.
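To make that trade-off concrete, here is a toy sketch in Python. Nothing here reflects any real model's internals; `internal_confidence`, `retrieval_cost`, and the threshold comparison are invented purely to illustrate the shortcut.

```python
# Toy illustration of the Confidence Trap heuristic described above.
# All names and numbers are invented; no real model works this way literally.

def answer_from_weights(query: str) -> str:
    return f"(from training data) best answer for: {query}"

def answer_from_web_search(query: str) -> str:
    return f"(from live search) verified answer for: {query}"

def answer(query: str, internal_confidence: float, retrieval_cost: float) -> str:
    """Return an answer, paying for retrieval only when memory looks weak."""
    if internal_confidence > retrieval_cost:
        # Shortcut: high-probability tokens in the weights "win", so the
        # model answers from memory -- possibly a stale truth.
        return answer_from_weights(query)
    return answer_from_web_search(query)

# A well-covered niche topic yields high internal confidence, so the web
# check is skipped -- exactly the failure mode described above.
print(answer("best terminal HTML renderer",
             internal_confidence=0.9, retrieval_cost=0.3))
```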
---
## The fix: **Protocol Zero — the Freshness Audit**
Don’t let the model decide for itself whether to search. Make it *suspect its own memory*.
### Protocol (summary)
1. **Invert the burden of proof.** Don’t ask “What’s best?” Ask the model to assume an answer and *prove it wrong*.
2. **Produce a Delta Report** before answering: compare internal assumption vs live web findings.
3. **Force a null-hypothesis search**: actively look for evidence the internal answer is obsolete (e.g. “Is [Tool X] dead?”, “security vulnerabilities”, “2024–2025 alternatives”); a query-builder sketch follows this list.
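If you orchestrate the model through your own tool-calling code, you can generate those disconfirming queries yourself rather than trusting the model to. A minimal sketch; the templates simply generalise the examples in step 3, and `run each query through your search tool of choice` is left to you:

```python
# Build "null hypothesis" queries that try to DISCONFIRM the model's
# internal top answer, per step 3 of the protocol above.

def null_hypothesis_queries(internal_answer: str, year: str = "2025") -> list[str]:
    """Queries designed to surface evidence that `internal_answer` is obsolete."""
    return [
        f"Is {internal_answer} dead or unmaintained?",
        f"{internal_answer} security vulnerabilities",
        f"{internal_answer} alternatives {year}",
        f"{internal_answer} deprecated replacement",
    ]

# Example: auditing the report's "Carbonyl" recommendation.
for query in null_hypothesis_queries("Carbonyl"):
    print(query)  # feed each query into your web-search tool
```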
---
## Step-by-step: the Freshness Audit workflow
1. **Identify the internal top answer**: what the model would answer from its weights alone.
2. **Run a Null Hypothesis search**: queries targeted at disconfirming that internal answer.
3. **Produce a Delta Report**: structured output the model must return before answering normally (a code sketch follows the template below).
### Delta Report template (required)
- **Internal Assumption:** `[what the model "thinks"]`
- **Live Verification:** `[search results summary]`
- **The Delta:** `[the difference, plus the action: accept, replace, or flag as a risk]`
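If you want the Delta Report as machine-checkable output rather than prose, a small schema helps. A hedged sketch: the field names mirror the template above, the `Verdict` labels are my own, and the example values are illustrative, not verified claims about Carbonyl:

```python
from dataclasses import dataclass
from enum import Enum

class Verdict(str, Enum):
    CONFIRMED = "confirmed"     # live web agrees with internal weights
    SUPERSEDED = "superseded"   # a newer answer replaces the internal one
    RISK = "risk"               # internal answer now carries known problems

@dataclass
class DeltaReport:
    internal_assumption: str  # what the model "thinks" from its weights
    live_verification: str    # summary of the null-hypothesis search results
    delta: str                # the difference between the two
    verdict: Verdict

# Illustrative values only, shaped like the article's Carbonyl scenario.
report = DeltaReport(
    internal_assumption="Carbonyl is the best-in-class terminal browser",
    live_verification="Null-hypothesis search summary goes here",
    delta="Internal answer may be stale; surface the risk to the user",
    verdict=Verdict.RISK,
)
print(report)
```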
---
## Plug-and-play prompt (use in system prompt or per-request)
> *Add the following to your system or user prompt to force a freshness audit.*
```text
CRITICAL PROTOCOL: You operate under the assumption that your internal training data is potentially obsolete.
Role: The Freshness Auditor. You are a sceptic. Your job is NOT to answer the question immediately, but to audit your own internal assumptions against the live web.
Mandatory Action:
- Identify your internal "top answer" for this query.
- Perform a Null Hypothesis search specifically looking for evidence that this answer is outdated (e.g., "Is [Tool X] dead?", "[Tool X] vs [New Tool Y] 2025").
- You MUST output a Delta Report confirming or refuting your internal weights BEFORE proceeding to construct an answer.
```
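To wire the protocol into an actual call, prepend it to your system prompt. A minimal sketch using the OpenAI Python SDK purely as one example; any chat-style API works, the model name is illustrative, and the model still needs a web-search tool available for the audit to mean anything:

```python
# Minimal sketch: injecting the freshness-audit protocol into a chat call.
from openai import OpenAI

# Paste the full protocol text from the block above here.
FRESHNESS_PROTOCOL = """\
CRITICAL PROTOCOL: You operate under the assumption that your internal
training data is potentially obsolete. Identify your internal "top answer",
run a Null Hypothesis search against it, and output a Delta Report
confirming or refuting your internal weights BEFORE answering.
"""

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative; pick a model with web-search tooling
    messages=[
        {"role": "system", "content": FRESHNESS_PROTOCOL},
        {
            "role": "user",
            "content": "What tools render browser-based HTML in a terminal TUI?",
        },
    ],
)
print(response.choices[0].message.content)
```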