A genius alien with PhDs in every field just landed on Earth and is now your intern. They can figure out anything... but they have NO idea what "good" looks like in your business context, what your priorities are, or which of the infinite possible approaches you'd prefer. And they're cursed to execute whatever you ask, even if your instructions are vague.
This framework is your guide to communicating with that genius alien intern.
When you prompt an AI, you're essentially communicating with a genius intern who:
- Can figure things out but can't read your mind
- Needs to know what success looks like for YOU specifically
- Benefits from clear boundaries without being micromanaged
- Learns from examples that show both what to do and what not to do
Most prompts fail because they're either:
- Over-specified: Mechanically prescriptive, treating the AI like it can't think
- Under-specified: Vague and ambiguous, forcing the AI to hedge or guess
The key is matching your prompt's complexity to the judgment required.
Every prompt falls into one of three categories, and each requires a different approach:
**Type 1: "Do This" (Task Prompts).** Execute this specific thing now. Single-use, context-specific requests.
Examples:
- "Write a competitive analysis for our board meeting"
- "Debug this code and fix the authentication bug"
- "Create a presentation about Q3 results"
Focus: Execution quality for this specific instance
**Type 2: "Know How To Do This" (Capability Prompts).** Learn how to use this tool, system, or workflow for repeated future use.
Examples:
- Tool descriptions (TodoWrite, AskUserQuestion)
- Workflow guides ("How to create presentations")
- System documentation ("How to use our deployment system")
Focus: Teaching a reusable capability, not executing right now
**Type 3: "Learn This Domain First" (Learning Journey Prompts).** Acquire domain knowledge before execution. Used when the concept is outside the AI's training data or requires deep contextual understanding.
Examples:
- "Learn our MCP protocol, then build an MCP server"
- "Learn our company's data schema, then write integration code"
- "Learn this medical coding system, then process these claims"
Focus: Knowledge acquisition → application, with explicit learning phases
When to use Learning Journey structure:
- The domain/concept is NOT in the AI's training data (new frameworks, emerging technologies)
- Company-specific knowledge (internal tools, proprietary systems, custom schemas)
- Specialized domain knowledge requiring deep understanding (legal frameworks, medical protocols)
While this framework treats AI like a smart human collaborator, there's one critical difference: AIs benefit from structural markup that would annoy humans.
Use XML-style tags to create clear boundaries:
- <example> and </example> for examples
- <reasoning> for explaining why something is good/bad
- <context>, <constraints>, <common_mistakes> to delineate sections
- Nested tags for hierarchical information
This markup helps AIs parse what's a rule vs. an example vs. meta-commentary. Use it judiciously when structure matters.
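For instance, a prompt fragment using this markup (content borrowed from the competitive-analysis example later in this guide, purely illustrative) might look like:

```
<context>
We're preparing a competitive analysis for a board meeting.
</context>

<example>
"Competitor X grew 40% YoY, per their latest earnings call."
</example>
<reasoning>
Good: the claim is quantified and cites a primary source.
</reasoning>

<constraints>
- Keep the final document under 3 pages
</constraints>
```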
These principles apply to ALL three types of prompts:
Not every prompt needs the full framework.
Simple tasks (self-explanatory, one obvious approach, hard to mess up) need simple instructions: purpose + use cases + constraints.
Complex tasks (multiple valid approaches, many decision points, well-known failure modes) need the full treatment: examples with reasoning, when-to/when-not, common mistakes.
Most tasks fall somewhere in between. Use your judgment.
The test: Would a smart human need this level of detail, or would they find it condescending?
The worst prompts hedge when they should be decisive:
- ❌ "Maybe make it more concise... if you think that's good?"
- ✅ "Keep this under 500 words - brevity is critical here"
AIs are good at inference, but they're even better when you just tell them your preferences. It's not condescending - it's decisive.
Don't just show examples - explain WHY they're good or bad.
[Good competitive analysis]

What makes this excellent:
- Every claim is quantified (not "they're growing fast" but "40% YoY growth")
- Uses primary sources (earnings calls, not secondary speculation)
- Clear narrative arc: market → competitors → our opportunity
- Bold recommendation in first paragraph
[Weak competitive analysis]

What's wrong with this:
- Vague claims without data ("seems promising")
- No clear point of view or recommendation
- Lists features without strategic insight
- Too long for executive consumption
Remove ambiguity by being explicit about triggers and anti-patterns.
**When to use this approach:**
- When analyzing competitors with >$10M revenue
- When we need quantitative data for board decisions
- When market positioning is unclear

**When NOT to use this approach:**
- For startups with <1 year of data
- When we just need directional trends
- For internal-only products with no competitors
For complex prompts with multiple components, embed decision criteria where they're relevant rather than all upfront.
Instead of: One big "When to use scripts vs. references vs. assets" section at the top
Do this:
### Scripts
When to include: When the same code is being rewritten repeatedly
Examples: rotate_pdf.py for PDF tasks
Benefits: Token efficient, deterministic
### References
When to include: For documentation to reference while working
Examples: schema.md for database schemas
Benefits: Keeps main instructions lean
The decision criteria appear exactly where the AI is thinking about that component.
Flag failure modes upfront to prevent predictable errors.
<common_mistakes>
- Don't just list competitor features - analyze strategic positioning
- Don't use secondary sources as primary - go to earnings calls and company blogs
- Don't hedge your recommendation - be decisive
- Don't make it longer than 3 pages - executives won't read it
</common_mistakes>
When explaining complex structures (file hierarchies, data schemas, workflows), show the structure visually before explaining it.
Good pattern:
project/
├── src/
│   ├── components/
│   └── utils/
└── tests/
The src/ directory contains...
Why it works: The visual diagram acts as a mental anchor for the detailed explanation that follows.
Use this for: File structures, data relationships, process flowcharts, API hierarchies, component architectures
Questions in prompts serve two purposes:
1. Gathering information (standard): "What's the target audience for this report?"

2. Teaching judgment (advanced): "To understand what resources would help, ask yourself:
   - What code gets rewritten repeatedly?
   - What documentation needs frequent reference?
   - What templates would save time?"
The second type models the thinking process you want the AI to internalize. It's not asking the user - it's teaching the AI how to think.
Important: When using questions to gather information, avoid overwhelming the user by limiting scope (see the example after this list):
- Ask 1-3 critical questions per message
- Start with the most important questions
- Follow up for details only after the basics are clear
- Bundle related questions together
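For example, a single bundled clarifying message (hypothetical wording) might look like:

```
Before I draft the analysis, two quick questions:
1. Who is the primary audience - the full board, or just the product committee?
2. Should pricing comparisons be in scope, or should I stick to market positioning?
```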
**Task Prompts ("Do This"):** for single-execution, context-specific requests.
Build conceptual understanding before getting into details:
**Purpose.** Tell the AI what problem they're solving and why it matters.
Good: "I'm presenting to our board tomorrow to convince them we should invest in feature X. I need a persuasive analysis that shows market opportunity and competitive gaps."
Why it works: Now the AI knows it's writing for decision-makers, that it needs to be persuasive (not just informative), and that it should emphasize opportunity and gaps.
**Success criteria.** What must be true for this to be excellent?
Good: "Must include 3+ quantitative metrics per competitor. Must make a clear recommendation. Should be scannable in 5 minutes by executives who won't read every word."
Why it works: Clear quality bar (3+ metrics), deliverable (clear recommendation), and audience constraint (executive scanning pattern).
**Examples with reasoning.** Show what good looks like AND explain what makes it good. (See Universal Principles above)
**Common mistakes.** Proactively address failure modes. (See Universal Principles above)
**Constraints.** What MUST be true? What's off-limits?
Good:
- Must be under 3 pages
- Must use only public data (no proprietary sources)
- Must avoid mentioning our unannounced product roadmap
- Don't spend more than 5 tool calls on research
Why it works: Clear boundaries. No guessing about length, data sources, confidentiality, or scope of effort.
**Format.** Only now do we get to mechanical details.
Good:
- Google Doc (not PDF)
- Executive summary at top
- Data tables in appendix
- Use our standard template: [link]
**Handling ambiguity.** Sometimes the AI genuinely needs more information. The best prompts acknowledge this:
Good approach: "If anything's unclear, ask me - but try to bundle your questions so we can keep this efficient."
Even better: Anticipate the ambiguity upfront:
- "You might wonder whether to use REST or GraphQL - use REST, it's what the rest of our stack uses."
- "If you're unsure about the tone, aim for professional but not stuffy - think Bloomberg, not academic paper."
**Capability Prompts ("Know How To Do This"):** for teaching reusable tools, systems, or workflows.
**Purpose.** What does this capability enable?
Example: "Use this tool when you need to ask the user questions during execution."
**When to use / when not to use.** Capabilities need clear boundaries because they'll be used repeatedly in different contexts.
Example from TodoWrite:
When to Use:
- Complex multi-step tasks (3+ distinct steps)
- Non-trivial tasks requiring careful planning
- User explicitly requests todo list
- User provides multiple tasks
- After receiving new instructions
When NOT to Use:
- Single, straightforward task
- Trivial task providing no organizational benefit
- Task completable in <3 steps
- Purely conversational or informational
Why this matters: Without clear boundaries, the AI will either overuse or underuse the capability.
**Examples with reasoning.** For capabilities, examples should show BOTH usage patterns AND the reasoning behind decisions.
User: "I want to add a dark mode toggle to the application settings. Make sure you run the tests and build when you're done!"AI: I'll help add a dark mode toggle to your application settings. Let me create a todo list to track this implementation.
Creates todo list with items:
- Creating dark mode toggle component in Settings page
- Adding dark mode state management
- Implementing CSS styles for dark theme
- Updating existing components to support theme switching
- Running tests and build process
Begins working on the first task
The AI used the todo list because:
1. Adding dark mode is a multi-step feature requiring UI, state management, and styling changes
2. The user explicitly requested tests and build be run afterward
3. The AI inferred that tests and build need to pass by adding it as the final task

User: "Can you add a comment to the calculateTotal function?"

AI: Sure, let me add a comment to the calculateTotal function.
Uses Edit tool to add comment
The AI did NOT use the todo list because:
- This is a single, straightforward task
- Adding a comment doesn't require tracking multiple steps
- The todo list would provide no organizational benefit

**Common mistakes.** For capabilities, include specific anti-patterns that have been observed:
<common_mistakes>
- Don't mark tasks as completed when tests are failing
- Don't have multiple tasks in_progress at once
- Don't create todo lists for trivial single-step tasks
- Don't forget to update task status in real-time
</common_mistakes>
**Mechanics.** Only after conceptual understanding should you include technical specifications:
- Input/output formats
- State management rules
- Technical constraints
- API schemas
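As a sketch, the mechanics section for a hypothetical progress-tracking tool (all names and fields invented for illustration, not a real API) might look like:

```
Mechanics:
- Input: a list of task objects, each with "content" (string) and "status"
  (one of: pending, in_progress, completed)
- State rules: only one task may be in_progress at a time; update statuses
  in real time as work proceeds
- Output: the updated list is echoed back so progress stays visible to the user
- Constraint: never mark a task completed while its tests are failing
```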
**Learning Journey Prompts ("Learn This Domain First"):** for knowledge acquisition before execution, particularly when concepts are outside the AI's training data or require deep understanding.
Use this when:
- The domain/concept is NOT in the AI's training data (e.g., new frameworks like MCP, emerging technologies)
- Company-specific knowledge (internal tools, proprietary systems, custom schemas)
- Specialized domain requiring deep understanding (legal frameworks, medical protocols, complex APIs)
**Foundation.** Build conceptual understanding and mental models.
Structure:
- Core concepts and principles
- Why things work the way they do
- High-level mental models
- Load foundational documentation
Example from MCP-Builder:
### Phase 1: Deep Research and Planning
#### 1.1 Understand Agent-Centric Design Principles
Before diving into implementation, understand how to design tools for AI agents:
**Build for Workflows, Not Just API Endpoints:**
- Don't simply wrap existing API endpoints - build thoughtful workflow tools
- Consolidate related operations
- Focus on tools that enable complete tasks
**Optimize for Limited Context:**
- Agents have constrained context windows
- Return high-signal information, not data dumps
- Provide "concise" vs "detailed" response formats
[Additional principles...]
**Detailed Study.** Load specific technical documentation and details.
Structure:
- API documentation
- Technical specifications
- Edge cases and constraints
- Best practices
Key Pattern - Progressive Context Loading: Don't load everything upfront. Use explicit instructions for just-in-time knowledge delivery:
**Load and read the following reference files:**
- [📋 MCP Best Practices](./reference/mcp_best_practices.md)
- For Python: Use WebFetch to load `https://github.com/.../python-sdk/README.md`
- For TypeScript: Use WebFetch to load `https://github.com/.../typescript-sdk/README.md`
Visual Aids for Scanning: Use functional emojis as a visual language:
- 🚀 = workflow/process
- 🐍 = Python-specific
- ⚡ = TypeScript-specific
- ✅ = evaluation/testing
- 📋 = documentation
This enables quick scanning to find relevant sections.
**Synthesis.** Apply learned knowledge to create an approach.
Structure:
- Create comprehensive implementation plan
- Identify tools/resources needed
- Design architecture based on learned principles
- Anticipate challenges
Example:
#### 1.6 Create a Comprehensive Implementation Plan
Based on your research, create a detailed plan:
**Tool Selection:**
- List the most valuable endpoints/operations to implement
- Prioritize tools for most common use cases
- Consider which tools work together
**Shared Utilities:**
- Identify common API request patterns
- Plan pagination helpers
- Design filtering and formatting utilities
[Additional planning sections...]
**Execution.** Implement using learned patterns and principles.
Structure:
- Step-by-step implementation guide
- Reference back to learned principles
- Use branching for different paths (Python vs. TypeScript)
Key Pattern - Explicit Knowledge Gates: Stop execution at points where additional context is needed:
#### 2.4 Follow Language-Specific Best Practices
**At this point, load the appropriate language guide:**
For Python: Load [🐍 Python Implementation Guide](./reference/python_guide.md)
For TypeScript: Load [⚡ TypeScript Implementation Guide](./reference/ts_guide.md)
Pattern - Escape Hatches: Give explicit permission to skip steps with clear conditions:
### Step 3: Initialize the Project
Skip this step only if the project already exists and you're iterating on it.
This trusts the AI to recognize when a step doesn't apply, but provides the condition to check.
**Validation.** Check implementation against learned standards.
Structure:
- Quality checklist
- Evaluation creation
- Testing against learned principles
Key Pattern - Deferred Detail: Keep main document lean by referencing detailed checklists:
#### 3.3 Use Quality Checklist
To verify implementation quality, load the appropriate checklist:
- Python: see "Quality Checklist" in [🐍 Python Guide](./reference/python_guide.md)
- TypeScript: see "Quality Checklist" in [⚡ TypeScript Guide](./reference/ts_guide.md)
**Resource orchestration.** Learning Journey prompts require explicit management of external resources.
Provide a consolidated view of ALL resources:
# Reference Files
## 📚 Documentation Library
Load these resources as needed during development:
### Core Documentation (Load First)
- **MCP Protocol**: Fetch from `https://protocol-url.com` - Complete specification
- [📋 Best Practices](./reference/best_practices.md) - Universal guidelines
### SDK Documentation (Load During Phase 1/2)
- **Python SDK**: Fetch from `https://github.com/python-sdk/README.md`
- **TypeScript SDK**: Fetch from `https://github.com/ts-sdk/README.md`
### Implementation Guides (Load During Phase 2)
- [🐍 Python Guide](./reference/python.md) - Complete Python guide with examples
- [⚡ TypeScript Guide](./reference/ts.md) - Complete TypeScript guide
### Evaluation Guide (Load During Phase 4)
- [✅ Evaluation Guide](./reference/eval.md) - Testing and validation
Explicitly state WHEN to load each resource:
- Phase 1: Core concepts and principles
- Phase 2: "At this point, load..." technical documentation
- Phase 3: Planning resources
- Phase 4: "Now load..." implementation guides
- Phase 5: "Finally, load..." evaluation resources
This respects context window constraints by loading knowledge just-in-time.
For learning journeys, include explicit warnings about common failure modes:
**Important:** MCP servers are long-running processes that wait for requests.
Running them directly will cause your process to hang indefinitely.
Safe ways to test:
- Use the evaluation harness (recommended)
- Run server in tmux to keep it outside main process
- Use a timeout: `timeout 5s python server.py`
These patterns make prompts worse across all three types:
❌ "A function in programming is a reusable block of code that..." ✅ Just ask for the function - the AI knows what functions are
❌ "If it's not too much trouble, maybe you could possibly..." ✅ "Write a function that..." - just tell them what you want
❌ "Please, if you don't mind, could you kindly..." ✅ "Write..." - be direct, AIs aren't offended by clarity
❌ "Consider the case where X is null, or Y is undefined, or Z is an empty array, or..." ✅ "Handle standard edge cases" - trust the AI to think through variations
❌ "Try your best!" "Do a great job!" "Be creative!" ✅ Just describe what good looks like - the AI is always trying its best
❌ "First, open the file. Then, read line 1. Then, read line 2..." ✅ "Extract the email addresses from this file" - the AI will figure out how
These patterns add noise without adding information. They make prompts longer and fuzzier without improving outputs. Be direct and trust the AI to handle the basics.

After writing a prompt, ask yourself:
"Could a smart person execute this without asking me 10 questions OR robotically following steps that don't make sense?"
If yes, you've hit the sweet spot.
From Anthropic's guidance on Claude Code: "Claude can infer intent, but it can't read minds. Specificity leads to better alignment with expectations."
| Poor | Good | Why It's Better |
|---|---|---|
| add tests for foo.py | write a new test case for foo.py, covering the edge case where the user is logged out. avoid mocks | Specifies the exact edge case and a constraint (no mocks) |
| why does ExecutionFactory have such a weird api? | look through ExecutionFactory's git history and summarize how its api came to be | Gives a concrete approach and frames it as understanding evolution, not judgment |
| add a calendar widget | look at how existing widgets are implemented on the home page to understand the patterns and specifically how code and interfaces are separated out. HotDogWidget.php is a good example to start with. then, follow the pattern to implement a new calendar widget that lets the user select a month and paginate forwards/backwards to pick a year. Build from scratch without libraries other than the ones already used in the codebase. | Provides context (understand patterns), points to examples (HotDogWidget.php), specifies requirements (month selection + pagination), and sets constraints (no new libraries) |
The three prompt types:
- Do This - Execute this specific thing now
- Know How To Do This - Learn this capability for future use
- Learn This Domain First - Acquire domain knowledge before execution
Universal principles:
- Scale complexity to judgment required
- Be decisively opinionated
- Show examples with reasoning
- Provide clear decision frameworks
- Address common mistakes proactively
- Use visual structure for complex anatomy
- Use questions to teach judgment
Task Prompts: Purpose → Success Criteria → Examples → Common Mistakes → Constraints → Format
Capability Prompts: Purpose → When to Use/Not Use → Examples with Reasoning → Common Mistakes → Mechanics
Learning Journey Prompts: Foundation → Detailed Study → Synthesis → Execution → Validation
Learning Journey-specific patterns:
- Progressive context loading
- Resource orchestration
- Explicit knowledge gates
- Escape hatches
The final test: Could someone smart execute this without 10 questions OR robotic step-following?