Microsoft AI Dev Days Talk: The Art of Intelligent Interruption - Human-in-the-Loop AI Design

The Art of Intelligent Interruption

Designing AI Scripts That Know When to Ask for Help

Microsoft AI Dev Days | 15-20 minutes


Hook (1 minute)

"The best assistant isn't the one who does everything silently. It's the one who knows exactly when to tap you on the shoulder."

Opening Question: Have you ever had a junior developer who just... did things? Deployed to production without asking? Made architectural decisions while you were at lunch?

That's what most AI scripts do today. They either:

  • Ask permission for EVERYTHING (death by a thousand confirmations)
  • Ask permission for NOTHING (chaos mode)

Today's Promise: I'll show you how to build AI workflows that know the perfect moment to involve you.


Section 1: The Autonomy Spectrum (3 minutes)

The Problem with Binary Trust

Full Autonomy ←————————————→ Full Supervision
   "YOLO"                        "Are you sure?"
                                 "Are you REALLY sure?"

Neither extreme works:

  • Full autonomy: Catastrophic failures
  • Full supervision: Automation theater

The Real Question

Not "should AI be autonomous?" but "autonomous about WHAT?"

Key Insight: Different operations deserve different trust levels.

Trust Level   Example Operations                Human Involvement
High          Read files, search code, analyze  None
Medium        Write tests, format code          Summary only
Low           Delete files, modify configs      Explicit approval
Critical      Deploy, publish, external APIs    Interactive session
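
One way to make this table executable is a small dispatcher in the outer script that classifies an operation before the agent is allowed to run it. This is a minimal sketch, not part of markdown-agent; the operation names and the classify_trust helper are illustrative.

#!/bin/bash
# Illustrative only: map an operation to a trust level and decide how much
# human involvement it requires before the agent proceeds.
classify_trust() {
    case "$1" in
        read:*|search:*|analyze:*)  echo "high" ;;      # no human needed
        write:tests/*|format:*)     echo "medium" ;;    # summary only
        delete:*|write:config/*)    echo "low" ;;       # explicit approval
        deploy:*|publish:*|api:*)   echo "critical" ;;  # interactive session
        *)                          echo "low" ;;       # unknown -> be cautious
    esac
}

level=$(classify_trust "$1")
case "$level" in
    high)     echo "Proceeding silently." ;;
    medium)   echo "Proceeding; summary to follow." ;;
    low)      read -p "Approve '$1'? [y/n]: " ok; [[ $ok == "y" ]] || exit 1 ;;
    critical) echo "Opening interactive session for '$1'..." ;;
esac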

Section 2: Marker Patterns for Escalation (4 minutes)

Demo: The Escalation Signal System

Concept: AI outputs structured markers that outer scripts detect and act on.

<!-- ESCALATE:HUMAN_DECISION -->
I found 3 possible approaches to implement this feature:
1. Add a new API endpoint (2 hours, breaking change)
2. Extend existing endpoint (4 hours, backward compatible)
3. Use webhooks (1 hour, requires client changes)

Each has tradeoffs I can't evaluate without knowing your priorities.
<!-- /ESCALATE -->

The Outer Script Pattern

#!/bin/bash
# Fast agent runs first
result=$(ma quick-analysis.copilot.md --file "$1")

# Check for escalation markers
if echo "$result" | grep -q "ESCALATE:HUMAN_DECISION"; then
    echo "Agent needs your input:"
    echo "$result" | sed -n '/ESCALATE/,/\/ESCALATE/p'
    read -p "Your choice (1/2/3): " choice

    # Continue with enriched context
    ma continue-with-choice.claude.md --choice "$choice" --context "$result"
fi

Marker Types

Marker                      Meaning                Script Response
ESCALATE:CONFIDENCE_LOW     Model uncertain        Escalate to stronger model
ESCALATE:HUMAN_DECISION     Business logic choice  Interactive prompt
ESCALATE:PERMISSION_NEEDED  Dangerous operation    Require explicit approval
ESCALATE:CONTEXT_MISSING    Need more info         Gather additional context
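
Extending the outer script above to cover all four markers might look like the sketch below; the continuation scripts (quick-analysis.claude.md, continue-approved.claude.md, gather-context.copilot.md) are placeholder names, not real files.

#!/bin/bash
# Illustrative dispatcher: route each ESCALATE marker to the response in the table.
result=$(ma quick-analysis.copilot.md --file "$1")

case "$result" in
    *"ESCALATE:CONFIDENCE_LOW"*)
        # Escalate to a stronger model, passing the first attempt along
        ma quick-analysis.claude.md --file "$1" --prior-analysis "$result" ;;
    *"ESCALATE:HUMAN_DECISION"*)
        echo "$result"
        read -p "Your decision: " decision
        ma continue-with-choice.claude.md --choice "$decision" --context "$result" ;;
    *"ESCALATE:PERMISSION_NEEDED"*)
        read -p "Approve this operation? [y/n]: " ok
        [[ $ok == "y" ]] && ma continue-approved.claude.md --context "$result" ;;
    *"ESCALATE:CONTEXT_MISSING"*)
        # Gather more context and re-run
        ma gather-context.copilot.md --file "$1" ;;
    *)
        echo "$result" ;;
esac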

Section 3: The Permission Boundary Pattern (4 minutes)

Demo: Graduated Permissions

The Insight: Define what's allowed in the markdown frontmatter itself.

---
command: copilot
permissions:
  allow:
    - read:src/**
    - write:src/tests/**
  deny:
    - write:src/config/**
    - delete:*
  escalate:
    - write:src/api/**
    - modify:package.json
---

Analyze the codebase and suggest improvements.
Implement any test improvements directly.
For API changes, describe what you'd do and wait for approval.
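
How the wrapper enforces those lists is up to the outer script. A minimal sketch checks each requested operation against deny, then escalate, then allow, and default-denies anything unlisted; the hard-coded pattern arrays and the check_permission helper are illustrative, not markdown-agent's actual mechanism.

#!/bin/bash
# Illustrative boundary check: deny wins, then escalate, then allow.
# Patterns mirror the frontmatter above; in practice they would be parsed from it.
DENY=("write:src/config/**" "delete:*")
ESCALATE=("write:src/api/**" "modify:package.json")
ALLOW=("read:src/**" "write:src/tests/**")

check_permission() {
    local op="$1" pat
    for pat in "${DENY[@]}";     do [[ $op == $pat ]] && { echo "deny";     return; }; done
    for pat in "${ESCALATE[@]}"; do [[ $op == $pat ]] && { echo "escalate"; return; }; done
    for pat in "${ALLOW[@]}";    do [[ $op == $pat ]] && { echo "allow";    return; }; done
    echo "deny"   # default-deny anything not listed
}

check_permission "write:src/tests/user.test.ts"   # -> allow
check_permission "write:src/api/users.ts"         # -> escalate
check_permission "write:src/config/database.ts"   # -> deny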

The Trust Ladder

1. New Script → Maximum Restrictions
        ↓
2. Proven Script → Relaxed Read Access
        ↓
3. Trusted Script → Write to Safe Zones
        ↓
4. Verified Script → Escalation-only for Danger Zones

Demo: Permission Violations as Signals

# Script tries something outside its permissions
# Instead of failing silently or crashing:

echo "PERMISSION_BOUNDARY: write:src/config/database.ts"
echo "Script wants to modify database configuration."
echo "Reason: 'Optimize connection pool settings'"
echo ""
echo "Options:"
echo "  [a]llow once  [t]rust for session  [d]eny  [v]iew change"

Section 4: Intelligent Interruption in Practice (4 minutes)

The Three Questions

Before any AI action, the script asks itself:

  1. Do I have permission? (Check boundaries)
  2. Am I confident? (Check certainty markers)
  3. Is this reversible? (Check operation type)
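
Combined into a single pre-flight gate, the three questions might look like this sketch, which reuses the check_permission helper from the boundary example above; the list of irreversible operations is an assumption.

# Illustrative pre-flight gate over the three questions.
preflight() {
    local op="$1" result="$2"

    # 1. Do I have permission?
    [[ $(check_permission "$op") == "allow" ]] || { echo "ESCALATE:PERMISSION_NEEDED"; return 1; }

    # 2. Am I confident? (the model flags its own uncertainty)
    [[ $result == *"ESCALATE:CONFIDENCE_LOW"* ]] && { echo "Escalating to a stronger model."; return 1; }

    # 3. Is this reversible? Irreversible operations always get a human.
    case "$op" in
        deploy:*|publish:*|delete:*)
            read -p "Irreversible operation '$op'. Proceed? [y/n]: " ok
            [[ $ok == "y" ]] || return 1 ;;
    esac
    return 0
}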

Demo: The Confidence Cascade

---
command: copilot
confidence-threshold: 0.7
escalate-to: claude
---

Analyze this error log and suggest fixes.
If you're less than 70% confident, output ESCALATE:CONFIDENCE_LOW.

Outer script:

result=$(ma analyze-error.copilot.md --log "$1")

if echo "$result" | grep -q "CONFIDENCE_LOW"; then
    echo "Copilot uncertain. Escalating to Claude..."
    ma analyze-error.claude.md --log "$1" --prior-analysis "$result"
else
    echo "$result"
fi

The Golden Rule of Interruption

Interrupt for decisions, not for status updates.

Bad Interruption:

"I'm now analyzing file 3 of 47. Continue? [y/n]"

Good Interruption:

"I found a security vulnerability in auth.ts (line 42).
This exposes user tokens. Fix options:
1. Immediate patch (may break sessions)
2. Graceful migration (4-hour implementation)
Which approach? [1/2]"

Section 5: Real-World Pattern: The Review Loop (3 minutes)

Demo: PR Review with Intelligent Escalation

---
command: copilot
model: gpt-4o
---

Review this PR diff:
@!`git diff main...HEAD`

Output your review in this format:
- APPROVE: if changes are safe and well-tested
- COMMENT: for style/minor issues (list them)
- ESCALATE:SECURITY if you see potential security issues
- ESCALATE:ARCHITECTURE if changes affect system design
- REQUEST_CHANGES: for bugs or missing tests (list them)

The orchestrating script:

#!/bin/bash
review=$(ma pr-review.copilot.md)

case "$review" in
    *"ESCALATE:SECURITY"*)
        echo "Security concern detected. Getting senior review..."
        ma security-review.claude.md --context "$review"
        ;;
    *"ESCALATE:ARCHITECTURE"*)
        echo "Architecture change detected."
        echo "$review"
        read -p "Proceed with detailed review? [y/n]: " proceed
        [[ $proceed == "y" ]] && ma architecture-review.claude.md
        ;;
    *"APPROVE"*)
        gh pr review --approve
        ;;
    *)
        echo "$review"
        ;;
esac

Section 6: Building Trust Over Time (2 minutes)

The Trust Journal Pattern

# Every successful operation builds trust
echo "$(date): copilot:pr-review:SUCCESS" >> ~/.ma/trust-journal

# Track escalation accuracy
# Did the human agree with the escalation?
echo "$(date): copilot:ESCALATE:SECURITY:VALIDATED" >> ~/.ma/trust-journal

Progressive Autonomy

Week 1:  Script asks about everything
Week 4:  Script handles routine, escalates edge cases
Week 12: Script has earned autonomy for its domain

The Goal: Earned autonomy, not granted autonomy.


Closing: The Philosophy of Collaborative AI (1 minute)

Three Takeaways

  1. Design for the interrupt - The interrupt IS the interface
  2. Trust is granular - Not "trust AI" but "trust THIS AI for THIS operation"
  3. Scripts should be honest - Better to escalate than guess

The Vision

Future AI workflows aren't about removing humans.
They're about involving humans at exactly the right moment,
with exactly the right context,
for exactly the right decision.

Final Thought

"The best AI assistant is like a great executive assistant: handles the routine brilliantly, and knows EXACTLY which decisions only you can make."


Demo Summary

Demo                   Time   Key Concept
Escalation Markers     2 min  ESCALATE:* pattern
Permission Boundaries  2 min  Frontmatter trust levels
Confidence Cascade     2 min  Fast agent → smart agent
PR Review Loop         3 min  Real-world integration

Total Demo Time: ~9 minutes woven throughout


Resources

  • markdown-agent (ma): github.com/johnlindquist/markdown-agent
  • GitHub Copilot CLI: github.com/github/gh-copilot
  • Example patterns: (link to gist collection)

Appendix: Quick Reference Card

Marker Types

ESCALATE:HUMAN_DECISION    - Need human choice
ESCALATE:CONFIDENCE_LOW    - Model uncertain
ESCALATE:PERMISSION_NEEDED - Outside boundaries
ESCALATE:CONTEXT_MISSING   - Need more info
ESCALATE:SECURITY          - Security concern
ESCALATE:ARCHITECTURE      - Design decision

Permission Syntax

permissions:
  allow: [pattern...]    # Allowed operations
  deny: [pattern...]     # Blocked operations
  escalate: [pattern...] # Require approval

The Three Questions

  1. Do I have permission?
  2. Am I confident?
  3. Is this reversible?