Paul Breuler paulbreuler

flowchart TD
    A[create-feature-plan] --> B["plan.md + agents/*.agent.md created"]
    B --> C[copy agent file to Claude]
    C --> D[agent implements]
    D --> E[PR merged]
    E --> F["just work: auto-detect plan, assess status, suggest next task"]
    F --> G{cleanup needed?}
    G -->|yes| H["just heal (auto-fix completed agents)"]

| Command | Purpose |
| --- | --- |
| `work` | Auto-detect plan from last PR, assess agent status, suggest next task |
| `heal` | Auto-move completed agents to `completed/`, detect stuck agents |
| `run-agent --auto` | Run next agent in auto-detected plan |
| `assess-agents` | Check completion status across all agents |
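
The "auto-move completed agents" step of `heal` could look roughly like the sketch below. It is an illustration, not the actual recipe. The `DONE_MARKER` string, the `heal` function name, and the flat `agents/` layout are all assumptions made for the example. Stuck-agent detection is not covered.

```python
from pathlib import Path
import shutil

# Assumption: an agent file counts as completed when it contains this marker.
# The real completion signal used by `heal` is not specified in the source.
DONE_MARKER = "status: completed"


def heal(agents_dir: Path) -> list[Path]:
    """Move agent files containing DONE_MARKER into agents_dir/completed/."""
    completed_dir = agents_dir / "completed"
    completed_dir.mkdir(exist_ok=True)

    moved = []
    # glob() is non-recursive, so files already in completed/ are not revisited.
    for agent_file in sorted(agents_dir.glob("*.agent.md")):
        text = agent_file.read_text(encoding="utf-8")
        if DONE_MARKER in text.lower():
            target = completed_dir / agent_file.name
            shutil.move(str(agent_file), str(target))
            moved.append(target)
    return moved
```

Calling `heal(Path("agents"))` returns the new paths of the files it moved, so a wrapper command can report what changed.
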

| Phase | Verbosity | Purpose |
| --- | --- | --- |
| Planning (`plan.md`) | Verbose | Figure things out, iterate, full specs |
| Execution (`*.agent.md`) | Minimal | Distilled context for agent, ~200-400 lines |

| Metric | Count |
| --- | --- |
| Pull Requests Merged | 14 |
| Commits | 100+ |
| Lines Changed | 30,000+ |
| Files Changed | 200+ |
| Copilot Review Comments | 200+ |

| Gap | Postman Finding | runi Solution |
| --- | --- | --- |
| Contract testing | Only 17% do it | Continuous drift detection against bound specs |
| Semantic versioning | Only 26% use it | Temporal awareness with version history & diffs |
| AI agent security | 51% cite it as #1 concern | AI verification validates LLM output before execution |
| Documentation scatter | 55% struggle with inconsistency | Single source of truth in Git-friendly YAML |
| API discovery | 34% can't find existing APIs | Semantic links map cross-API relationships |
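
To make "drift detection against bound specs" concrete, here is a minimal sketch of the general idea: compare an observed response against the field types a spec declares and report the differences. It is not runi's implementation. The `detect_drift` name and the dict-based spec format are hypothetical, chosen only to keep the example self-contained.

```python
def detect_drift(spec: dict[str, type], response: dict[str, object]) -> list[str]:
    """Return human-readable findings where a response deviates from the spec.

    `spec` maps field names to expected Python types (a hypothetical,
    simplified stand-in for a real API specification).
    """
    findings = []

    # Fields the spec declares: check presence, then type.
    for field, expected in spec.items():
        if field not in response:
            findings.append(f"missing field: {field}")
        elif not isinstance(response[field], expected):
            actual = type(response[field]).__name__
            findings.append(
                f"type mismatch: {field} expected {expected.__name__}, got {actual}"
            )

    # Fields the response returns that the spec never mentions.
    for field in sorted(response.keys() - spec.keys()):
        findings.append(f"undocumented field: {field}")

    return findings
```

An empty list means the response matches the spec. Running a check like this on every captured response, instead of only in a dedicated test suite, is what would make the detection continuous.
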

| Security Concern | % of Developers |
| --- | --- |
| Unauthorized/excessive API calls from AI agents | 51% |
| AI systems accessing sensitive data | 49% |
| AI systems leaking API credentials | 46% |

| Testing Type | Adoption |
| --- | --- |
| Functional testing | 67% |
| Integration testing | 67% |
| Performance testing | 57% |
| Contract testing | 17% |

| MCP Status | % of Developers |
| --- | --- |
| Use MCP regularly | 10% |
| Plan to explore it | 24% |
| Used occasionally for experiments | 19% |
| Evaluated but chose not to implement | 7% |
| Not familiar with MCP | 31% |

| Model | Hallucination Rate | Release |
| --- | --- | --- |
| o3-mini | 14.8% | 2025 |
| o1 | 16% | 2024 |
| o3 | 33% | 2025 |
| o4-mini | 48% | 2025 |