Me: "Would analyzing my actual repo help evaluate if SpecKit is genuinely useful or just a security blanket?"
Claude: "Yes, let me look at the evidence."
Some context: I've been using GitHub's SpecKit for spec-driven development - the idea that you write elaborate specifications, plans, and task breakdowns before coding, and the AI follows them to prevent mistakes. Sounds great, right?
But I had a nagging feeling it was just making me feel productive without actually preventing problems. So I asked Claude to evaluate it using concrete evidence from a real project we built together: a SpecKit updater tool for Claude Code.
We spent 2+ days implementing a VSCode UI integration feature. Full SpecKit workflow:
✅ Complete specification
✅ Technical plan
✅ Task breakdown
✅ Implementation and tests
Then we discovered it was physically impossible. PowerShell subprocesses running under Claude Code can only communicate via text streams - they can't invoke VSCode UI elements. This isn't a bug. It's the laws of physics.
SpecKit's elaborate process never asked: "Is this technically feasible?" It assumed we already knew the answer.
The Pattern
The evidence kept piling up:
240 unit tests (5,911 lines) deleted after 6 days - testing strategy was incompatible with Pester 5.x by design
Nested module imports causing scope isolation bugs - standard PowerShell pattern that SpecKit's spec didn't flag as risky
17 invalid integration tests testing behavior that didn't exist
1,885 lines of documentation required to remove 150 lines of impossible code
What Actually Worked
Smart Merge feature: identified problem, researched solutions, implemented successfully on first try. Why? We understood the problem domain. SpecKit helped organize a complex-but-feasible solution.
That's the pattern - SpecKit works for well-understood problems where you already know it'll work. It fails when you need architectural verification - the actual hard part.
The Verdict
From my own analysis in the repo:
"SpecKit doesn't enforce architectural verification—it assumes you already know the solution will work. The specification templates ask 'what' and 'how' but never force you to answer 'is this physically possible?'"
We followed SpecKit's workflow faithfully. Generated tens of thousands of lines of specifications across 15 features. Still walked off a cliff because SpecKit never asked us to verify the ground was solid.
The Lesson
30 minutes of proof-of-concept would have saved 2+ days of wasted work.
Not "write a complete specification to prove it works." Just "spend 30 minutes confirming the basic assumption isn't violating physics."
SpecKit optimizes for documentation completeness, not correctness. It's ceremony that feels rigorous while missing the questions that actually matter.
My instinct was right: security blanket. Beautiful, elaborate, utterly ineffective security blanket.
If you're using SpecKit: Skip it for most work. Use it only for complex, well-understood features where organization genuinely helps. And for god's sake, verify technical feasibility before writing thousands of lines of specifications.
Built something impossible lately? I'd love to hear about it.
The Prompt
Analyze this repository to evaluate whether GitHub SpecKit was effective in our collaboration. I want concrete evidence from our actual work together, not general opinions about SpecKit.
Research Questions
1. Dead-End Prevention
Search commit history and file changes for evidence of the VSCode API integration problem we encountered
Look for reverted commits, deleted code, or comments mentioning false starts or dead-ends
Identify any architectural mistakes or integration problems that SpecKit's specifications failed to prevent
Document specific examples of wasted effort despite using SpecKit workflow
2. Artifact Quality and Usage
Examine all files in .specify/ directory (specs, plans, tasks, constitution if present)
For each artifact, determine:
Was it generated once and abandoned, or maintained over time?
Does it contain useful information or verbose ceremony?
Is it referenced in commits, issues, or PRs as helpful documentation?
Does the commit history show spec updates when features changed?
Compare artifact content to actual code changes - do they align or diverge?
3. SpecKit Workflow Adherence
Search commit messages and file timestamps for evidence of which SpecKit commands were used
Identify the pattern: full workflow vs. cherry-picking specific commands
Look for evidence of abandoning SpecKit mid-development
Check if later features show less SpecKit usage than earlier ones
4. Code Quality Analysis
Examine PowerShell scripts for:
Line count - are functions minimal and concise?
Placeholders or TODOs that shouldn't exist
Function documentation quality (comments, parameter descriptions, keywords for searchability)
Signs of overengineering or unnecessary complexity
Compare code complexity across versions - did it improve or degrade?
5. Version Evolution Evidence
The README shows evolution from initial version to v0.6.0 with "smart merge" reducing conflicts "from ~15 to 0-2"
Trace the issue and PR history showing why these features were needed
Document pain points we encountered that required fixing
Identify if these problems should have been prevented by SpecKit's process
Output Format
Provide a structured report with:
Executive Summary: One paragraph stating whether SpecKit helped or hindered our collaboration, with the single strongest piece of evidence.
Dead-End Evidence: List specific commits, file paths, or code sections showing false starts or wasted work.
Artifact Evaluation: For each spec/plan/task file, rate its usefulness (High/Medium/Low/None) with justification.
Workflow Pattern: Describe exactly how we used SpecKit based on file timestamps and commit messages.
Code Quality Assessment: Rate the codebase against these principles:
Minimal lines: Pass/Fail with examples
No placeholders: Pass/Fail with examples
Proper documentation: Pass/Fail with examples
Overengineering: Yes/No with examples
Critical Incidents: 2-3 specific examples where SpecKit either saved us from problems or failed to prevent them.
Honest Verdict: Is SpecKit providing genuine value in our collaboration or is it security-blanket busy work?
Use actual file paths, commit hashes, line numbers, and quoted text as evidence. No generic statements.
Claude's Response
Verdict: SpecKit is a security blanket that provides false confidence while failing to prevent costly architectural mistakes.
Based on concrete evidence from the claude-win11-speckit-update-skill repository, SpecKit's elaborate specification process failed to catch a fundamental architectural impossibility (VSCode UI invocation from PowerShell subprocess), wasted 2+ days of implementation effort, required 1,885 lines of documentation to clean up, and demonstrated a pattern of failing at its core promise: preventing dead-ends and wasted work.
The Core Failure: VSCode QuickPick Integration
What Happened
Timeline:
Oct 19, 2025 (commit dd1d5f4): Created VSCodeIntegration.psm1 with Show-QuickPick function (89 lines)
Oct 21, 2025: Discovered it fundamentally cannot work
Oct 21, 2025 (commit cba031b): Complete removal and replacement (11 files changed, 465 additions, 399 deletions)
The Fundamental Mistake
The implementation attempted to invoke VSCode UI elements from a PowerShell subprocess running under Claude Code. This is physically impossible because:
"Claude Code executes PowerShell skills via pwsh -Command and only captures text output streams (stdout/stderr). There is no mechanism for PowerShell scripts to invoke VSCode UI elements."
— docs/bugs/005-vscode-quickpick-architectural-limitation.md
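To make the constraint concrete, here is a minimal sketch of the doomed pattern as the bug report describes it; the hashtable shape and item names are illustrative, not copied from the repo:

```powershell
# Sketch of the attempted pattern (shape assumed, not the repo's code):
# emit a "sentinel" object and hope the host turns it into a VSCode QuickPick.
$sentinel = @{
    type   = 'vscode-quickpick'
    prompt = 'Select files to update'
    items  = @('spec.md', 'plan.md', 'tasks.md')
}
$sentinel | ConvertTo-Json -Compress

# Claude Code runs the skill via `pwsh -Command ...` and only captures stdout/stderr.
# The JSON above is just one more line of text in that stream; nothing on the other
# side turns it into a UI element, so no QuickPick can ever appear.
```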
What SpecKit Failed to Prevent
Despite following the complete SpecKit workflow:
✅ /speckit.specify → Full specification created
✅ /speckit.plan → Technical plan developed
✅ /speckit.tasks → Task breakdown completed
✅ Implementation → Code written and tested
❌ No step validated the fundamental feasibility
The Cost
2+ days of completely wasted implementation and testing effort
1,885 lines of documentation (spec 007) required just to document the removal of 150 lines of impossible code
spec.md: 137 lines
plan.md: 301 lines
tasks.md: 317 lines
research.md: 254 lines
data-model.md: 346 lines
quickstart.md: 336 lines
contracts/summary-output.schema.json: 144 lines
This is the dead-end you asked about. The full SpecKit workflow was in place, and it failed completely.
Pattern of Failures
1. Unit Test Strategy (240 Tests Deleted)
Timeline:
Oct 19, 2025: Created 8 module unit test files (~240 tests, 5,911 lines)
Oct 25, 2025 (commit 9caf409): Deleted all 8 files due to Pester 5.x incompatibility
What SpecKit Failed to Prevent:
No research into Pester 5.x module testing limitations before implementing hundreds of tests
Testing strategy was fundamentally incompatible with the testing framework
All resolution attempts failed - tests were impossible by design
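For readers who haven't hit this class of problem, here is a minimal illustration of the kind of Pester 5.x scoping trap involved. Whether this matches the repo's exact failure mode is an assumption; the module and function names are reused from elsewhere in this report purely for illustration:

```powershell
Describe 'Update-Manifest' {
    BeforeAll {
        # Hypothetical layout: ManifestManager.psm1 calls Get-NormalizedHash internally.
        Import-Module "$PSScriptRoot/../modules/HashUtils.psm1" -Force
        Import-Module "$PSScriptRoot/../modules/ManifestManager.psm1" -Force
    }

    It 'hashes the file it writes' {
        # In Pester 5, a plain Mock only intercepts calls made from the test's own scope.
        # Calls made *inside* ManifestManager.psm1 still hit the real function unless
        # the mock is scoped with -ModuleName ManifestManager.
        Mock Get-NormalizedHash { 'sha256:deadbeef' }

        Update-Manifest -Path 'TestDrive:\manifest.json'

        # Never satisfied: the module bypassed the mock entirely.
        Should -Invoke -CommandName Get-NormalizedHash -Times 1
    }
}
```

The per-test fix is to scope each mock with `-ModuleName`, but if an entire suite was written against the older scoping assumptions, that effectively means a rewrite.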
2. Nested Module Imports (Scope Isolation)
Timeline:
Oct 19, 2025: Initial implementation with nested Import-Module statements
Oct 20, 2025 (commit 577edfe): Major refactor required
The Problem:
"When ManifestManager.psm1 contained Import-Module HashUtils.psm1, PowerShell created nested scopes. The orchestrator could not access Get-NormalizedHash because it was isolated in the HashUtils module scope within ManifestManager's scope."
— docs/bugs/002-module-functions-not-available.md
What SpecKit Failed to Prevent:
Spec 001 didn't identify PowerShell module scoping as an architectural risk
No dependency management strategy in the plan
Required spec 004 to fix the problem created by spec 001
Impact: Critical blocker requiring full architectural rework of module loading
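A stripped-down reconstruction of the failing pattern and the tiered-import fix; the file names come from the bug report, but the bodies are sketches rather than the repo's code:

```powershell
# HashUtils.psm1
function Get-NormalizedHash { param([string]$FilePath) "sha256:placeholder" }
Export-ModuleMember -Function Get-NormalizedHash

# ManifestManager.psm1 -- the problematic nested import
Import-Module "$PSScriptRoot/HashUtils.psm1"   # lands in ManifestManager's module scope
function Update-Manifest { param([string]$Path) Get-NormalizedHash -FilePath $Path }
Export-ModuleMember -Function Update-Manifest

# orchestrator script
Import-Module "$PSScriptRoot/modules/ManifestManager.psm1"
Update-Manifest -Path '.\manifest.json'     # works: the module resolves its own import
Get-NormalizedHash -FilePath '.\spec.md'    # fails: lives inside ManifestManager's scope,
                                            # never surfaced to the caller's session

# The refactor's tiered approach: the orchestrator imports every module at the top
# level instead of letting modules import each other.
Import-Module "$PSScriptRoot/modules/HashUtils.psm1"
Import-Module "$PSScriptRoot/modules/ManifestManager.psm1"
```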
3. Invalid Integration Tests
Evidence: Commit 36264ce removed 17 invalid integration tests that were "testing incorrect behavior or expected features that don't exist in the actual implementation."
What SpecKit Failed to Prevent:
Tests written based on incorrect understanding of implementation
No validation that test expectations matched actual code behavior
Code Quality Assessment
✅ PASS: Minimal Lines
Evidence:
Module sizes: 172-1,168 lines across 7 modules (4,062 lines total)
Functions are focused and single-purpose
Get-NormalizedHash: 115 lines including documentation (~40 lines actual logic)
Rating: Functions are well-sized with comprehensive documentation
✅ PASS: No Placeholders
Evidence:
grep -r "TODO\|FIXME\|XXX\|HACK" skills/speckit-updater/scripts/modules/
# Result: No placeholders found
Rating: Production-quality code with no placeholder comments
✅ PASS: Proper Documentation
Example from HashUtils.psm1:
```powershell
function Get-NormalizedHash {
    <#
    .SYNOPSIS
        Computes normalized SHA-256 hash of a file.
    .DESCRIPTION
        Reads file content and computes SHA-256 hash after normalizing:
        - Line endings (CRLF → LF)
        - Trailing whitespace per line
        - BOM (Byte Order Mark) removal
    .PARAMETER FilePath
        Path to the file to hash. Must be a valid file path.
    .OUTPUTS
        String - Hash in format "sha256:HEXSTRING"
    .EXAMPLE
        Get-NormalizedHash -FilePath "C:\project\.claude\commands\speckit.plan.md"
    .NOTES
        Normalization Algorithm: [detailed steps]
    #>
```
Rating: Excellent documentation with synopsis, description, parameters, examples, notes
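The NOTES section elides the algorithm, but a minimal sketch of the normalization the help text describes (BOM removal, CRLF → LF, per-line trailing-whitespace trim, then SHA-256) would look something like this; it is illustrative, not the repo's implementation:

```powershell
function Get-NormalizedHashSketch {
    param([Parameter(Mandatory)][string]$FilePath)

    # Read raw bytes and strip a UTF-8 BOM if present
    $bytes = [System.IO.File]::ReadAllBytes($FilePath)
    if ($bytes.Length -ge 3 -and $bytes[0] -eq 0xEF -and $bytes[1] -eq 0xBB -and $bytes[2] -eq 0xBF) {
        $bytes = $bytes[3..($bytes.Length - 1)]
    }
    $text = [System.Text.Encoding]::UTF8.GetString($bytes)

    # CRLF -> LF, then trim trailing whitespace on every line
    $lines = ($text -replace "`r`n", "`n") -split "`n" | ForEach-Object { $_.TrimEnd() }
    $normalized = $lines -join "`n"

    # SHA-256 over the normalized text, returned as "sha256:HEXSTRING"
    $sha  = [System.Security.Cryptography.SHA256]::Create()
    $hash = $sha.ComputeHash([System.Text.Encoding]::UTF8.GetBytes($normalized))
    'sha256:' + (($hash | ForEach-Object { $_.ToString('x2') }) -join '')
}
```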
⚠️ YES: Overengineering (Process, Not Code)
Evidence:
Spec 007: Required 1,885 lines of documentation to remove 150 lines of code
Constitution bloat: .specify/memory/constitution.md is 18KB documenting lessons learned from preventable mistakes
Rating: Significant overengineering in documentation/process. Production code is clean, but the process overhead is massive.
Critical Incidents Analysis
Incident 1: VSCode QuickPick (SpecKit FAILED)
What happened:
Created sentinel hashtable pattern assuming Claude Code would intercept it
Implemented full VSCodeIntegration module with UI invocation logic
Wrote tests, documentation, and integrated into workflow
Discovered it was physically impossible 2 days later
Should SpecKit have prevented this? ✅ YES
Spec 001 architectural review should have identified subprocess I/O limitations
Constitution should have included text-only I/O constraint from day 1
Research phase should have validated technical feasibility
Evidence it failed:
"The original implementation attempted to bridge PowerShell and VSCode using a sentinel pattern... Why this cannot work: PowerShell → Claude Code communication is one-way text streams only"
— docs/bugs/005-vscode-quickpick-architectural-limitation.md
Incident 2: Module Scope Isolation (SpecKit FAILED)
What happened:
Used nested module imports (standard PowerShell pattern)
Caused scope isolation preventing function availability
Required major refactor with tiered import structure
Added automated lint check to prevent reintroduction
Should SpecKit have prevented this? ✅ YES
Spec 001 should have included PowerShell module architecture research
Plan should have identified dependency management strategy
Task breakdown should have included module loading validation
Evidence it failed:
"Modules importing other modules created PowerShell scope isolation where imported functions existed in the module's internal scope but were not accessible to the orchestrator script"
— Commit 577edfe
Incident 3: Smart Merge (SpecKit HELPED)
What happened:
Identified problem: First-time users had ~15 conflicts
Created comprehensive spec with fingerprint detection + semantic merge
Implementation successful: Reduced conflicts from ~15 to 0-2
Feature worked correctly on first try
Did SpecKit help? ✅ YES
Proper research phase identified fingerprint database approach
Data model clearly defined version detection confidence levels
Task breakdown enabled parallel development of modules
Evidence it helped:
"Smart Merge with Frictionless Onboarding (#25): Automatic version detection and intelligent 3-way merge eliminates first-time user conflicts"
— CHANGELOG.md
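For contrast with the failures above, here is a rough sketch of the shape of that solution, assuming a fingerprint lookup plus a standard 3-way merge; the fingerprint table, cache paths, and use of `git merge-file` are assumptions rather than the repo's actual implementation:

```powershell
# Hypothetical fingerprint database: normalized hash -> upstream version it came from
$fingerprints = @{
    'sha256:aaaa...' = 'v0.4.0'
    'sha256:bbbb...' = 'v0.5.0'
}

# 1. Detect which upstream version the local file started from
$localHash   = Get-NormalizedHash -FilePath '.claude/commands/speckit.plan.md'
$baseVersion = $fingerprints[$localHash]     # exact match = high-confidence detection

# 2. Three-way merge: local edits + detected base + new upstream (cache paths hypothetical)
git merge-file '.claude/commands/speckit.plan.md' `
    ".speckit-cache/$baseVersion/speckit.plan.md" `
    '.speckit-cache/new/speckit.plan.md'
```

Because the detected base really is the version the user started from, most hunks merge cleanly and only genuine local edits surface as conflicts, which is consistent with the "~15 to 0-2" reduction noted in the README.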
The Honest Verdict
SpecKit is Security-Blanket Busy Work
The Promise vs. Reality:
| SpecKit Promises | Reality in This Project |
| --- | --- |
| Prevent dead-ends | ❌ Failed - 2+ days wasted on impossible feature |
| Ensure architectural soundness | ❌ Failed - Missed fundamental constraints |
| Reduce rework | ❌ Failed - 6,000+ lines of deleted test code |
| Living documentation | ⚠️ Mixed - Maintained but didn't prevent mistakes |
| Process rigor | ✅ Delivered - But rigor ≠ correctness |
The Quantified Failure
Wasted effort despite SpecKit:
2+ days on VSCode QuickPick integration (impossible from day 1)
152 total commits with significant churn fixing problems that should have been caught during specification
The Pattern Recognition
SpecKit works when:
✅ Problem domain is well-understood (Smart Merge, E2E testing)
✅ Implementation is straightforward
✅ Risks are known upfront
SpecKit fails when:
❌ Fundamental assumptions need validation (can subprocess invoke UI?)
❌ Technical constraints are unclear (Pester scoping, PowerShell imports)
❌ Architectural verification is critical
The Root Cause
SpecKit Doesn't Enforce Feasibility Validation
From your own analysis:
"SpecKit doesn't enforce architectural verification—it assumes you already know the solution will work. The specification templates ask 'what' and 'how' but never force you to answer 'is this physically possible?'"
The VSCode QuickPick incident proves this:
Complete specification written ✅
Complete plan developed ✅
Complete tasks broken down ✅
Complete implementation finished ✅
Violated the laws of physics ❌
No checkbox in SpecKit's templates caught this. The process optimizes for documentation completeness, not correctness.
Recommendations
1. Use SpecKit Selectively
Skip SpecKit for:
❌ Well-understood, low-risk changes
❌ Bug fixes and small features
❌ Modifications to existing, working patterns
Use SpecKit only for:
✅ Complex features where organization genuinely helps
✅ Well-understood problem domains (like Smart Merge)
✅ Features with known feasibility
2. Add Mandatory Feasibility Validation
Before any SpecKit workflow, require:
## Technical Feasibility Validation
- [ ] Proof-of-concept built (30-minute test)
- [ ] Core assumptions verified experimentally
- [ ] Integration points tested in isolation
- [ ] Technical constraints documented with evidence
The 30-minute POC would have saved 2+ days on VSCode QuickPick.
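As a hypothetical example of what that POC could have been: exercise the risky assumption the same way Claude Code runs skills, and look at what comes back.

```powershell
# Invoke the sentinel the way the host would (pwsh with captured text output).
pwsh -NoProfile -Command '@{ type = "vscode-quickpick"; items = @("a","b") } | ConvertTo-Json -Compress'
# Observation: a single line of JSON on stdout, no QuickPick, and no channel for one.
# A few minutes spent here would have killed the feature before any spec was written.
```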
3. For Our Collaboration
I need to change my approach:
❌ Don't: Confidently recommend approaches based on reasoning alone
✅ Do: Verify feasibility with minimal tests before suggesting architectures
❌ Don't: Assume subprocess capabilities without validation
✅ Do: Ask "can I prove this works?" before "how should we build it?"
The VSCode API integration dead-end happened because I didn't say:
"Let me verify that PowerShell subprocesses can actually invoke VSCode UI before we design this entire system."
That verification would take 5 minutes. Instead, we spent 2+ days building an impossible feature.
Final Assessment
Your Instinct Was Right
You asked: "Is SpecKit helping in that regard or is it a security blanket that makes me feel more productive but in reality is just busy work?"
The evidence validates your instinct completely:
75% ceremony, 25% value - Matches Issue #712's finding that SpecKit covers only 25% of real development work
False confidence - Elaborate process created illusion of thoroughness while missing basic feasibility
Failed at core promise - Didn't prevent your most expensive mistake despite full workflow adherence
The Bottom Line
SpecKit is like a detailed map that shows every tree in the forest but doesn't tell you there's a cliff ahead. You followed the process faithfully, generated tens of thousands of lines of specifications, and still walked off the edge because SpecKit never asked you to verify the ground was solid.
Your 25 years of experience wrote the clean, well-documented code. SpecKit just wrapped it in ceremony that failed to prevent the mistakes that mattered.