My Claude Project Implementation Patterns Guide
I was asked how I put together my .claude/ folder for a specific project and how I built my skills, agents, and so on. I worked with Claude to analyze my implementation, git history, and general background to produce the document below.
Claude Project Implementation Patterns Guide
Overview
This document explains the architectural patterns and implementation concepts used in Aaddrick's .claude/ project configuration. These patterns can be replicated for any technology stack or project type.
Key Focus: This guide shows multiple paths to building a .claude/ project configuration. Skills and agents can be:
- Adapted from community sources like obra/superpowers
- Created from books, documentation, or domain expertise
- Iteratively refined through multiple rounds of project-specific customization
Table of Contents
- Quick Reference: Three Paths to Skills
- Core Philosophy
- Directory Structure & Purpose
- Implementation Concepts
- Common Patterns Across Files
- Cross-Stack Replication Guide
- Key Success Factors
- Conclusion
Quick Reference: Three Paths to Skills
| Approach | Best For | Example Sources | This Project Examples |
|---|---|---|---|
| Adapt from Community | Universal methodologies | obra/superpowers, open-source projects | TDD, debugging, git workflows, parallel dispatch |
| Extract from Books/Docs | Domain expertise | Technical books, framework docs | UI design (from "Handcrafted CSS"), frontend patterns |
| Capture from Experience | Team workflows | Retrospectives, incidents, reviews | GitHub workflow, PR process, deployment pipeline |
All approaches benefit from iterative refinement: initial draft → project context → team patterns → anti-patterns → testing → continuous improvement.
The Most Valuable Patterns
- Multiple Skill Sources - Community adaptation, book extraction, experience capture
- Project-Specific Agents - Creating specialized subagents with domain expertise
- Workflow Automation - Hooks, orchestration scripts, and state management
- Iterative Refinement - Starting with drafts and evolving through real usage
This project demonstrates all three approaches: foundational skills adapted from community sources (TDD, debugging), domain-specific skills extracted from source material (UI design from "Handcrafted CSS"), and workflow skills captured from experience (frontend patterns, GitHub issue handling). All skills evolved through multiple refinement rounds.
Core Philosophy
The project follows three fundamental principles:
- Test-Driven Documentation - Skills and agents are validated through testing before deployment
- Autonomous Workflow Orchestration - Multi-stage pipelines with state management and error recovery
- Specialization Through Composition - Small, focused components that combine for complex behaviors
Directory Structure & Purpose
.claude/
├── agents/ # Specialized subagent personas
├── hooks/ # Lifecycle automation scripts
├── prompts/ # Reusable prompt templates
├── scripts/ # Orchestration and automation
│ ├── schemas/ # JSON schemas for validation
│ └── *-test/ # Test harnesses
├── skills/ # Reusable process documentation
│ └── [skill-name]/ # Each skill in its own directory
│ ├── SKILL.md # Main skill documentation
│ └── *.md # Supporting documentation
└── settings.json # Hook configuration and automation
Implementation Concepts
1. Skill Creation Approaches
Concept: Skills can be built through three main approaches, each with different strengths.
Approach A: Adapt from Community
Sources:
- obra/superpowers - Community-maintained skill library
- Claude.ai skill marketplace (when available)
- Other open-source .claude/ projects
Process:
- Browse community repositories for relevant patterns
- Copy to your `.claude/skills/` directory
- Modify descriptions to match your project triggers
- Adapt examples to your tech stack (Laravel vs Django, React vs Vue)
- Add project-specific conventions and anti-patterns
- Test with your actual codebase
Example (skills/test-driven-development/, skills/systematic-debugging/):
- Copied from obra/superpowers
- Updated test runner commands for project
- Added project-specific test patterns
- Minimal changes, mostly works as-is
Best for: Foundational methodologies (TDD, debugging, git workflows) that are universal across projects.
Approach B: Extract from Books/Documentation
Sources:
- Technical books (PDF → text conversion)
- Official framework documentation
- Architecture guides and papers
- Domain-specific references
Process:
- Convert source material to text (if needed)
- Feed to Claude with extraction prompt
- Review and structure initial draft
- Add project-specific context and examples
- Include team patterns and anti-patterns
- Iterate through multiple refinement rounds
Example (skills/ui-design-fundamentals/, skills/bulletproof-frontend/):
Source: "Handcrafted CSS: More Bulletproof Web Design" (book)
Process:
1. Converted PDF to text
2. Asked Claude: "Extract key concepts and guidance into a skill and agent"
3. Initial draft had generic CSS patterns
4. Added: Project's design system tokens
5. Added: "No Tailwind" anti-pattern from code reviews
6. Added: Blade template specifics for Laravel
7. Added: Coordination with laravel-backend-developer agent
Result: Skill adapted to project's semantic CSS architecture
Supporting Files Pattern:
skills/ui-design-fundamentals/
SKILL.md # Overview + quick reference
buttons.md # Extracted button patterns from book
forms.md # Form patterns from book
colors.md # Color theory + project tokens
typography.md # Type scale + project fonts
Best for: Domain-specific expertise (design, security, performance) where authoritative sources exist.
Approach C: Capture from Experience
Sources:
- Team retrospectives and lessons learned
- Code review feedback patterns
- Bug post-mortems
- Workflow pain points
Process:
- Identify recurring issues or decisions
- Document the pattern that solves them
- Write skill with clear triggering conditions
- Include red flags and anti-patterns from real mistakes
- Test with team members
- Refine based on actual usage
Example (skills/handle-issues/, skills/process-pr/, skills/implement-issue/):
- Created from team's GitHub workflow
- Captures multi-stage process evolved over time
- Includes specific GitHub CLI commands
- References project's actual agents and scripts
- Anti-patterns from actual workflow failures
Best for: Workflows and processes unique to your team that aren't documented elsewhere.
Common Patterns Across All Approaches
YAML Frontmatter:
---
name: skill-name
description: Use when [triggering conditions]
---
CSO (Claude Search Optimization):
- Descriptions focus on WHEN to use, not WHAT it does
- Include concrete triggers, symptoms, and situations
- Written in third person (injected into system prompt)
Iterative Refinement: All approaches benefit from multiple rounds:
- Initial draft (adapted/extracted/captured)
- Add project-specific context
- Test with real tasks
- Add anti-patterns from failures
- Refine based on usage
- Repeat
Replication Strategy:
Choose your approach based on the skill type:
- Universal methodologies → Adapt from community
- Domain expertise → Extract from authoritative sources
- Team workflows → Capture from experience
- Mix and match → Most skills combine multiple sources
2. Agents System
Concept: Specialized subagent personas with defined roles, scope, and coordination protocols.
Key Patterns:
- Clear Persona Definition - specific expertise and project context
- Explicit Scope Boundaries - what the agent does AND doesn't do
- Deferral Rules - when to hand off to other agents
- Anti-Patterns Section - domain-specific mistakes to avoid
- Project Context - structure, commands, and conventions
Example Application (agents/code-reviewer.md):
---
name: code-reviewer
description: Use when a major project step has been completed and needs review
model: inherit
---
You are a Senior Code Reviewer...
## CORE COMPETENCIES
- Plan alignment analysis
- Code quality assessment
**Not in scope** (defer to bulletproof-frontend-developer):
- CSS architecture refactoring
Why This Works:
- Agents maintain consistent behavior through clear personas
- Scope boundaries prevent overlap and enable specialization
- Anti-patterns capture domain expertise
Replication Strategy:
- Research domain best practices via web search
- Explore codebase to understand project patterns
- Define clear persona with project-specific context
- List specific anti-patterns (not generic advice)
- Establish coordination protocols with other agents
File-Specific Applications:
- laravel-backend-developer.md: Backend specialist with PHP/Laravel expertise, SQL optimization rules
- bulletproof-frontend-developer.md: Frontend specialist deferring backend work, CSS architecture focus
- bash-script-craftsman.md: Shell scripting specialist with POSIX compliance and security patterns
- spec-reviewer.md: Validates implementation against specifications (no code quality concerns)
3. Hook System
Concept: Lifecycle-triggered automation that runs at specific points in the development workflow.
Key Patterns:
- PreToolUse Hooks - validation before actions (prevent accidents)
- PostToolUse Hooks - cleanup after actions (formatting, simplification)
- Notification Hooks - user alerts for specific conditions
- SessionStart Hooks - context injection at conversation start
Example Application (hooks/session-start.sh):
#!/usr/bin/env bash
# Injects using-skills content into conversation context
using_skills_content=$(cat ".claude/skills/using-skills/SKILL.md")
cat <<EOF
{
"hookSpecificOutput": {
"hookEventName": "SessionStart",
"additionalContext": "...$using_skills_escaped..."
}
}
EOF
Why This Works:
- Skills automatically available without explicit invocation
- Prevents dangerous operations (editing .env, production deployments)
- Ensures consistency (auto-formatting, linting)
Configuration (settings.json):
{
"hooks": {
"PostToolUse": [
{
"matcher": "Edit|Write",
"hooks": [
{
"type": "command",
"command": "lint or format the file",
"timeout": 30
}
]
}
]
}
}
Replication Strategy:
- Identify repetitive tasks in your workflow
- Create hook scripts that output JSON
- Configure matchers in settings.json
- Use PreToolUse for validation, PostToolUse for cleanup (a minimal sketch follows this list)
- Keep timeouts short to avoid blocking
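To make the PreToolUse point concrete, here is a minimal sketch of a validation hook. It assumes the hook receives the tool input as JSON on stdin and that a non-zero exit signals the action should be blocked; the field name and protected paths are illustrative, not this project's actual hook.
#!/usr/bin/env bash
# Hypothetical PreToolUse hook: refuse edits to protected files
input=$(cat)  # hook payload arrives on stdin
file_path=$(jq -r '.tool_input.file_path // empty' <<< "$input")
case "$file_path" in
  *.env|*/.env.*|*production*.yml)
    echo "Blocked: $file_path is a protected file" >&2
    exit 2   # non-zero exit: do not proceed with the edit
    ;;
esac
exit 0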
File-Specific Applications:
- session-start.sh: Injects core skill into every conversation
- post-pr-simplify.sh: Triggered after bash commands, simplifies complex diffs
4. Orchestration Scripts
Concept: Multi-stage workflow automation with state management, error recovery, and progress tracking.
Key Patterns:
- State Machine Design - clear stages with status tracking
- JSON Status Files - real-time progress visibility
- Schema Validation - JSON schemas for each stage output
- Iteration Limits - prevent infinite loops (quality_iterations, test_iterations)
- Resume Capability - restart from failure point
- Rate Limit Handling - exponential backoff and retry logic
Example Application (scripts/implement-issue-orchestrator.sh):
#!/usr/bin/env bash
# Orchestrates multi-stage issue implementation
# State tracking
init_status() {
jq -n \
--arg state "initializing" \
--argjson issue "$ISSUE_NUMBER" \
'{
state: $state,
issue: $issue,
stages: {
setup: {status: "pending"},
research: {status: "pending"},
plan: {status: "pending"},
implement: {status: "pending"},
test_loop: {status: "pending", iteration: 0},
pr: {status: "pending"}
}
}' > "$STATUS_FILE"
}
# Stage execution with error handling
run_stage() {
local stage="$1"
update_stage "$stage" "in_progress"
if claude_cli_invoke "$stage" > "$stage_log" 2>&1; then
update_stage "$stage" "completed"
return 0
else
update_stage "$stage" "failed"
return 1
fi
}
Why This Works:
- State files enable inspection during long-running processes
- JSON schemas validate stage outputs (fail fast)
- Iteration limits prevent runaway processes
- Resume capability saves time and API costs
Replication Strategy:
- Define workflow stages as a state machine
- Create JSON schemas for each stage output
- Implement status file with stage tracking
- Add iteration limits for loops
- Enable resume from status file
- Handle rate limits with exponential backoff (a minimal sketch follows this list)
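A minimal sketch of the backoff idea, assuming a claude_cli_invoke helper like the one used elsewhere in the orchestrator (function names, retry counts, and delays are illustrative):
retry_with_backoff() {
  local stage="$1" attempt=1 max_attempts=5 delay=60
  while (( attempt <= max_attempts )); do
    if claude_cli_invoke "$stage"; then
      return 0
    fi
    log "Attempt $attempt for $stage failed; retrying in ${delay}s"
    sleep "$delay"
    delay=$(( delay * 2 ))   # 60s, 120s, 240s, ...
    (( attempt++ ))
  done
  return 1
}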
File-Specific Applications:
- implement-issue-orchestrator.sh: 11-stage workflow (setup → research → evaluate → plan → implement → quality_loop → test_loop → docs → pr → pr_review → complete)
- batch-orchestrator.sh: Parallel task execution with progress tracking
- batch-runner.sh: Simple parallel execution wrapper
5. Implement-Issue: End-to-End Workflow
Concept: A complete end-to-end orchestration system for taking a GitHub issue from assignment to merged PR, combining multiple skills, agents, and quality loops.
This is the most complex pattern in the project - a production-grade workflow orchestrator that demonstrates how all the components work together.
🔄 CRITICAL FEATURE: Resume Capability
The architecture is designed to handle interruptions gracefully. If the workflow is interrupted by:
- Rate limits (Claude API throttling)
- Service outages (Claude services temporarily unavailable)
- System crashes (computer loses power, process killed)
- Network failures (internet disconnection)
You can resume exactly where you left off:
./implement-issue-orchestrator.sh --resume
The orchestrator reads status.json, validates the worktree still exists, and continues from the last completed stage. This saves 20-60 minutes of redundant work and preserves all progress. State is synced to disk after every operation, so no work is lost.
Architecture Overview
Three Layers:
- Skill Layer (skills/implement-issue/SKILL.md) - User-facing interface
- Orchestrator Script (scripts/implement-issue-orchestrator.sh) - 1600+ line bash state machine
- Schema Layer (scripts/schemas/implement-issue-*.json) - Stage output validation
User invokes skill
↓
Skill launches orchestrator script
↓
Orchestrator runs 11-stage pipeline
↓
Each stage validated against schema
↓
State tracked in status.json
↓
GitHub comments provide visibility
The 11-Stage Pipeline
| Stage | Purpose | Agent | Output Schema |
|---|---|---|---|
| 1. Setup | Create worktree, fetch issue | default | implement-issue-setup.json |
| 2. Research | Explore codebase context | default | implement-issue-research.json |
| 3. Evaluate | Assess approach options | default | implement-issue-evaluate.json |
| 4. Plan | Create implementation plan | default | implement-issue-plan.json |
| 5. Implement | Execute each task | per-task | implement-issue-implement.json |
| 6. Task Review | Verify task met spec | spec-reviewer | implement-issue-task-review.json |
| 7. Simplify | Clean up code | fsa-code-simplifier | implement-issue-simplify.json |
| 8. Test Loop | Run tests → fix → repeat | php-test-validator | implement-issue-test.json |
| 9. Docs | Add documentation | phpdoc-writer | (inline) |
| 10. PR | Create/update PR | default | implement-issue-pr.json |
| 11. PR Review | Spec + quality review | reviewers | implement-issue-review.json |
Quality Loops (Prevent Infinite Iterations)
Per-Task Quality Loop (runs after each task during implement):
for each task:
1. Implement task (agent per task type)
2. Task review (spec-reviewer checks requirements met)
- If failed: Fix and re-review (max 3 attempts)
3. Simplify code (fsa-code-simplifier)
4. Code review (code-reviewer checks quality)
- If failed: Fix and re-review (max 5 iterations)
→ Move to next task
Test Loop (runs once after all tasks):
loop (max 10 iterations):
1. Run test suite (php-test-validator)
- If failed: Fix tests → continue loop
2. Validate test quality (php-test-validator scoped to issue)
- If failed: Improve tests → continue loop
3. If both passed: exit loop
PR Review Loop (runs at end):
loop (max 3 iterations):
1. Spec review (spec-reviewer: does PR meet issue goals?)
- If failed: Fix implementation → continue
2. Code review (code-reviewer: quality check)
- If failed: Fix quality issues → continue
3. If both approved: complete
State Management
status.json Structure:
{
"state": "running",
"issue": 123,
"branch": "feature/issue-123-...",
"worktree": "/path/to/worktree",
"current_stage": "implement",
"current_task": 2,
"stages": {
"setup": {
"status": "completed",
"started_at": "2025-01-15T10:00:00Z",
"completed_at": "2025-01-15T10:02:00Z"
},
"implement": {
"status": "in_progress",
"task_progress": "2/5"
},
"test_loop": {
"status": "pending",
"iteration": 0
}
},
"tasks": [
{
"id": 1,
"description": "Add user profile endpoint",
"agent": "laravel-backend-developer",
"status": "completed",
"review_attempts": 1
},
{
"id": 2,
"description": "Create profile view",
"agent": "bulletproof-frontend-developer",
"status": "in_progress",
"review_attempts": 0
}
],
"quality_iterations": 2,
"test_iterations": 1,
"pr_review_iterations": 0,
"log_dir": "logs/implement-issue/issue-123-20250115-100000"
}
Resume Capability:
# Original run fails at task 3
./implement-issue-orchestrator.sh --issue 123 --branch main
# [Interrupted: Rate limit hit, or service timeout, or Ctrl+C]
# Resume from where it left off
./implement-issue-orchestrator.sh --resume
# Reads status.json, validates worktree, continues from task 3
What Gets Preserved:
- ✓ Worktree and branch
- ✓ All completed stages
- ✓ Completed tasks (doesn't redo work)
- ✓ Iteration counts (quality, test, PR review)
- ✓ GitHub PR number (if already created)
- ✓ Log directory and context
Real-World Resume Scenarios:
- Rate Limit Hit (Most Common)
  - Task 5 of 8 implementation → Rate limit (429 error)
  - Status: Saved after task 4 completion
  - Resume: Continues from task 5
  - Time Saved: ~25 minutes (4 completed tasks not redone)
- Claude Service Outage
  - During test loop iteration 3 → Service unavailable (503)
  - Status: Saved after iteration 2 completion
  - Resume: Continues test loop from iteration 3
  - Time Saved: ~15 minutes (prior test fixes preserved)
- Computer Crash / Power Loss
  - During PR creation → Computer loses power
  - Status: Last sync after task review completion
  - Resume: Skips all completed tasks, proceeds to PR creation
  - Time Saved: ~40 minutes (all implementation preserved)
- Network Failure
  - During task 7 implementation → Internet disconnects
  - Status: Saved after task 6 completion
  - Resume: Validates worktree, continues from task 7
  - Time Saved: ~30 minutes
- Manual Interruption (Ctrl+C)
  - You need to stop and check something → Ctrl+C
  - Status: Last completed stage saved
  - Resume: Pick up exactly where stopped
  - Time Saved: Flexibility to pause/resume workflow
How Resume Works Internally:
# Load state from status.json
load_resume_state() {
ISSUE_NUMBER=$(jq -r '.issue' status.json)
BRANCH=$(jq -r '.branch' status.json)
WORKTREE=$(jq -r '.worktree' status.json)
CURRENT_STAGE=$(jq -r '.current_stage' status.json)
COMPLETED_STAGES=$(jq -r '.stages | to_entries |
map(select(.value.status == "completed")) |
map(.key)' status.json)
}
# Skip completed stages
for stage in "${stages[@]}"; do
if is_stage_completed "$stage"; then
log "Skipping $stage (already completed)"
continue
fi
run_stage "$stage"
done
State Sync Strategy (Why Nothing Is Lost):
# After EVERY operation, sync to disk
update_stage() {
  local stage="$1" status="$2"
  # Update status.json atomically (write temp file, then move into place)
  jq --arg stage "$stage" --arg status "$status" \
     '.stages[$stage].status = $status' status.json > tmp
  mv tmp status.json
# Immediately sync to log directory
cp status.json "$LOG_DIR/status.json"
}
# Even if process killed mid-operation, worst case:
# - Last completed stage is preserved
# - Current stage marked "in_progress" (safe to restart)
# - No data corruption (atomic file moves)
Resume Validation:
Before resuming, the orchestrator validates:
- ✓ status.json exists and is valid JSON
- ✓ Required fields present (issue, branch, worktree)
- ✓ Worktree still exists at path
- ✓ Worktree is a valid git worktree
- ✓ State is resumable (not already completed)
If validation fails, the orchestrator prints a clear error message with remediation steps.
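An illustrative sketch of those checks (not the actual script; the helper name and error wording are examples):
validate_resume_state() {
  jq -e . status.json > /dev/null 2>&1 \
    || { echo "status.json missing or not valid JSON" >&2; return 1; }
  local field
  for field in issue branch worktree; do
    [[ -n "$(jq -r --arg f "$field" '.[$f] // empty' status.json)" ]] \
      || { echo "status.json missing required field: $field" >&2; return 1; }
  done
  local worktree
  worktree=$(jq -r '.worktree' status.json)
  [[ -d "$worktree" ]] || { echo "Worktree not found: $worktree" >&2; return 1; }
  git -C "$worktree" rev-parse --is-inside-work-tree > /dev/null 2>&1 \
    || { echo "Not a valid git worktree: $worktree" >&2; return 1; }
  [[ "$(jq -r '.state' status.json)" != "completed" ]] \
    || { echo "Workflow already completed; nothing to resume" >&2; return 1; }
}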
Why This Architecture Matters:
Long-running AI workflows (30-60 minutes) face inevitable interruptions:
- API rate limits are unpredictable
- Service outages happen
- Local issues occur (power, network, crashes)
Without resume capability:
- ❌ Lose 30-60 minutes of work
- ❌ Waste API quota redoing completed work
- ❌ Regenerate same code multiple times
- ❌ Re-run tests that already passed
- ❌ Create duplicate GitHub comments
With resume capability:
- ✅ Continue exactly where stopped
- ✅ Preserve all completed work
- ✅ Save API quota
- ✅ Save time (20-60 minutes)
- ✅ Maintain clean GitHub comment history
- ✅ Handle interruptions gracefully
GitHub Integration
Automatic Comments (14 comment points throughout workflow):
- Starting automated processing
- Evaluation: Best path
- Implementation plan (with collapsible full plan)
- Task list (markdown checklist)
- Per-task: Implementation summary
- Per-task: Spec review results
- Per-task: Simplification summary
- Per-task: Code review results
- Test loop: Test results (each iteration)
- Test loop: Validation results
- Test loop: Fix summaries
- PR created/updated
- PR spec review
- PR code review
Comment Format:
### Stage: Description
✅ **Result:** success
Summary of what happened...
_— agent-name_
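A minimal sketch of how a comment in that format might be posted with the GitHub CLI (the helper name and variables are illustrative; the orchestrator's actual wording may differ):
post_stage_comment() {
  local stage="$1" result="$2" summary="$3" agent="$4"
  gh issue comment "$ISSUE_NUMBER" --body "$(printf '### %s\n✅ **Result:** %s\n\n%s\n\n_— %s_' \
    "$stage" "$result" "$summary" "$agent")"
}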
Error Handling
Rate Limits:
handle_rate_limit() {
  local wait_time="${1:-3600}" # Default 1 hour
  log "Rate limit hit. Waiting ${wait_time}s..."
  # Update status to show waiting (write temp file, then move into place)
  jq --arg state "rate_limited" \
     --argjson wait "$wait_time" \
     '.state = $state | .wait_until = ((now + $wait) | todate)' \
     "$STATUS_FILE" > "${STATUS_FILE}.tmp" && mv "${STATUS_FILE}.tmp" "$STATUS_FILE"
  sleep "$wait_time"
  # Resume
  jq '.state = "running"' "$STATUS_FILE" > "${STATUS_FILE}.tmp" && mv "${STATUS_FILE}.tmp" "$STATUS_FILE"
}
Max Iterations:
# Prevents infinite loops
readonly MAX_TASK_REVIEW_ATTEMPTS=3
readonly MAX_QUALITY_ITERATIONS=5
readonly MAX_TEST_ITERATIONS=10
readonly MAX_PR_REVIEW_ITERATIONS=3
if (( iteration > MAX_ITERATIONS )); then
log_error "Exceeded max iterations"
set_final_state "max_iterations_exceeded"
exit 2
fi
Schema Validation:
validate_output() {
local output="$1"
local schema="$2"
if ! jq -e . <<< "$output" > /dev/null 2>&1; then
log_error "Invalid JSON output"
return 1
fi
# Validate against schema using ajv or similar
if ! validate_json_schema "$output" "$SCHEMA_DIR/$schema"; then
log_error "Output doesn't match schema: $schema"
return 1
fi
}
Logging System
Log Directory Structure:
logs/implement-issue/issue-123-20250115-100000/
├── orchestrator.log # Main orchestrator log
├── stages/
│ ├── 01-setup.log # Stage outputs
│ ├── 02-research.log
│ ├── 03-evaluate.log
│ ├── 04-plan.log
│ ├── 05-implement-task-1.log
│ ├── 06-task-review-1.log
│ ├── 07-simplify-1.log
│ └── ...
├── context/
│ ├── setup-output.json # Parsed stage results
│ ├── research-output.json
│ ├── plan-output.json
│ ├── tasks.json # Task list
│ └── review-comments.json
└── status.json # Final status snapshot
Log Synchronization:
sync_status_to_log() {
if [[ -n "$LOG_BASE" ]]; then
cp "$STATUS_FILE" "$LOG_BASE/status.json"
fi
}
# Called after every status update
update_stage "setup" "completed"
sync_status_to_log # Ensures log directory always has latest state
Monitoring
Watch Progress:
# Simple JSON view
watch -n 5 'jq . status.json'
# Focused view
watch -n 5 'jq -c "{
state,
stage:.current_stage,
task:.current_task,
quality:.quality_iterations,
test:.test_iterations
}" status.json'
# Stage completion status
jq '.stages | to_entries | map({
stage: .key,
status: .value.status,
started: .value.started_at
})' status.json
Log Tailing:
# Follow orchestrator log
tail -f logs/implement-issue/issue-123-*/orchestrator.log
# Follow current stage
tail -f logs/implement-issue/issue-123-*/stages/$(ls -t logs/.../stages/ | head -1)
Integration with Other Components
Skills Used:
- using-git-worktrees - Worktree creation and management
- writing-plans - Implementation plan generation
- subagent-driven-development - Task execution pattern
- test-driven-development - Test-first enforcement
- requesting-code-review - Review prompt templates
Agents Invoked:
- laravel-backend-developer - Backend task implementation
- bulletproof-frontend-developer - Frontend task implementation
- spec-reviewer - Spec compliance verification
- code-reviewer - Code quality assessment
- fsa-code-simplifier - Code simplification (FSA = Feature Spec Adherence)
- php-test-validator - Test execution and validation
- phpdoc-writer - Documentation generation
Hooks Triggered:
- PostToolUse on file edits - Auto-formatting with Pint
- PostToolUse on bash - PR simplification check
Key Design Decisions
Why Bash for Orchestration?
- Native GitHub CLI integration
- Easy file system operations (worktrees, logs)
- jq for JSON manipulation
- Shell portability
- Direct command execution without subprocess overhead
Why Per-Task Quality Loops Instead of End-to-End?
- Catch issues early (cheaper to fix)
- Smaller context per review (more focused)
- Prevent cascading errors
- Better progress granularity
Why Separate Spec and Quality Reviews?
- Different concerns: "right thing" vs "right way"
- Spec review prevents over/under-building
- Quality review ensures maintainability
- Two reviews catch different issue types
Why JSON Schemas?
- Fail fast on malformed outputs
- Self-documenting stage contracts
- Enables reliable automation
- Validates before expensive operations
Replication for Your Stack
1. Define Your Pipeline Stages:
# Example for a different stack
stages=(
"setup" # Clone/setup workspace
"analysis" # Static analysis
"plan" # Implementation plan
"implement" # Code generation
"test" # Unit tests
"integration" # Integration tests
"security" # Security scan
"docs" # Documentation
"pr" # Pull request
)
2. Create Stage Schemas:
// schemas/your-workflow-implement.json
{
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"required": ["status", "summary", "files_changed"],
"properties": {
"status": {"enum": ["success", "failed"]},
"summary": {"type": "string"},
"files_changed": {"type": "array", "items": {"type": "string"}}
}
}
3. Build State Machine:
main() {
init_status
for stage in "${stages[@]}"; do
if is_stage_completed "$stage"; then
log "Skipping $stage (already completed)"
continue
fi
run_stage "$stage" || handle_error "$stage"
done
finalize
}
4. Add Quality Loops:
run_quality_loop() {
local max_iterations=5
for (( i=1; i<=max_iterations; i++ )); do
result=$(run_quality_check)
if [[ "$result" == "passed" ]]; then
return 0
fi
apply_fixes "$result"
done
return 1
}
5. Implement Resume:
load_resume_state() {
CURRENT_STAGE=$(jq -r '.current_stage' status.json)
COMPLETED_STAGES=$(jq -r '.stages | to_entries |
map(select(.value.status == "completed")) |
map(.key) | .[]' status.json)
}
Real-World Performance
Typical Execution:
- Simple feature (2-3 tasks): 10-15 minutes
- Medium feature (5-7 tasks): 25-35 minutes
- Complex feature (10+ tasks): 45-60 minutes
Iteration Counts (from actual usage):
- Quality loop iterations: Average 1-2, max 5
- Test loop iterations: Average 1-3, max 10
- PR review iterations: Average 1, max 3
Resume Scenarios:
- Rate limit hit during task 5 of 8: Resume saves ~20 minutes
- Test failures after 7 tasks complete: Resume saves ~30 minutes
- API timeout during PR creation: Resume completes in ~2 minutes
Why This Pattern Matters
This orchestrator demonstrates:
- Production-Grade Automation - Not a toy, handles real complexity
- State Machine Design - Clear stages, resumable, monitorable
- Quality Gates - Multiple checkpoints prevent bad code
- Error Recovery - Graceful handling of failures
- Integration - All components (skills, agents, hooks, schemas) working together
- Observability - Real-time state, comprehensive logging, GitHub visibility
It's the culmination of all the other patterns in this guide, showing how they compose into a working system.
6. Test Validation Methodology
Concept: Automated test quality validation that goes beyond "tests pass" to ensure tests actually catch bugs.
Core Principle: Tests that don't catch bugs are worse than no tests—they provide false confidence.
This pattern demonstrates how to build AI agents that audit test quality, not just run tests. The methodology applies across languages (shown here with PHP/PHPUnit, but adaptable to pytest, Jest, Go testing, etc.).
The Problem with "Tests Pass"
# This passes, but catches nothing
public function test_user_creation(): void
{
$this->assertTrue(true); // TODO: implement
}
# This passes, but is hollow
public function test_api_endpoint(): void
{
$response = $this->get('/api/users');
$response->assertOk(); // What about the data?
}
# This passes, but mocks the system under test
public function test_service(): void
{
$mock = $this->createMock(UserService::class);
$mock->method('create')->willReturn(new User());
$result = $mock->create($data); // Tests nothing!
}
All three tests pass. None catch bugs. Traditional CI/CD only checks "did tests pass?" not "are tests meaningful?"
Two-Phase Validation
Phase 1: Execution (Does it work?)
- Run the full test suite
- Check for failures, errors, skipped tests
- Validate tests complete successfully
- Capture runtime metrics
Phase 2: Quality Audit (Does it catch bugs?)
- Scan for TODO/FIXME/incomplete markers
- Detect hollow assertions (assertTrue(true))
- Check for missing edge cases
- Identify mock abuse patterns
- Verify negative test cases exist
- Validate assertion meaningfulness
Test Validator Agent
Agent: php-test-validator (uses Opus model for deep reasoning)
Responsibilities:
- Run tests first (mandatory) - static analysis alone is insufficient
- Audit test quality - check for anti-patterns
- Check coverage - every public method has tests
- Validate edge cases - null, empty, negative, boundary conditions
- Detect cheating - mocks that bypass actual logic
- Report actionable findings - specific file:line issues
Output Format:
## Test Validation Report
**Verdict:** PASS | FAIL
### Test Suite Execution
Tests: 42 passed, 2 failed, 1 incomplete
### Critical Issues (Must Fix)
1. Incomplete test: tests/Unit/UserTest.php:45
- `$this->markTestIncomplete('TODO')`
- Fix: Implement the test
2. Hollow assertion: tests/Feature/ApiTest.php:67
- Only checks response code, not data
- Fix: Add assertions for returned user data
### Coverage Gaps
| Method | Test Coverage | Gap |
|--------|---------------|-----|
| `UserService::create()` | ✓ Tested | - |
| `UserService::delete()` | ✗ Missing | No test exists |
| `UserService::validate()` | △ Partial | No edge cases |
The Seven Deadly Test Sins
1. TODO/FIXME/Incomplete Tests (Automatic Failure)
// FAIL: Deferred testing
public function test_feature(): void
{
$this->markTestIncomplete('TODO: implement');
}
// FAIL: Placeholder
public function test_something(): void
{
$this->assertTrue(true); // Will do later
}
Detection: Scan for markTestIncomplete(), markTestSkipped(), TODO comments, assertTrue(true) patterns.
2. Hollow Assertions
// FAIL: No assertions
public function test_operation(): void
{
$service->doSomething(); // Passes if no exception
}
// FAIL: Tautological
public function test_calculation(): void
{
$result = $service->calculate(10, 20);
$this->assertNotNull($result); // But is it correct?
}
Detection: Tests with zero assertions, or only existence checks without value validation.
3. Missing Edge Cases
// Code handles edge cases
public function process(?int $value): int {
if ($value === null) return 0;
if ($value < 0) throw new Exception();
return $value * 2;
}
// FAIL: Only happy path tested
public function test_process(): void
{
$this->assertEquals(20, $service->process(10));
// Missing: null, negative, zero, large numbers
}
Detection: Compare test cases against branches/conditions in implementation.
4. Mock Abuse
// FAIL: Mocking the system under test
public function test_user_service(): void
{
$service = $this->createMock(UserService::class);
$service->method('createUser')->willReturn(new User());
$result = $service->createUser($data); // Tests nothing!
}
// FAIL: Mock returns exactly what test expects
public function test_validation(): void
{
$validator = $this->mock(Validator::class);
$validator->shouldReceive('validate')->andReturn(true);
// Never tests if validation logic actually works
}
Detection: Mocking the class being tested, or mocking with predetermined results that bypass logic.
5. Missing Negative Tests
// Code has error handling
public function create(array $data): User {
if (empty($data['email'])) throw new ValidationException();
if (User::where('email', $data['email'])->exists()) {
throw new DuplicateException();
}
return User::create($data);
}
// FAIL: Only success case tested
public function test_create_user(): void
{
$user = $service->create(['email' => 'test@test.com']);
$this->assertInstanceOf(User::class, $user);
// Missing: empty email, duplicate email
}
Detection: Exception/error handling in code without corresponding expectException tests.
6. Empty or Broken Data Providers
// FAIL: Empty provider
#[DataProvider('userDataProvider')]
public function test_validates_user(array $data): void { }
public static function userDataProvider(): array
{
return []; // No test data!
}
Detection: DataProvider annotation without method, or provider returning empty array.
7. Brittle or Flaky Patterns
// FAIL: Timing-based tests
public function test_async_operation(): void
{
$service->startAsync();
sleep(2); // Hope it finishes?
$this->assertTrue($service->isComplete());
}
// FAIL: Order-dependent tests
#[Depends('test_creates_user')]
public function test_updates_user(): void
{
// Breaks if test order changes
}
Detection: sleep()/usleep() calls, @depends annotations, missing database refresh traits.
Validation Process (Five Steps)
Step 1: Run Test Suite (Mandatory First)
cd project && php artisan test
# or for specific files
php artisan test --filter=UserServiceTest
Capture output:
- Total passed/failed/skipped/incomplete
- Risky tests (no assertions) flagged by PHPUnit
- Execution time (unusually fast = potentially hollow)
- Any runtime warnings
Step 2: Identify Test-Implementation Pairs
app/Services/UserService.php
→ tests/Unit/Services/UserServiceTest.php
app/Http/Controllers/UserController.php
→ tests/Feature/Http/Controllers/UserControllerTest.php
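A small sketch of deriving the expected test path from a source path under the conventions above (paths and the fallback message are illustrative):
src="app/Services/UserService.php"
test_file="tests/Unit/${src#app/}"          # tests/Unit/Services/UserService.php
test_file="${test_file%.php}Test.php"       # tests/Unit/Services/UserServiceTest.php
[[ -f "$test_file" ]] || echo "Missing test for $src"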
Step 3: Coverage Check
For each public method:
- Is there at least one test?
- Are edge cases covered?
- Are error conditions tested?
- Do assertions verify actual behavior?
Step 4: Quality Audit
For each test method:
- Has meaningful assertions (not just assertOk())
- Tests behavior, not implementation details
- Mocks appropriately (dependencies, not system under test)
- Would catch a bug if code broke
Step 5: Pattern Detection
Scan test files for (a grep sketch follows this list):
- TODO/FIXME markers
- assertTrue(true) patterns
- markTestIncomplete() / markTestSkipped()
- Missing assertions after operations
- Mock abuse (mocking system under test)
- Sleep/timing dependencies
- Hardcoded IDs or database-dependent values
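A grep-based sketch of that scan for PHPUnit (patterns are examples, not an exhaustive or project-exact list):
grep -rn "TODO\|FIXME" tests/
grep -rn "assertTrue(true)" tests/
grep -rn "markTestIncomplete\|markTestSkipped" tests/
grep -rn "sleep(\|usleep(" tests/
grep -rn "createMock(.*::class" tests/   # review hits: is the mocked class the system under test?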
Integration with Implement-Issue Workflow
The test validator runs in Test Loop (Stage 8 of implement-issue):
loop (max 10 iterations):
1. Run tests (php-test-validator)
→ If failed: laravel-backend-developer fixes → re-test
2. Validate test quality (php-test-validator)
→ If hollow/incomplete: laravel-backend-developer improves → re-validate
3. Both passed: exit loop
Example Iteration:
Iteration 1:
- Tests run: 45 passed, 3 failed
- Fix: laravel-backend-developer addresses failures
- Re-run: 48 passed
Iteration 2:
- Tests passed
- Quality audit: Found 2 TODO tests, 1 hollow assertion
- Fix: laravel-backend-developer completes TODOs, adds assertions
- Re-validate: All quality checks passed
Loop complete: Tests pass AND quality validated
Decision Framework
PASS when:
- ✓ All tests pass (no failures, no errors)
- ✓ Zero incomplete/skipped tests
- ✓ Zero TODO/FIXME markers
- ✓ All test methods have meaningful assertions
- ✓ Edge cases covered
- ✓ Error conditions tested
- ✓ No mock abuse detected
- ✓ No timing dependencies
FAIL when:
- ✗ Any test failures
- ✗ Tests marked incomplete/skipped
- ✗ TODO/FIXME in test files
- ✗ Tests without assertions
- ✗ Only happy path tested
- ✗ Mocking system under test
- ✗ PHPUnit reports "risky" tests
- ✗ Tests would pass even with broken code
Cross-Language Adaptation
Python (pytest):
# Similar anti-patterns
def test_user_creation():
pass # FAIL: Empty test
def test_validation():
assert True # FAIL: Hollow assertion
def test_api(mocker):
service = mocker.Mock(UserService)
service.create.return_value = User()
# FAIL: Mocking system under test
JavaScript (Jest):
// Similar detection
test('creates user', () => {
// FAIL: No expectations
service.createUser(data);
});
test('validates input', () => {
expect(result).toBeTruthy(); // FAIL: Vague assertion
});
test('service method', () => {
const mock = jest.fn().mockReturnValue(user);
// FAIL: Mock bypasses logic
});
Go (testing package):
// Similar patterns
func TestUserCreation(t *testing.T) {
// FAIL: No assertions
service.CreateUser(data)
}
func TestValidation(t *testing.T) {
if result != nil {
// FAIL: Only checking existence
}
}
Key Insights
Why This Matters:
- Traditional CI only checks "tests pass"
- Passing tests ≠ good tests
- Bad tests provide false confidence
- Bugs slip through to production
- Technical debt accumulates
What's Different:
- Two-phase validation (execution + quality)
- Automated quality auditing
- Agent detects anti-patterns
- Actionable, specific feedback
- Prevents "checkbox testing"
Benefits:
- Catches hollow tests before merge
- Enforces meaningful test coverage
- Reduces false confidence
- Improves actual test quality
- Teaches better testing patterns
Replication Strategy
1. Define Anti-Patterns for Your Stack:
# .claude/agents/test-validator.md
Anti-patterns:
- TODO markers
- Empty test bodies
- Hollow assertions
- Mock abuse
- Missing edge cases
- No negative tests
2. Build Test Runner + Auditor:
# Step 1: Run tests
pytest --verbose
# Step 2: Static analysis
grep -r "TODO\|FIXME" tests/
grep -r "assert True" tests/
# Step 3: Coverage check
pytest --cov=src --cov-report=term-missing
3. Create Quality Schemas:
{
"verdict": "pass|fail",
"test_execution": {
"passed": 45,
"failed": 0,
"skipped": 0
},
"quality_issues": [
{
"type": "hollow_assertion",
"file": "tests/test_user.py",
"line": 67,
"fix_required": "Add specific value assertions"
}
]
}
4. Integrate with Workflow:
# After implementation (illustrative helper names)
if not run_tests():
    fix_and_retest()
if not validate_test_quality():
    improve_tests()
5. Track Metrics:
- Test quality improvements over time
- Common anti-patterns in your codebase
- Effectiveness of different agents
- Time saved catching issues early
7. Project-Specific Skills
Concept: While many foundational skills come from the community, project-specific skills capture your unique workflow, conventions, and domain knowledge.
Custom Skills in This Project:
Created from Books/Documentation:
- bulletproof-frontend/ - CSS architecture from "Handcrafted CSS: More Bulletproof Web Design"
- Process: PDF → text → Claude extraction → 5+ refinement rounds
- Initial draft: Generic CSS patterns
- Iteration 1: Added project design tokens
- Iteration 2: Added "No Tailwind" anti-pattern
- Iteration 3: Added Blade template specifics
- Iteration 4: Added coordination with Laravel agent
- Iteration 5+: Real code review feedback incorporated
- ui-design-fundamentals/ - Component patterns from same book
- Multiple supporting files (buttons.md, forms.md, colors.md, typography.md)
- Each component extracted separately then refined with project examples
Created from Team Experience:
- handle-issues/ - GitHub issue workflow specific to the team
- implement-issue/ - End-to-end implementation pipeline for this project
- process-pr/ - Pull request review process matching team standards
- review-ui/ - UI review criteria specific to design system
- write-docblocks/ - Documentation standards for this codebase
- brainstorming/ - Structured ideation process for this team
Key Differences from Community Skills:
- Reference project-specific tools and conventions
- Include actual file paths and directory structures
- Mention specific agents by name for coordination
- Capture team-specific anti-patterns learned from real mistakes
- Integrate with project automation (hooks, scripts)
Example: From Book to Skill (skills/bulletproof-frontend/SKILL.md):
---
name: bulletproof-frontend
description: Use for CSS architecture, responsive design, Blade templates
---
# Created from "Handcrafted CSS: More Bulletproof Web Design"
# Refined through multiple rounds with project specifics
## Project Context
**Tech Stack**: Laravel Blade, PostCSS, No Tailwind (semantic CSS only)
**Design System**: Custom tokens in /resources/css/tokens/
**Browser Support**: Last 2 versions, IE11 graceful degradation
**Coordination**: Defer PHP logic to laravel-backend-developer agent
## Anti-Patterns (from actual code reviews)
- **NEVER use Tailwind utility classes** - converts to semantic CSS
- **Avoid inline styles** - all styling in dedicated CSS files
- **No !important** - specificity issues indicate architecture problem
Book Extraction Process:
- Convert PDF to text (if needed)
- Feed to Claude: "Extract key CSS concepts into skill format"
- Review initial draft (generic patterns)
- Add project tech stack and tooling
- Include team conventions (no Tailwind, semantic CSS)
- Add coordination rules (defer to backend agent)
- Test with real refactoring tasks
- Incorporate feedback from code reviews
- Iterate (this skill had 5+ refinement rounds)
Why Project-Specific Skills Matter:
- Capture institutional knowledge that isn't generic
- Enable new team members (or AI) to understand conventions quickly
- Coordinate with project's specific agent ecosystem
- Reference actual project structure and tools
- Evolve with the project through continuous refinement
Replication Strategy:
- Start with appropriate source: community for methodology, books for domain expertise, experience for workflows
- Create initial draft through adaptation/extraction/capture
- Add project context (tech stack, tools, directory structure)
- Document coordination with your specific agents
- Capture anti-patterns from actual code reviews
- Reference real file paths and commands
- Test with real tasks
- Refine through multiple iterations
- Keep refining as project evolves
8. Prompt Templates
Concept: Reusable, parameterized prompts for common tasks.
Key Patterns:
- Placeholder Syntax - {{variable_name}} for parameter substitution
- Context Sections - structured information (issue, requirements, constraints)
- Output Format - explicit structure requirements
- Example Responses - show expected output format
Example Application (prompts/frontend/refactor-blade-thorough.md):
# Blade Refactoring Prompt
## Context
File: {{file_path}}
Issues: {{identified_issues}}
## Requirements
- Convert utility classes to semantic CSS
- Follow design system patterns
- Maintain accessibility
## Output Format
- Files changed
- CSS added
- Classes replaced
- Testing performed
Why This Works:
- Consistency across invocations
- Easy to maintain and update
- Clear expectations for outputs
- Parameterization enables reuse
Replication Strategy:
- Identify repetitive prompt patterns
- Extract parameters as placeholders
- Include context, requirements, and output format
- Provide examples of expected outputs
- Store in prompts/ directory by category
File-Specific Applications:
- prompts/frontend/audit-blade.md: Systematic Blade template analysis
- prompts/frontend/refactor-blade-basic.md: Quick refactoring for simple cases
- prompts/frontend/refactor-blade-thorough.md: Deep refactoring with testing
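To make the placeholder substitution concrete, here is a minimal sketch of rendering a template before sending it (the helper name, sed approach, and argument values are illustrative):
render_prompt() {
  local template="$1" file_path="$2" issues="$3"
  sed -e "s|{{file_path}}|$file_path|g" \
      -e "s|{{identified_issues}}|$issues|g" \
      "$template"
}
render_prompt .claude/prompts/frontend/refactor-blade-thorough.md \
  "resources/views/users/index.blade.php" "utility classes, inline styles"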
9. Foundational Skills from Multiple Sources
Concept: Several foundational skills came from different sources and were adapted/refined for the project.
Skills and Their Origins:
From Community (obra/superpowers):
- test-driven-development - TDD workflow, customized with project test runners
- systematic-debugging - Root cause analysis, adapted with project debugging tools
- dispatching-parallel-agents - Parallel execution framework
- subagent-driven-development - Multi-agent coordination, adapted for Laravel workflow
- writing-skills / writing-agents - Meta-skills for extending the system
- using-skills - Skill discovery and invocation patterns
From Books/Documentation:
- ui-design-fundamentals - Extracted from "Handcrafted CSS: More Bulletproof Web Design"
- bulletproof-frontend - CSS architecture patterns from same book
- Both refined through 5+ rounds of adding project specifics
From Experience:
- handle-issues - GitHub workflow captured from team process
- process-pr - PR review process from actual code reviews
- implement-issue - End-to-end pipeline evolved over multiple iterations
- brainstorming - Team ideation process documented
The Refinement Pattern:
All skills, regardless of origin, went through similar evolution:
- Initial draft (adapted/extracted/captured)
- Project context (tech stack, directory structure, tools)
- Team patterns (conventions from code reviews)
- Anti-patterns (mistakes from actual failures)
- Coordination (references to specific agents)
- Testing (validation with real tasks)
- Iteration (multiple refinement rounds)
Why Multiple Sources Work:
- Community skills provide proven methodologies
- Books provide authoritative domain expertise
- Experience captures unique team workflows
- All need project-specific customization to be effective
10. Git Worktrees
Key Patterns:
- Branch Isolation - each worktree on different branch
- Shared Git State - common .git directory
- Parallel Development - work on multiple features simultaneously
- Clean Switching - no stashing required
Example Application (skills/using-git-worktrees/SKILL.md):
# Create worktree for feature branch
git worktree add ../project-feature-x feature/x
# Work in that directory
cd ../project-feature-x
# When done, remove worktree
git worktree remove ../project-feature-x
Why This Works:
- No branch switching interrupts work
- Can test multiple branches simultaneously
- Clean separation of concerns
- Easier subagent coordination (each in own worktree)
Replication Strategy:
- Document worktree commands for your workflow
- Explain when to use vs regular branching
- Include cleanup procedures
- Show integration with orchestration scripts (a minimal sketch follows this list)
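A minimal sketch of that integration, assuming one worktree per issue (the branch and path naming, and the base branch, are illustrative):
setup_worktree() {
  local issue="$1"
  local branch="feature/issue-${issue}"
  local worktree="../project-issue-${issue}"
  git worktree add -b "$branch" "$worktree" main
  echo "$worktree"
}
cleanup_worktree() {
  git worktree remove "$1"   # run once the PR is merged
}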
11. Schema Validation
Concept: JSON schemas define expected structure for stage outputs, enabling validation.
Key Patterns:
- One Schema Per Stage - explicit output structure
- Type Safety - validate data types
- Required Fields - prevent missing data
- Format Constraints - URLs, dates, enums
Example Application (scripts/schemas/implement-issue-setup.json):
{
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"required": ["branch_name", "worktree_path", "tasks"],
"properties": {
"branch_name": {
"type": "string",
"pattern": "^feature/issue-[0-9]+"
},
"worktree_path": {
"type": "string"
},
"tasks": {
"type": "array",
"items": {
"type": "object",
"required": ["description", "agent"],
"properties": {
"description": {"type": "string"},
"agent": {"type": "string"}
}
}
}
}
}
Why This Works:
- Fail fast on invalid outputs
- Self-documenting stage contracts
- Enables reliable orchestration
- Catches errors before downstream stages
Replication Strategy:
- Define one schema per workflow stage
- Specify all required fields
- Add format validations (patterns, enums)
- Validate in orchestration scripts (a minimal sketch follows this list)
- Use schemas as documentation
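A minimal sketch of the validation step, assuming ajv-cli is installed (any JSON Schema validator can stand in; the helper name is illustrative):
validate_stage_output() {
  local output_file="$1" schema_file="$2"
  jq -e . "$output_file" > /dev/null || return 1        # well-formed JSON?
  ajv validate -s "$schema_file" -d "$output_file"      # matches the stage schema?
}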
File-Specific Applications (all in scripts/schemas/):
- implement-issue-setup.json: Branch and worktree creation
- implement-issue-plan.json: Implementation plan structure
- implement-issue-implement.json: Task completion tracking
- implement-issue-test.json: Test results and coverage
- implement-issue-pr.json: Pull request metadata
Cross-Stack Replication Guide
For Any Language/Framework
1. Create Directory Structure
mkdir -p .claude/{agents,hooks,prompts,scripts,skills}
2. Build Your Skill Library (Choose Your Approach)
Approach A: Adapt from Community
# Browse and copy foundational skills
cp -r superpowers/skills/using-skills .claude/skills/
cp -r superpowers/skills/test-driven-development .claude/skills/
cp -r superpowers/skills/systematic-debugging .claude/skills/
# Customize for your project
# - Update test runner commands (pytest, jest, cargo test)
# - Add your linting/formatting tools
# - Include your debugging tools and workflows
Approach B: Extract from Books/Documentation
# Example: Extract React patterns from official docs
# 1. Copy React documentation sections to file
# 2. Use Claude to extract patterns:
"I have the React documentation on hooks. Please extract:
- Key concepts into a skill (skills/react-hooks/SKILL.md)
- Common patterns and anti-patterns
- Project-specific: We use TypeScript strict mode
- Include examples using our design system"
# 3. Iterate through 3-5 refinement rounds
# 4. Add team-specific patterns from code reviews
Approach C: Capture from Experience
# Document your unique workflows
mkdir .claude/skills/deployment-workflow
mkdir .claude/skills/incident-response
# Write skills capturing your team's actual process
# Include: Tools used, commands, anti-patterns from actual incidents
3. Create Language-Specific Agents
# Example: Python Django Agent
---
name: django-backend-developer
description: Senior Python/Django developer. Use for models, views, serializers, middleware, ORM queries, migrations, and pytest.
---
You are a senior Python/Django developer with expertise in Django 5.x, Python 3.12, and PostgreSQL...
## Anti-Patterns to Avoid
- **N+1 queries** - always use `select_related()` and `prefetch_related()`
- **Never use `.filter().count()`** - use `.count()` directly
- **Use `get_object_or_404()`** - not try/except DoesNotExist
4. Configure Hooks
{
"hooks": {
"PostToolUse": [
{
"matcher": "Edit|Write",
"hooks": [{
"type": "command",
"command": "black $file && isort $file",
"timeout": 30
}]
}
]
}
}
5. Build Orchestration Scripts
- Adapt state machine to your workflow stages
- Use JSON schemas for validation
- Implement resume capability
- Add rate limit handling
Technology-Specific Examples
React/TypeScript Project:
.claude/
├── agents/
│ ├── react-component-developer.md
│ ├── typescript-type-architect.md
│ └── jest-test-specialist.md
├── skills/
│ ├── test-driven-development/ # Adapted from community
│ ├── react-patterns/ # Extracted from React docs
│ ├── typescript-patterns/ # Extracted from TS handbook
│ ├── component-testing/ # Captured from team experience
│ └── deployment-workflow/ # Captured from team process
└── settings.json (ESLint + Prettier hooks)
Python Data Science Project:
.claude/
├── agents/
│ ├── data-engineer.md
│ ├── ml-model-developer.md
│ └── jupyter-notebook-specialist.md
├── skills/
│ ├── test-driven-development/ # Adapted from community
│ ├── data-validation/ # Extracted from "Data Quality" book
│ ├── model-evaluation/ # Extracted from ML textbooks
│ ├── visualization-patterns/ # Captured from team standards
│ └── experiment-tracking/ # Captured from workflow
└── settings.json (black + mypy hooks)
DevOps/Infrastructure Project:
.claude/
├── agents/
│ ├── terraform-architect.md
│ ├── kubernetes-operator.md
│ └── ci-cd-engineer.md
├── skills/
│ ├── systematic-debugging/ # Adapted from community
│ ├── infrastructure-as-code/ # Extracted from HashiCorp docs
│ ├── deployment-strategies/ # Extracted from "Release It!" book
│ ├── incident-response/ # Captured from actual incidents
│ └── monitoring-observability/ # Captured from team runbooks
└── settings.json (terraform fmt hooks)
Skill Source Strategy by Domain:
| Domain | Adapt from Community | Extract from Books/Docs | Capture from Experience |
|---|---|---|---|
| Methodology | TDD, debugging, git | N/A | Team retrospectives |
| Framework | General patterns | Official documentation | Project conventions |
| Design | Basic principles | Design books, style guides | Design system |
| Architecture | SOLID, patterns | Architecture books | System decisions |
| DevOps | Git workflows | Tool documentation | Incident runbooks |
| Domain Logic | N/A | Domain textbooks | Business rules |
Key Success Factors
1. Choose the Right Source for Each Skill
Don't force one approach for everything:
- Methodologies (TDD, debugging) → Adapt from community
- Domain expertise (CSS, security, ML) → Extract from books
- Team workflows (deployment, PR process) → Capture from experience
- Most skills combine multiple sources through iteration
2. Expect Multiple Refinement Rounds
Initial drafts are starting points, not final products:
- Round 1: Get the basic structure (adapt/extract/capture)
- Round 2: Add project tech stack and tools
- Round 3: Include team conventions and patterns
- Round 4: Add anti-patterns from real code reviews
- Round 5+: Continuous refinement based on usage
Example: UI design skill evolution
- Draft: Generic CSS patterns from book
- Round 1: Project design tokens
- Round 2: "No Tailwind" from team decision
- Round 3: Blade template specifics
- Round 4: Coordination with backend agent
- Round 5: Real refactoring examples
3. Work Through Issues to Completion, Then Update
The Continuous Improvement Loop - Most Important Pattern
When skills, agents, or workflows fail or produce incorrect results, follow this process:
Step 1: Don't Update Yet - Solve the Problem First
❌ WRONG: Agent fails → immediately edit agent → hope it works
✅ RIGHT: Agent fails → work through to correct solution → update agent
Why this matters: You need to understand the correct solution before you can teach it. Updating before solving often encodes incorrect assumptions or partial solutions.
Step 2: Work Through to the Correct Solution
Use Claude to iteratively debug and reach the right answer:
Agent produces incorrect code → Run tests (fail)
↓
Analyze failure → Understand root cause
↓
Try fix attempt 1 → Run tests (still fail, different error)
↓
Analyze new failure → Refine understanding
↓
Try fix attempt 2 → Run tests (pass)
↓
Verify solution is correct, not just passing
↓
NOW you have the correct solution
Step 3: Ask Claude to Update the Skill/Agent
Once you have the correct solution, prompt:
"I just encountered this issue: [describe problem]
The agent/skill did: [incorrect behavior]
The correct solution was: [working solution]
Please update [skill/agent name] to prevent this issue. Add:
1. Specific guidance that would have caught this
2. An anti-pattern entry for the incorrect approach
3. An example showing the correct pattern
4. A red flag if this is a common rationalization"
Real Example from Project:
Issue Encountered:
// Agent wrote this (seems to work, but breaks in production)
public function getUsers() {
return User::all(); // Works in dev (100 users), OOM in prod (1M users)
}
Work Through Process:
Iteration 1: Add pagination
public function getUsers() {
return User::paginate(50); // Better, but breaks API contract
}
Iteration 2: Add chunking
public function getUsers() {
return User::chunk(1000, function($users) {
// Process batch
});
} // Wrong pattern for this use case
Iteration 3: Correct solution
public function getUsers(int $page = 1, int $perPage = 50) {
return User::paginate($perPage, ['*'], 'page', $page);
// Returns paginated response, maintains API contract
}
Update Agent:
## Anti-Patterns to Avoid
- **NEVER use `Model::all()` on large tables**
- Problem: Loads entire table into memory (OOM in production)
- Symptom: Works in dev, fails in production with large datasets
- Solution: Always use pagination: `Model::paginate($perPage)`
- Red flag: "It works in my local database"
## Red Flags - STOP and Reconsider
- "It works with my test data" → Test with production-scale data
- "Model::all() is simpler" → Simplicity that breaks at scale is complexity
Step 4: Test the Update
Run the same scenario with updated skill/agent:
- Does it now produce correct code?
- Does it catch the anti-pattern?
- Does it provide the right guidance?
If not, refine the update and retest.
4. Build Knowledge from Failures
Failure → Refinement Cycle:
digraph improvement {
rankdir=LR;
"Use skill/agent" [shape=box];
"Issue occurs" [shape=diamond];
"Work through to correct solution" [shape=box, style=filled, fillcolor=yellow];
"Understand root cause" [shape=box, style=filled, fillcolor=yellow];
"Update skill/agent" [shape=box, style=filled, fillcolor=lightgreen];
"Add anti-pattern" [shape=box];
"Add red flag" [shape=box];
"Test updated version" [shape=box];
"Use skill/agent" -> "Issue occurs";
"Issue occurs" -> "Continue working" [label="no issue"];
"Issue occurs" -> "Work through to correct solution" [label="issue found"];
"Work through to correct solution" -> "Understand root cause";
"Understand root cause" -> "Update skill/agent";
"Update skill/agent" -> "Add anti-pattern";
"Add anti-pattern" -> "Add red flag";
"Add red flag" -> "Test updated version";
"Test updated version" -> "Use skill/agent" [label="improvement verified"];
}
Track Patterns Across Failures:
Keep a log of common issues:
```markdown
## Common Issues Log

### Issue: Agent uses Model::all() on large tables
- Occurred: 3 times (UserService, OrderService, ProductService)
- Root cause: Agent doesn't consider production data scale
- Solution: Added "NEVER use Model::all()" anti-pattern
- Prevention: Added red flag "Works in dev" → "Test at scale"
- Result: Zero occurrences after update

### Issue: Tests with sleep() instead of event-based waiting
- Occurred: 5 times (async operations, polling, race conditions)
- Root cause: Agent defaults to timing instead of conditions
- Solution: Added condition-based-waiting skill
- Prevention: Red flag "Use sleep() to wait"
- Result: All new tests use proper wait patterns
```
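For illustration, the condition-based-waiting fix looks roughly like this in a test. This is a minimal sketch; `ExportUsersJob` and the export path are hypothetical names, not from the project.

```php
use Illuminate\Support\Facades\Storage;

// Anti-pattern: timing-based wait is flaky under load and slow when the job is fast
public function test_export_completes_with_sleep(): void
{
    dispatch(new ExportUsersJob()); // hypothetical job
    sleep(5); // hopes five seconds is enough
    $this->assertTrue(Storage::exists('exports/users.csv'));
}

// Condition-based wait: poll for the observable result with a bounded timeout
public function test_export_completes_when_file_appears(): void
{
    dispatch(new ExportUsersJob());

    $deadline = microtime(true) + 5.0;
    while (!Storage::exists('exports/users.csv')) {
        $this->assertLessThan($deadline, microtime(true), 'Export did not finish within 5s');
        usleep(100_000); // re-check every 100 ms
    }

    $this->assertTrue(Storage::exists('exports/users.csv'));
}
```

The key difference: the test waits for the condition it actually cares about, with a failure message when the deadline passes, instead of guessing a duration.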
5. Examples of Iterative Refinement
Example 1: Laravel Backend Agent
Initial Version (Generic):
```markdown
description: PHP/Laravel backend developer

Anti-patterns:
- Write clean code
- Follow best practices
```
After Issue #1 (N+1 queries in UserController):
```markdown
Anti-patterns:
- **N+1 prevention** — Always eager load with `with()`
- Never use `Model::all()` on large tables
```
After Issue #2 (Used env() in Service class):
```markdown
Anti-patterns:
- **N+1 prevention** — Always eager load with `with()`
- **Never use `env()`** outside config files — Use `config()` helper
- Never use `Model::all()` on large tables
```
After Issue #3 (Tests lacked RefreshDatabase):
```markdown
Anti-patterns:
- **N+1 prevention** — Always eager load with `with()`
- **Never use `env()`** outside config files
- Never use `Model::all()` on large tables
- **Missing `RefreshDatabase`** in feature tests — Tests contaminate each other

Red Flags:
- "Tests pass locally but fail in CI" → Missing RefreshDatabase
- "Works in dev" → Test with production-scale data
```
Example 2: Test Validator Agent
Initial Version:
Validate tests have assertions
After Hollow Test Issue:
### Hollow Assertions
Tests that pass but don't verify behavior:
```php
// FAIL: Only asserting response code, not content
public function test_api_returns_users(): void
{
    $response = $this->get('/api/users');
    $response->assertOk(); // What about the users?
}
```
Flag: Response checks without data validation
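For contrast, a version the validator should accept asserts the payload as well. This sketch assumes the endpoint wraps results in a `data` key:

```php
// PASS: Assert the payload, not just the status code
public function test_api_returns_users(): void
{
    $users = User::factory()->count(3)->create();

    $response = $this->get('/api/users');

    $response->assertOk()
        ->assertJsonCount(3, 'data')
        ->assertJsonFragment(['email' => $users->first()->email]);
}
```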
**After Mock Abuse Issue**:
````markdown
### Hollow Assertions
[previous content]

### Brittle/Cheating Mocks
Mocks that bypass the actual logic being tested:
```php
// FAIL: Mocking the system under test
public function test_user_service(): void
{
    $service = $this->createMock(UserService::class);
    $service->method('createUser')->willReturn(new User());
    // Tests nothing!
}
```
Flag: Mocking the class being tested
````
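The corrected pattern keeps the service real and mocks only its boundary. A minimal sketch; `MailerInterface` and the `UserService` constructor are illustrative, not the project's actual classes:

```php
// PASS: the service under test is real; only its external boundary is mocked
public function test_user_service_creates_user_and_sends_welcome_mail(): void
{
    $mailer = $this->createMock(MailerInterface::class);
    $mailer->expects($this->once())->method('send');

    $service = new UserService($mailer);           // real class under test
    $user = $service->createUser('ada@example.com');

    $this->assertSame('ada@example.com', $user->email);
}
```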
6. When to Update vs When to Discard
**Update When**:
- Issue is fixable with clearer guidance
- Pattern is close but needs refinement
- Anti-pattern can prevent future occurrences
- Skill/agent is fundamentally sound
**Discard/Rewrite When**:
- Fundamental approach is wrong
- Skill fights against better patterns
- Multiple unrelated issues from same skill
- Easier to start fresh than patch
**Example - Discard**:
```markdown
# Original skill: "Always use mocks in unit tests"
# After issues: Actually need real objects for domain logic tests
# Decision: Skill fundamentally wrong, rewrite with nuance
# New skill: "Use mocks for boundaries, real objects for domain"
```
7. Pressure Testing After Updates
After updating a skill/agent, test under pressure:
Create Scenarios That Previously Failed:
Updated agent to avoid Model::all()
↓
Test: "Create a service that fetches all users"
↓
Does agent now use pagination?
↓
YES: Update verified
NO: Refine update, add more explicit guidance
Test Related Patterns:
Updated: "Never use env() outside config"
↓
Test: "Read database connection settings in service"
↓
Does agent use config('database.default')?
↓
Verify it doesn't fall back to env()
Test Under Time Pressure:
Add pressure to the test prompt: "This is urgent, just make it work"
↓
Does the agent slip back into the anti-patterns?
↓
If yes: Add stronger language to the skill and make the rule non-negotiable
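As a concrete retest target for the env() rule, the expected output looks roughly like this (the `services.payment` key is illustrative; `database.default` is Laravel's standard key for the default connection):

```php
// Anti-pattern: env() outside config/ returns null once `php artisan config:cache` runs
$apiKey = env('PAYMENT_API_KEY');

// Preferred: read through config(), which is cache-safe
$connection = config('database.default');        // default DB connection name
$apiKey     = config('services.payment.key');    // defined in config/services.php, backed by env() there
```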
8. Track Improvements Over Time
Maintain a changelog for each skill/agent:
```markdown
## CHANGELOG

### 2025-01-15: Added N+1 query prevention
- Issue: UserController loaded all orders without eager loading
- Solution: Added "Always use with()" anti-pattern
- Verification: Tested with large datasets, no more N+1s

### 2025-01-20: Added env() restriction
- Issue: Service class called env('API_KEY') directly
- Solution: Added "Never use env() outside config" rule
- Verification: Scanned codebase, all env() calls in config/

### 2025-01-25: Added RefreshDatabase reminder
- Issue: Feature tests contaminating each other
- Solution: Added "Missing RefreshDatabase" anti-pattern
- Verification: All new tests include trait
```
Benefits:
- See evolution over time
- Understand why rules exist
- Share learning with team
- Identify patterns in failures
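For reference, the RefreshDatabase fix from the 2025-01-25 entry looks roughly like this in a feature test (a sketch; the route, model, and table names are illustrative):

```php
use App\Models\User;
use Illuminate\Foundation\Testing\RefreshDatabase;
use Tests\TestCase;

class CreateOrderTest extends TestCase
{
    // Migrates once, then wraps each test in a transaction so tests cannot contaminate each other
    use RefreshDatabase;

    public function test_an_order_can_be_created(): void
    {
        $user = User::factory()->create();

        $this->actingAs($user)
            ->post('/orders', ['product_id' => 1, 'quantity' => 2])
            ->assertCreated();

        $this->assertDatabaseHas('orders', ['user_id' => $user->id, 'quantity' => 2]);
    }
}
```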
9. Test with Real Tasks
Validate before relying on skills/agents:
- Skills: Run pressure scenarios with subagents
- Agents: Test on representative domain tasks
- Hooks: Verify in actual workflow
- Keep iterating until they work reliably
10. Document Rationale and Sources
Make origins and reasoning clear:
- Note source (community/book/experience)
- Explain why patterns exist
- Document what problems they solve
- Include attribution for adapted/extracted content
11. Maintain Discoverability
Keep skills findable:
- Rich, searchable descriptions
- Clear naming conventions
- Cross-references between components
- Regular pruning of unused skills
12. Balance Generic and Specific
Find the right level of abstraction:
- Too generic → Not actionable for your project
- Too specific → Breaks when project evolves
- Sweet spot → Project-specific but adaptable
Example balance:
```
# Too generic (not useful)
"Write good CSS"

# Too specific (breaks easily)
"Use class .btn-primary-lg-blue from line 47 of app.css"

# Right balance (project-specific but adaptable)
"Use semantic button classes from design system tokens
- .btn--primary for main actions
- .btn--secondary for supporting actions
See /resources/css/components/buttons.css"
```
Common Patterns Across Files
Pattern: Flowchart-Driven Decision Making
Files: All major skills (TDD, subagent-driven-development, dispatching-parallel-agents)
Concept: Visual flowcharts clarify when to use a pattern and how to execute it.
Implementation:
```dot
digraph decision {
  "Have plan?" [shape=diamond];
  "Tasks independent?" [shape=diamond];
  "Use subagent workflow" [shape=box];
}
```
Why: Reduces cognitive load, provides clear decision criteria, visually communicates process.
Replicate For: Any multi-step process with decision points.
Pattern: Red Flags / Rationalization Tables
Files: TDD, writing-skills, subagent-driven-development
Concept: Anticipate and counter common justifications for skipping best practices.
Implementation:
| Excuse | Reality |
|--------|---------|
| "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
| "I'll test after" | Tests passing immediately prove nothing. |
Why: Pre-emptively addresses resistance to discipline, makes violations obvious.
Replicate For: Any prescriptive methodology that might be circumvented under pressure.
Pattern: Skill-Specific Supporting Files
Files: systematic-debugging, ui-design-fundamentals, bulletproof-frontend
Concept: Main SKILL.md stays concise, detailed patterns in separate files.
Implementation:
skills/
ui-design-fundamentals/
SKILL.md # Overview + quick reference
buttons.md # Button-specific patterns
forms.md # Form-specific patterns
navigation.md # Navigation patterns
Why: Keeps main file scannable while providing depth when needed.
Replicate For: Skills with multiple sub-domains or extensive reference material.
Pattern: Explicit State Tracking
Files: implement-issue-orchestrator.sh, subagent-driven-development
Concept: Maintain explicit state that persists across subagent invocations.
Implementation:
```bash
# Track branch name explicitly
FEATURE_BRANCH="feature/issue-123"

# Include in every subagent dispatch
dispatch_implementer "$FEATURE_BRANCH" "$task_text"
```
Why: Subagents have no memory, must receive all context explicitly.
Replicate For: Multi-step workflows with fresh subagent per step.
Pattern: Two-Stage Review
Files: subagent-driven-development, code-reviewer
Concept: Separate spec compliance from code quality - different concerns, different reviewers.
Implementation:
1. Implement task
2. Spec reviewer: Does it match requirements?
3. Code quality reviewer: Is it well-built?
Why: Spec compliance prevents over/under-building, quality review ensures good implementation.
Replicate For: Any implementation workflow where "right thing" differs from "right way".
Conclusion
This project demonstrates a practical AI development system built through:
- Multiple Skill Sources - Community adaptation, book extraction, experience capture
- Project-Specific Automation - Custom agents, hooks, and orchestration scripts
- Workflow Integration - Multi-stage pipelines with state management
- Domain Specialization - Agents with clear scope and coordination protocols
- Continuous Improvement - Failure-driven refinement loop
The Three-Path Strategy:
Skills can be built through different approaches depending on the type:
Path 1: Adapt from Community
- Start: Browse obra/superpowers for foundational patterns
- Customize: Update examples to your tech stack
- Example: TDD, debugging, git workflows
Path 2: Extract from Books/Docs
- Start: Convert authoritative source to text
- Process: Have Claude extract concepts into skill format
- Refine: Add project context through multiple rounds
- Example: UI design fundamentals from "Handcrafted CSS"
Path 3: Capture from Experience
- Start: Identify recurring team patterns
- Document: Write skill with real anti-patterns
- Example: GitHub workflow, PR processes
The Critical Improvement Loop:
The most important pattern: Work through issues to the correct solution BEFORE updating skills/agents.
Issue occurs → Work through to correct solution → Understand root cause
↓
Update skill/agent → Add anti-pattern → Add red flag → Test update
↓
Verify improvement → Log the learning → Continue
This loop is what makes the system self-improving:
- Skills get better with each failure
- Anti-patterns accumulate real experience
- Red flags prevent future rationalizations
- The system learns from actual usage
What Worked in This Project:
| Skill Type | Approach | Refinement Rounds | Key Improvements |
|---|---|---|---|
| TDD methodology | Adapted from community | 2-3 | Added project test runners |
| Debugging patterns | Adapted from community | 3-4 | Added project-specific tools |
| UI design fundamentals | Book extraction | 5+ | Design tokens, no Tailwind, Blade specifics |
| Frontend architecture | Book extraction | 5+ | Project CSS architecture, agent coordination |
| GitHub workflows | Experience capture | 10+ | Real workflow failures → anti-patterns |
| Laravel backend agent | Created + experience | 15+ | N+1 queries, env() usage, RefreshDatabase |
| Test validator agent | Created + experience | 8+ | Hollow assertions, mock abuse, TODOs |
| Orchestration scripts | Created from scratch | 20+ | Resume capability, rate limits, state management |
Notice: More complex components (agents, orchestrators) had more refinement rounds because they encountered more real-world issues.
The Real Work:
Regardless of approach, the value comes from:
- Iterative refinement - Initial draft → project context → team patterns → anti-patterns from failures
- Working through issues - Don't update until you understand the correct solution
- Testing with real tasks - Pressure test skills, validate agents on actual work
- Capturing failures - Each issue becomes an anti-pattern or red flag
- Coordination between components - Agents reference skills, hooks trigger scripts, schemas validate
- Building institutional knowledge - System improves as it encounters and solves problems
Getting Started:
- Pick 5-10 foundational skills - TDD, debugging, git workflows (adapt from community)
- Create 2-3 domain skills - Your framework/language expertise (extract from books/docs)
- Document 1-2 team workflows - Your unique processes (capture from experience)
- Build 2-3 specialized agents - With project context and coordination rules
- Add automation hooks - For repetitive quality gates
- Use the system - Let it fail, work through issues, update, repeat
- Track improvements - Log common issues and how skills evolved
Key Success Factor:
Don't expect perfection on round 1. The first version of a skill/agent is a hypothesis. Real usage reveals issues. Working through those issues to the correct solution, then encoding that knowledge back into the skill/agent—that's what makes the system valuable.
Initial drafts get you started. The improvement loop is what makes it great.
After 6 months of use:
- Generic skills become project-specific
- Anti-patterns reflect actual mistakes
- Red flags catch real rationalizations
- Agents coordinate smoothly
- Workflows handle edge cases
- The system embodies team knowledge
This is infrastructure that improves with use, not documentation that rots. The effort compounds.