EvanFlow – A TDD-driven feedback loop for Claude Code
EvanFlow is a TDD-driven, iterative feedback framework for Claude Code that guides software development through structured phases—brainstorming, planning, execution, testing, and iteration—with built-in checkpoints to maintain user control. It emphasizes discipline, avoids auto-commits or forced workflows, and integrates safety rules to prevent common AI coding failures. The system uses 16 skills and two custom subagents, supports parallel task handling, and includes guardrails against risky Git operations. Installation is streamlined via plugin, CLI, or manual setup, with customization encouraged.
- EvanFlow enforces a controlled, iterative development loop with user-approved checkpoints at the design, plan, and iteration stages, and never auto-commits code.
- It includes 16 skills and 2 custom subagents for tasks like brainstorming, TDD, debugging, and parallel coding, all organized under a conductor skill triggered by saying "let's evanflow this".
- Hard rules prevent hallucinated values, enforce assertion correctness in tests, and block dangerous Git operations via a pre-tool hook that activates automatically when the plugin is installed.
- The framework supports parallel development with coder/overseer pairs and integration testing to keep interfaces consistent across modules.
- Users can install EvanFlow via the Claude Code plugin marketplace, the npx CLI, or manual copy; recommended dependencies include Bash, jq, and Chromium for full functionality.
Full article excerpt:
EvanFlow

A TDD-driven iterative feedback loop for software development with Claude Code. 16 cohesive skills + 2 custom subagents walk an idea from brainstorm through implementation, with checkpoints throughout where you stay in control.

One entry point: say "let's evanflow this" and the orchestrator runs the loop.

brainstorm → plan → execute (sequential or parallel) → tdd → iterate → STOP

The loop is conductor, not autopilot: real checkpoints at design approval, plan approval, and after iteration. The agent stops short of every git operation and waits for your direction. No auto-commits. No forced ceremony. No "must invoke a skill" tax.

Quick Install

The recommended path is Claude Code's plugin marketplace:

/plugin marketplace add evanklem/evanflow
/plugin install evanflow@evanflow

Restart, then try: "Let's evanflow this. I want to add a small feature that does X." evanflow-go fires and walks the loop. The git-guardrails hook auto-activates with the plugin (no settings.json edit needed). Skills appear under the evanflow: namespace (e.g., /evanflow:evanflow-go). See Installation below for two alternative paths.

What Makes It a Feedback Loop

The loop is built around discipline that compounds across iterations, not single-shot generation. Every step has a checkpoint that gates the next:

- Brainstorm clarifies intent and proposes 2–3 approaches with an embedded grill (stress-test) → you approve the design
- Plan maps the file structure first (deep modules, deletion test) → you approve the plan
- Execute runs task-by-task with inline verification → blockers stop the loop and surface to you
- TDD is vertical-slice only: one failing test → minimal implementation → repeat. Tests verify behavior through public interfaces, so they survive refactors
- Iterate re-reads the diff with fresh eyes, runs quality checks, screenshots UI changes, and runs against a Five Failure Modes checklist (hallucinated actions, scope creep, cascading errors, context loss, tool misuse). Hard cap of 5 iterations
- STOP. Report. Await your direction. The agent never auto-commits, never auto-stages, never proposes a PR

For plans with 3+ truly independent units, the loop forks into a parallel coder/overseer orchestration: one coder per unit (using vertical-slice TDD with a RED checkpoint), one overseer per coder (a read-only review subagent that can't modify code), plus an integration overseer that runs named integration tests at every touchpoint. The integration tests are the executable contract: interfaces can't drift if both sides have to satisfy the same passing test.

Hard Rules Baked Into the Loop

Several rules come from 2025-2026 industry research on agentic-coding failure modes and are baked into every skill:

- Never invent values: file paths, env vars, IDs, function names, library APIs. If unsure, the agent stops and asks. (Action hallucination is the most dangerous agent failure.)
- Assertion-correctness warning: research shows 62% of LLM-generated test assertions are wrong. Both evanflow-tdd and the overseer review explicitly check whether a one-character bug in the implementation would still let the assertion pass.
- Watch for context drift: evanflow-compact triggers when symptoms appear (re-asking established questions, contradicting earlier decisions). Industry data: ~65% of enterprise AI coding failures trace to context drift, not raw token exhaustion.
- Five Failure Modes pass in iterate + overseer review: an explicit check against hallucinated actions, scope creep, cascading errors, context…
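To make the vertical-slice TDD step concrete, here is a minimal sketch of one RED→GREEN cycle. The `slugify` function and its test are hypothetical names chosen for illustration, not part of EvanFlow itself:

```python
# Step 1 (RED): write ONE failing test against the public interface.
# Run before `slugify` exists and it fails -- that is the RED checkpoint.
def test_slugify_joins_words_with_hyphens():
    assert slugify("Hello World") == "hello-world"

# Step 2 (GREEN): the minimal implementation that makes the test pass.
def slugify(text):
    return "-".join(text.lower().split())

test_slugify_joins_words_with_hyphens()  # passes after step 2, then repeat
```

Because the test exercises only the public interface (input string in, slug out), an internal refactor of `slugify` leaves the test untouched, which is the property the excerpt calls "tests survive refactors".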
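The "executable contract" idea behind the integration overseer can be sketched as one named integration test that both sides of an interface must keep passing. The functions and data shape below are hypothetical, not EvanFlow's:

```python
def parse_line(line):
    # Coder A's unit: "name,age" -> record dict
    name, age = line.split(",")
    return {"name": name.strip(), "age": int(age)}

def format_record(record):
    # Coder B's unit: record dict -> display string
    return f"{record['name']} ({record['age']})"

def test_parse_then_format():
    # Named integration test at the touchpoint: if either coder changes
    # the record's keys or types, this test fails and the drift is caught.
    assert format_record(parse_line("Ada, 36")) == "Ada (36)"

test_parse_then_format()
```

Neither coder can silently change the record shape: the shared test, not a prose spec, defines the interface.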
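The assertion-correctness check ("would a one-character bug still let the assertion pass?") is easiest to see with a toy example. The `apply_discount` functions here are illustrative, not from EvanFlow:

```python
def apply_discount(price, rate):
    # Correct implementation: 10% off 100 -> 90.0
    return price * (1 - rate)

def apply_discount_buggy(price, rate):
    # One-character bug: '+' instead of '-'
    return price * (1 + rate)

# Weak assertion: passes for BOTH implementations, so it verifies nothing.
assert apply_discount(100, 0.1) > 0
assert apply_discount_buggy(100, 0.1) > 0  # the bug slips through

# Strong assertion: pins the exact expected value, so the bug is caught.
assert apply_discount(100, 0.1) == 90.0
assert apply_discount_buggy(100, 0.1) != 90.0  # buggy version fails this check
```

This is the same idea as manual mutation testing: a good assertion distinguishes the correct implementation from its near-miss mutants.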
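The git-guardrails pre-tool hook could work along these lines: inspect the command an agent is about to run and refuse destructive Git operations. This is a minimal sketch under stated assumptions; the pattern list and the JSON payload shape (`tool_input.command`) are guesses for illustration, not EvanFlow's actual hook:

```python
import json
import re

# Hypothetical deny-list: Git operations that rewrite history or destroy work.
DANGEROUS = [
    r"push\s+(-f|--force)",
    r"reset\s+--hard",
    r"clean\s+-[a-z]*f",
    r"branch\s+-D",
]

def is_dangerous_git(command):
    return any(re.search(p, command) for p in DANGEROUS)

def hook(payload):
    # A pre-tool hook would receive the pending tool call as JSON and
    # signal "block" with a non-zero result; the field names are assumptions.
    cmd = json.loads(payload).get("tool_input", {}).get("command", "")
    return 1 if is_dangerous_git(cmd) else 0  # non-zero = block the call

print(hook('{"tool_input": {"command": "git push --force origin main"}}'))  # 1
```

A deny-list like this is deliberately conservative: it may over-block (e.g., `--force-with-lease`), which matches the excerpt's stance that the agent should stop and wait for user direction rather than guess.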
This excerpt is published under fair use for community discussion. Read the full article at GitHub.