# Install this skill:
npx skills add Anshin-Health-Solutions/superpai --skill "TDD"

Installs a specific skill from a multi-skill repository.

# Description


# SKILL.md


name: TDD
description: >
  Test-Driven Development discipline. Use before writing any implementation code.
  Enforces RED-GREEN-REFACTOR Iron Law. Write test first, watch it fail, implement
  minimum code, verify it passes, then refactor.
triggers:
  - "write tests"
  - "tdd"
  - "test driven"
  - "test first"
  - "red green refactor"
  - "implement feature"
  - "fix bug"
  - "add function"
  - "build feature"
  - "create endpoint"
  - "write code"
  - "implement"
  - "new feature"
  - "bug fix"


Test-Driven Development (TDD)

The Iron Law

NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST

Write code before the test? Delete it. Start over with TDD.

No exceptions:
- Do not keep it as "reference"
- Do not "adapt" it while writing tests
- Do not look at it
- Delete means delete

Implement fresh from tests only.

When to Use

Always:
- New features
- Bug fixes
- Refactoring
- Behavior changes

Exceptions (ask the user first):
- Throwaway prototypes
- Generated boilerplate
- Pure configuration files

Thinking "skip TDD just this once"? Stop. That is rationalization.

RED-GREEN-REFACTOR Cycle

Step 1 - RED: Write One Failing Test

Write the minimal test that describes the desired behavior.

Good test:

test('retries failed operations 3 times', async () => {
  let attempts = 0;
  const operation = () => {
    attempts++;
    if (attempts < 3) throw new Error('fail');
    return 'success';
  };
  const result = await retryOperation(operation);
  expect(result).toBe('success');
  expect(attempts).toBe(3);
});

Bad test (tests mocks, not behavior):

test('retry works', async () => {
  const mock = jest.fn()
    .mockRejectedValueOnce(new Error())
    .mockResolvedValueOnce('success');
  await retryOperation(mock);
  expect(mock).toHaveBeenCalledTimes(2); // Tests the mock, not the code
});

Test requirements:
- One behavior per test
- Clear descriptive name
- Tests real code paths (mocks only when unavoidable)
- "and" in test name means split it into two tests

Step 2 - Verify RED: Watch It Fail (MANDATORY - Never Skip)

npm test path/to/test.test.ts

Confirm all three:
1. Test fails (not errors out)
2. Failure message describes the missing behavior
3. Fails because feature is absent, not because of typos or import errors

Test passes immediately? You are testing existing behavior. Fix the test.
Test errors? Fix the error and re-run until it fails for the right reason.

Skipping this step means you do not know if your test actually tests anything.
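One way to reach a clean RED, sketched against the `retryOperation` example above: start from a stub that exists and compiles but lacks the behavior, so the test fails on the assertion rather than erroring on a missing import.

```typescript
// A stub that exists but has no retry logic: the "retries 3 times" test
// now FAILS for the right reason (the operation's error surfaces) instead
// of ERRORING on a missing export. Sketch only - this is not the fix.
async function retryOperation<T>(fn: () => Promise<T>): Promise<T> {
  return fn(); // no retry yet: a failing operation simply rejects
}
```

From here, re-running the test shows the assertion failure you expect, and Step 3 replaces the stub body with the minimal passing implementation.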

Step 3 - GREEN: Write Minimal Code to Pass

Write the simplest code that makes the test pass. Nothing more.

Good (just enough):

async function retryOperation<T>(fn: () => Promise<T>): Promise<T> {
  for (let i = 0; i < 3; i++) {
    try {
      return await fn();
    } catch (e) {
      if (i === 2) throw e;
    }
  }
  throw new Error('unreachable');
}

Bad (over-engineered before tests demand it):

async function retryOperation<T>(
  fn: () => Promise<T>,
  options?: {
    maxRetries?: number;
    backoff?: 'linear' | 'exponential';
    onRetry?: (attempt: number) => void;
  }
): Promise<T> { ... }

Do not add features, refactor unrelated code, or build "while you are in here."

Step 4 - Verify GREEN: Confirm Tests Pass (MANDATORY)

npm test path/to/test.test.ts

Confirm all three:
1. The new test passes
2. All existing tests still pass
3. Output is clean (no errors, no warnings)

Test still fails? Fix code, not test.
Other tests break? Fix them now before continuing.

Step 5 - Verify RED Still Works: Delete and Restore

This step proves your test actually catches the bug.

Temporarily delete or comment out the implementation you just wrote. Run tests. Confirm the test fails again. Restore the implementation. Confirm tests pass again.

If deleting the code does not cause the test to fail, the test is not testing what you think it is testing.

Step 6 - REFACTOR: Clean Up Under Green

Only after all tests pass:
- Remove duplication
- Improve variable and function names
- Extract helper functions
- Improve structure

Rules during refactor:
- Keep all tests green throughout
- Do not add new behavior
- Run tests after each change
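As a sketch, the minimal `retryOperation` from Step 3 could be refactored under green by naming the magic number and making the rethrow explicit; the contract is unchanged, so the suite stays green.

```typescript
// Refactor under green: same behavior as the minimal Step 3 version,
// with the retry count named and the last error rethrown explicitly.
const MAX_ATTEMPTS = 3;

async function retryOperation<T>(fn: () => Promise<T>): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    try {
      return await fn();
    } catch (e) {
      lastError = e; // keep the most recent failure to rethrow
    }
  }
  throw lastError;
}
```

Run the tests after the rename and again after the restructure; any red means the refactor changed behavior and must be undone.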

Repeat

Write the next failing test for the next piece of behavior.

Anti-Rationalization Table

| Excuse | Reality |
| --- | --- |
| "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
| "I will write tests after" | Tests written after implementation pass immediately. Passing immediately proves nothing. |
| "Tests after achieve the same goals" | Tests-after answer "What does this do?" Tests-first answer "What should this do?" |
| "I already manually tested all edge cases" | Manual testing is ad-hoc. No record, cannot re-run, easy to forget under pressure. |
| "Deleting X hours of work is wasteful" | Sunk cost fallacy. Keeping unverified code is the actual waste. |
| "I will keep it as reference and write tests first" | You will adapt it. That is tests-after. Delete means delete. |
| "I need to explore the design first" | Fine. Throw away the exploration. Start fresh with TDD. |
| "Tests are hard to write for this" | Listen to that signal. Hard to test means hard to use. Simplify the design. |
| "TDD will slow me down" | TDD is faster than debugging production failures. Pragmatic means test-first. |
| "This is different because..." | It is not different. Delete. Start over. |
| "Existing code has no tests" | You are adding to it now. Add tests for what you touch. |

Testing Anti-Patterns to Avoid

Never Test Mock Behavior

// BAD: Testing that the mock was called, not that code behaves correctly
test('saves user', () => {
  const mockDb = { save: jest.fn() };
  saveUser(mockDb, userData);
  expect(mockDb.save).toHaveBeenCalledTimes(1); // Tests mock, not behavior
});

// GOOD: Test the actual outcome
test('saves user', async () => {
  const result = await saveUser(realDbClient, userData);
  expect(result.id).toBeDefined();
  expect(result.email).toBe(userData.email);
});

Gate: Before asserting on any mock, ask "Am I testing real behavior or mock existence?" If mock existence, delete the assertion.

Never Add Test-Only Methods to Production Code

// BAD: destroy() only exists for test cleanup
class Session {
  async destroy() { ... } // Pollutes production class
}

// GOOD: Keep cleanup in test utilities
// test-utils/cleanup.ts
export async function cleanupSession(session: Session) { ... }

Gate: Before adding any method to a production class, ask "Is this only called in tests?" If yes, put it in test utilities instead.

Never Mock Without Understanding Dependencies

// BAD: Mock breaks the test's own logic
test('detects duplicate', async () => {
  vi.mock('ConfigStore'); // Mocked the thing that writes config
  await addEntry(config);
  await addEntry(config); // Should fail - but config was never written!
});

// GOOD: Understand what the test needs, mock only external/slow parts
test('detects duplicate', async () => {
  vi.mock('NetworkClient'); // Mock the slow part only
  await addEntry(config);   // Config written correctly
  await addEntry(config);   // Duplicate detected correctly
});

Gate: Before mocking, ask what side effects the real method has and whether this test depends on any of them.

Never Create Incomplete Mocks

// BAD: Only mocked fields you thought you needed
const mockResponse = { status: 'success', data: { id: '123' } };
// Breaks when downstream code accesses response.metadata.requestId

// GOOD: Mirror the complete real API structure
const mockResponse = {
  status: 'success',
  data: { id: '123', name: 'Alice' },
  metadata: { requestId: 'req-789', timestamp: 1234567890 }
};

Red Flags - Stop and Start Over

Any of these means delete the code and restart with TDD:

  • You wrote production code before a test
  • Test was written after implementation
  • Test passed immediately on first run
  • You cannot explain why the test failed
  • Tests are planned for "later"
  • You are rationalizing "just this once"
  • "I already manually tested it thoroughly"
  • "Deleting is wasteful after all this work"
  • "TDD is dogmatic, I am being pragmatic"
  • "This situation is different because..."
  • Mock setup is longer than the test logic
  • You are asserting on test IDs containing "-mock"

Example: Bug Fix Workflow

Bug report: Empty email is accepted by the form.

Step 1 - Write failing test:

test('rejects empty email', async () => {
  const result = await submitForm({ email: '' });
  expect(result.error).toBe('Email required');
});

Step 2 - Run and watch it fail:

$ npm test
FAIL: expected 'Email required', received undefined

Good. The test fails for the right reason - the validation is missing.

Step 3 - Write minimal fix:

function submitForm(data: FormData) {
  if (!data.email?.trim()) {
    return { error: 'Email required' };
  }
  // rest of form handling
}

Step 4 - Run and watch it pass:

$ npm test
PASS
All tests: 47 passed

Step 5 - Delete implementation and verify test catches it:
Comment out the validation. Run tests. Confirm FAIL. Restore. Confirm PASS.

Step 6 - Refactor if needed.
If validation logic should be extracted for reuse with other fields, extract it now.
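One possible extraction, sketched below; the `requireField` helper name is illustrative, not from the original codebase, and behavior is unchanged.

```typescript
// Hypothetical refactor: pull the empty-check into a reusable helper so
// other required fields get the same validation. Tests stay green.
function requireField(value: string | undefined, label: string): string | null {
  return value?.trim() ? null : `${label} required`;
}

function submitForm(data: { email?: string }): { error?: string; ok?: boolean } {
  const error = requireField(data.email, 'Email');
  if (error) return { error };
  // rest of form handling
  return { ok: true };
}
```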

Verification Checklist

Before marking any work complete:

  • [ ] Every new function or method has at least one test
  • [ ] Watched each test fail before writing implementation
  • [ ] Each failure was for the expected reason (missing feature, not syntax error)
  • [ ] Deleted and restored implementation to confirm test catches the bug
  • [ ] Wrote minimal code to pass each test (no YAGNI violations)
  • [ ] All tests pass including pre-existing suite
  • [ ] Output is clean with no errors or warnings
  • [ ] Tests exercise real code paths (mocks only where genuinely unavoidable)
  • [ ] Edge cases and error paths covered

Cannot check all boxes? You skipped TDD. Delete code. Start over.

When Stuck

| Problem | Solution |
| --- | --- |
| Do not know how to write the test | Write the wished-for API. Start with the assertion. Ask the user. |
| Test is extremely complicated | Design is too complicated. Simplify the interface first. |
| Everything requires mocking | Code is too coupled. Introduce dependency injection. |
| Test setup is massive | Extract setup helpers. If still huge, the design needs simplifying. |
| Bug is in UI interaction | Write the test against the behavior the UI produces, not the clicks. |
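For the "everything requires mocking" case, dependency injection is often the fix. A minimal sketch with an assumed `Clock` dependency (the names here are illustrative, not from this document):

```typescript
// Inject the collaborator instead of reaching for module mocks: tests pass
// a fake Clock, production passes the real one. No vi.mock/jest.mock needed.
interface Clock {
  now(): number;
}

function makeThrottle(clock: Clock, intervalMs: number) {
  let last = -Infinity;
  return {
    // Returns true at most once per intervalMs, judged by the injected clock
    tryRun(): boolean {
      const t = clock.now();
      if (t - last < intervalMs) return false;
      last = t;
      return true;
    },
  };
}
```

In a test, `{ now: () => t }` with a mutable `t` exercises the timing logic deterministically; production code passes `{ now: () => Date.now() }`.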

Integration with Debugging

Bug found in production? Never fix it without a test:
1. Write a test that reproduces the bug
2. Confirm the test fails
3. Fix the code
4. Confirm the test passes
5. Confirm the test fails again when you delete the fix

The test is now the regression guard. This bug cannot silently return.

Final Rule

Production code exists -> a test was written first and watched to fail
Otherwise -> it is not TDD

No exceptions without explicit permission from the user.

# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents.

Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.