alinaqi

code-deduplication

457
37
# Install this skill:
npx skills add alinaqi/claude-bootstrap --skill "code-deduplication"

Install specific skill from multi-skill repository

# Description

Prevent semantic code duplication with capability index and check-before-write

# SKILL.md


name: code-deduplication
description: Prevent semantic code duplication with capability index and check-before-write


Code Deduplication Skill

Load with: base.md

Purpose: Prevent semantic duplication and code bloat. Maintain a capability index so Claude always knows what exists before writing something new.


Core Philosophy

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  CHECK BEFORE YOU WRITE                                         β”‚
β”‚  ─────────────────────────────────────────────────────────────  β”‚
β”‚  AI doesn't copy/paste - it reimplements.                       β”‚
β”‚  The problem isn't duplicate code, it's duplicate PURPOSE.      β”‚
β”‚                                                                 β”‚
β”‚  Before writing ANY new function:                               β”‚
β”‚  1. Check CODE_INDEX.md for existing capabilities               β”‚
β”‚  2. Search codebase for similar functionality                   β”‚
β”‚  3. Extend existing code if possible                            β”‚
β”‚  4. Only create new if nothing suitable exists                  β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  AFTER WRITING: Update the index immediately.                   β”‚
β”‚  PERIODICALLY: Run /audit-duplicates to catch overlap.          β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Code Index Structure

Maintain CODE_INDEX.md in project root, organized by capability not file location:

# Code Index

*Last updated: [timestamp]*
*Run `/update-code-index` to regenerate*

## Quick Reference

| Category | Count | Location |
|----------|-------|----------|
| Date/Time | 5 functions | src/utils/dates.ts |
| Validation | 8 functions | src/utils/validate.ts |
| API Clients | 12 functions | src/api/*.ts |
| Auth | 6 functions | src/auth/*.ts |

---

## Date/Time Operations

| Function | Location | Does What | Params |
|----------|----------|-----------|--------|
| `formatDate()` | utils/dates.ts:15 | Formats Date β†’ "Jan 15, 2024" | `(date: Date, format?: string)` |
| `formatRelative()` | utils/dates.ts:32 | Formats Date β†’ "2 days ago" | `(date: Date)` |
| `parseDate()` | utils/dates.ts:48 | Parses string β†’ Date | `(str: string, format?: string)` |
| `isExpired()` | auth/tokens.ts:22 | Checks if timestamp past now | `(timestamp: number)` |
| `addDays()` | utils/dates.ts:61 | Adds days to date | `(date: Date, days: number)` |

---

## Validation

| Function | Location | Does What | Params |
|----------|----------|-----------|--------|
| `isEmail()` | utils/validate.ts:10 | Validates email format | `(email: string)` |
| `isPhone()` | utils/validate.ts:25 | Validates phone with country | `(phone: string, country?: string)` |
| `isURL()` | utils/validate.ts:42 | Validates URL format | `(url: string)` |
| `isUUID()` | utils/validate.ts:55 | Validates UUID v4 | `(id: string)` |
| `sanitizeHTML()` | utils/sanitize.ts:12 | Strips XSS from input | `(html: string)` |
| `sanitizeSQL()` | utils/sanitize.ts:28 | Escapes SQL special chars | `(input: string)` |

---

## String Operations

| Function | Location | Does What | Params |
|----------|----------|-----------|--------|
| `slugify()` | utils/strings.ts:8 | Converts to URL slug | `(str: string)` |
| `truncate()` | utils/strings.ts:20 | Truncates with ellipsis | `(str: string, len: number)` |
| `capitalize()` | utils/strings.ts:32 | Capitalizes first letter | `(str: string)` |
| `pluralize()` | utils/strings.ts:40 | Adds s/es correctly | `(word: string, count: number)` |

---

## API Clients

| Function | Location | Does What | Returns |
|----------|----------|-----------|---------|
| `fetchUser()` | api/users.ts:15 | GET /users/:id | `Promise<User>` |
| `fetchUsers()` | api/users.ts:28 | GET /users with pagination | `Promise<User[]>` |
| `createUser()` | api/users.ts:45 | POST /users | `Promise<User>` |
| `updateUser()` | api/users.ts:62 | PATCH /users/:id | `Promise<User>` |
| `deleteUser()` | api/users.ts:78 | DELETE /users/:id | `Promise<void>` |

---

## Error Handling

| Function/Class | Location | Does What |
|----------------|----------|-----------|
| `AppError` | utils/errors.ts:5 | Base error class with code |
| `ValidationError` | utils/errors.ts:20 | Input validation failures |
| `NotFoundError` | utils/errors.ts:32 | Resource not found |
| `handleAsync()` | utils/errors.ts:45 | Wraps async route handlers |
| `errorMiddleware()` | middleware/error.ts:10 | Express error handler |

---

## Hooks (React)

| Hook | Location | Does What |
|------|----------|-----------|
| `useAuth()` | hooks/useAuth.ts | Auth state + login/logout |
| `useUser()` | hooks/useUser.ts | Current user data |
| `useDebounce()` | hooks/useDebounce.ts | Debounces value changes |
| `useLocalStorage()` | hooks/useLocalStorage.ts | Persisted state |
| `useFetch()` | hooks/useFetch.ts | Data fetching with loading/error |

---

## Components (React)

| Component | Location | Does What |
|-----------|----------|-----------|
| `Button` | components/Button.tsx | Styled button with variants |
| `Input` | components/Input.tsx | Form input with validation |
| `Modal` | components/Modal.tsx | Dialog overlay |
| `Toast` | components/Toast.tsx | Notification popup |
| `Spinner` | components/Spinner.tsx | Loading indicator |

File Header Format

Every file should have a summary header:

TypeScript/JavaScript

/**
 * @file User authentication utilities
 * @description Handles login, logout, session management, and token refresh.
 *
 * Key exports:
 * - login(email, password) - Authenticates user, returns tokens
 * - logout() - Clears session and tokens
 * - refreshToken() - Gets new access token
 * - validateSession() - Checks if session is valid
 *
 * @see src/api/auth.ts for API endpoints
 * @see src/hooks/useAuth.ts for React hook
 */

import { ... } from '...';

Python

"""
User authentication utilities.

Handles login, logout, session management, and token refresh.

Key exports:
    - login(email, password) - Authenticates user, returns tokens
    - logout() - Clears session and tokens
    - refresh_token() - Gets new access token
    - validate_session() - Checks if session is valid

See Also:
    - src/api/auth.py for API endpoints
    - src/services/user.py for user operations
"""

from typing import ...

Function Documentation

Every function needs a one-line summary:

TypeScript

/**
 * Formats a date into a human-readable relative string.
 * Examples: "2 minutes ago", "yesterday", "3 months ago"
 */
export function formatRelative(date: Date): string {
  // ...
}

/**
 * Validates email format and checks for disposable domains.
 * Returns true for valid non-disposable emails.
 */
export function isValidEmail(email: string): boolean {
  // ...
}

Python

def format_relative(date: datetime) -> str:
    """Formats a date into a human-readable relative string.

    Examples: "2 minutes ago", "yesterday", "3 months ago"
    """
    ...

def is_valid_email(email: str) -> bool:
    """Validates email format and checks for disposable domains.

    Returns True for valid non-disposable emails.
    """
    ...

Check Before Write Process

Before Creating ANY New Function

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  BEFORE WRITING NEW CODE                                        β”‚
β”‚  ─────────────────────────────────────────────────────────────  β”‚
β”‚                                                                 β”‚
β”‚  1. DESCRIBE what you need in plain English                     β”‚
β”‚     "I need to format a date as relative time"                  β”‚
β”‚                                                                 β”‚
β”‚  2. CHECK CODE_INDEX.md                                         β”‚
β”‚     Search for: date, time, format, relative                    β”‚
β”‚     β†’ Found: formatRelative() in utils/dates.ts                 β”‚
β”‚                                                                 β”‚
β”‚  3. EVALUATE if existing code works                             β”‚
β”‚     - Does it do what I need? β†’ Use it                          β”‚
β”‚     - Close but not quite? β†’ Extend it                          β”‚
β”‚     - Nothing suitable? β†’ Create new, update index              β”‚
β”‚                                                                 β”‚
β”‚  4. If extending, check for breaking changes                    β”‚
β”‚     - Add optional params, don't change existing behavior       β”‚
β”‚     - Update tests for new functionality                        β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Decision Tree

Need new functionality
        β”‚
        β–Ό
Check CODE_INDEX.md for similar
        β”‚
        β”œβ”€β–Ί Found exact match ──────► USE IT
        β”‚
        β”œβ”€β–Ί Found similar ──────────► Can it be extended?
        β”‚                                   β”‚
        β”‚                    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
        β”‚                    β–Ό                             β–Ό
        β”‚               Yes: Extend                   No: Create new
        β”‚               (add params)                  (update index)
        β”‚
        └─► Nothing found ──────────► Create new (update index)

Common Duplication Patterns

Pattern 1: Utility Function Reimplementation

❌ Bad: Creating validateEmail() when isEmail() exists

// DON'T: This already exists as isEmail()
function validateEmail(email: string): boolean {
  return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email);
}

βœ… Good: Check index first, use existing

import { isEmail } from '@/utils/validate';

if (isEmail(userInput)) { ... }

Pattern 2: Slightly Different Versions

❌ Bad: Multiple date formatters with slight variations

// In file A
function formatDate(d: Date) { return d.toLocaleDateString(); }

// In file B
function displayDate(d: Date) { return d.toLocaleDateString('en-US'); }

// In file C
function showDate(d: Date) { return d.toLocaleDateString('en-US', { month: 'short' }); }

βœ… Good: One function with options

// utils/dates.ts
function formatDate(d: Date, options?: { locale?: string; format?: 'short' | 'long' }) {
  const locale = options?.locale ?? 'en-US';
  const formatOpts = options?.format === 'short'
    ? { month: 'short', day: 'numeric' }
    : { month: 'long', day: 'numeric', year: 'numeric' };
  return d.toLocaleDateString(locale, formatOpts);
}

Pattern 3: Inline Logic That Should Be Extracted

❌ Bad: Same validation logic scattered across files

// In signup.ts
if (!email || !email.includes('@') || email.length < 5) { ... }

// In profile.ts
if (!email || !email.includes('@') || email.length < 5) { ... }

// In invite.ts
if (!email || !email.includes('@') || email.length < 5) { ... }

βœ… Good: Extract once, import everywhere

// utils/validate.ts
export const isEmail = (email: string) =>
  email && email.includes('@') && email.length >= 5;

// Everywhere else
import { isEmail } from '@/utils/validate';
if (!isEmail(email)) { ... }

Periodic Audit

Run /audit-duplicates periodically to catch semantic overlap:

Audit Checklist

  • [ ] Utility functions: Any functions doing similar things?
  • [ ] API calls: Multiple ways to fetch same data?
  • [ ] Validation: Scattered inline validation logic?
  • [ ] Error handling: Inconsistent error patterns?
  • [ ] Components: Similar UI components that could merge?
  • [ ] Hooks: Custom hooks with overlapping logic?

Audit Output Format

## Duplicate Audit - [DATE]

### πŸ”΄ High Priority (Merge These)

1. **Date formatting** - 3 similar functions found
   - `formatDate()` in utils/dates.ts
   - `displayDate()` in components/Header.tsx
   - `showDate()` in pages/Profile.tsx
   - **Action:** Consolidate into utils/dates.ts

2. **Email validation** - Inline logic in 5 files
   - signup.ts:42
   - profile.ts:28
   - invite.ts:15
   - settings.ts:67
   - admin.ts:33
   - **Action:** Extract to utils/validate.ts

### 🟑 Medium Priority (Consider Merging)

1. **User fetching** - 2 different patterns
   - `fetchUser()` in api/users.ts
   - `getUser()` in services/user.ts
   - **Action:** Decide on one pattern

### 🟒 Low Priority (Monitor)

1. **Button components** - 3 variants exist
   - May be intentional for different use cases
   - **Action:** Document the differences

Vector DB Integration (Optional)

For large codebases (100+ files), add vector search:

Setup with ChromaDB

# scripts/index_codebase.py
import chromadb
from chromadb.utils import embedding_functions

# Initialize
client = chromadb.PersistentClient(path="./.chroma")
ef = embedding_functions.DefaultEmbeddingFunction()
collection = client.get_or_create_collection("code_index", embedding_function=ef)

# Index a function
collection.add(
    documents=["Formats a date into human-readable relative string like '2 days ago'"],
    metadatas=[{"function": "formatRelative", "file": "utils/dates.ts", "line": 32}],
    ids=["formatRelative"]
)

# Search before writing
results = collection.query(
    query_texts=["format date as relative time"],
    n_results=5
)
# Returns: formatRelative in utils/dates.ts - 0.92 similarity

Setup with LanceDB (Lighter)

# scripts/index_codebase.py
import lancedb

db = lancedb.connect("./.lancedb")

# Create table
data = [
    {"function": "formatRelative", "file": "utils/dates.ts", "description": "Formats date as relative time"},
    {"function": "isEmail", "file": "utils/validate.ts", "description": "Validates email format"},
]
table = db.create_table("code_index", data)

# Search
results = table.search("validate email address").limit(5).to_list()

When to Use Vector DB

Codebase Size Recommendation
< 50 files Markdown index only
50-200 files Markdown + periodic audit
200+ files Add vector DB
500+ files Vector DB essential

Claude Instructions

At Session Start

  1. Read CODE_INDEX.md if it exists
  2. Note the categories and key functions available
  3. Keep this context for the session

Before Writing New Code

  1. Pause and check: "Does something like this exist?"
  2. Search CODE_INDEX.md for similar capabilities
  3. If unsure, search the codebase: grep -r "functionName\|similar_term" src/
  4. Only create new if confirmed nothing suitable exists

After Writing New Code

  1. Immediately update CODE_INDEX.md
  2. Add file header if new file
  3. Add function docstring
  4. Commit index update with code

When User Says "Add X functionality"

Before implementing, let me check if we already have something similar...

[Checks CODE_INDEX.md]

Found: `existingFunction()` in utils/file.ts does something similar.
Options:
1. Use existing function as-is
2. Extend it with new capability
3. Create new (if truly different use case)

Which approach would you prefer?

Quick Reference

Update Index Command

/update-code-index

Audit Command

/audit-duplicates

File Header Template

/**
 * @file [Short description]
 * @description [What this file does]
 *
 * Key exports:
 * - function1() - [what it does]
 * - function2() - [what it does]
 */

Function Template

/**
 * [One line description of what it does]
 */
export function name(params): ReturnType {

Index Entry Template

| `functionName()` | path/file.ts:line | Does what in plain English | `(params)` |

# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents:

Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.