williamzujkowski

Database Migration Script Generator

3
0
# Install this skill:
npx skills add williamzujkowski/cognitive-toolworks --skill "Database Migration Script Generator"

Install specific skill from multi-skill repository

# Description

Generate database migration scripts for Liquibase, Flyway, Alembic with rollback safety, data preservation, and zero-downtime patterns

# SKILL.md


name: "Database Migration Script Generator"
slug: "database-migration-generator"
description: "Generate database migration scripts for Liquibase, Flyway, Alembic with rollback safety, data preservation, and zero-downtime patterns"
capabilities:
- Schema migrations (DDL: tables, columns, indexes, constraints)
- Data migrations with backfill and transformation logic
- Zero-downtime deployment patterns (expand/contract, online DDL)
- Rollback script generation with safety validation
- Migration testing and validation scripts
- Database-specific optimization (PostgreSQL, MySQL, SQL Server, Oracle)
inputs:
- migration_tool: "liquibase | flyway | alembic (string)"
- database: "postgresql | mysql | sqlserver | oracle (string)"
- migration_type: "schema | data | hybrid (string)"
- downtime_allowed: "true | false (boolean)"
- migration_description: "Human-readable description of migration intent (string)"
outputs:
- migration_script: "Forward migration code in tool-specific format (string)"
- rollback_script: "Reverse migration code with safety checks (string)"
- validation_tests: "Test queries to validate migration success (array)"
- deployment_guide: "Step-by-step rollout instructions (string)"
keywords:
- database-migration
- liquibase
- flyway
- alembic
- schema-migration
- zero-downtime
- rollback
- ddl
- postgresql
- mysql
version: "1.0.0"
owner: "cognitive-toolworks"
license: "MIT"
security: "Public; no secrets or PII; safe for open repositories"
links:
- https://docs.liquibase.com/
- https://flywaydb.org/documentation/
- https://alembic.sqlalchemy.org/
- https://www.postgresql.org/docs/current/sql-altertable.html
- https://dev.mysql.com/doc/refman/8.0/en/online-ddl.html


Purpose & When-To-Use

Trigger conditions:
- Evolving database schema for application updates
- Performing data migrations with business logic transformations
- Implementing zero-downtime deployments requiring schema changes
- Standardizing migration workflows across teams
- Migrating between database versions or platforms
- Adding indexes, constraints, or partitioning to existing tables

Not for:
- Initial database schema creation (use ORM models or DDL scripts)
- One-off data fixes (use direct SQL with transaction safety)
- Database backups or recovery operations
- Cross-database data replication (use ETL tools)


Pre-Checks

Time normalization:
- Compute NOW_ET using NIST/time.gov semantics (America/New_York, ISO-8601)
- Use NOW_ET for all citation access dates

Input validation:
- migration_tool must be one of: liquibase, flyway, alembic
- database must be one of: postgresql, mysql, sqlserver, oracle
- migration_type must be one of: schema, data, hybrid
- downtime_allowed must be boolean (true/false)
- migration_description must be non-empty and descriptive

Source freshness:
- Liquibase docs must be accessible accessed 2025-10-26T03:51:54-04:00
- Flyway docs must be accessible accessed 2025-10-26T03:51:54-04:00
- Alembic docs must be accessible accessed 2025-10-26T03:51:54-04:00
- PostgreSQL online DDL docs must be accessible accessed 2025-10-26T03:51:54-04:00
- MySQL online DDL docs must be accessible accessed 2025-10-26T03:51:54-04:00


Procedure

T1: Basic Schema Migration (≤2k tokens)

Fast path for simple DDL changes:

  1. Schema Change Identification
  2. Add/drop columns (with default values to avoid table rewrites)
  3. Create/drop tables
  4. Add/drop simple indexes (non-unique, single column)
  5. Add/drop NOT NULL constraints (with validation)

  6. Tool-Specific Migration Format

Liquibase (XML/YAML) accessed 2025-10-26T03:51:54-04:00
xml <changeSet id="add-email-column" author="migration-generator"> <addColumn tableName="users"> <column name="email" type="VARCHAR(255)"/> </addColumn> <rollback> <dropColumn tableName="users" columnName="email"/> </rollback> </changeSet>

Flyway (SQL) accessed 2025-10-26T03:51:54-04:00
sql -- V1__add_email_column.sql ALTER TABLE users ADD COLUMN email VARCHAR(255);

Alembic (Python) accessed 2025-10-26T03:51:54-04:00
```python
def upgrade():
op.add_column('users', sa.Column('email', sa.String(255)))

def downgrade():
op.drop_column('users', 'email')
```

  1. Basic Rollback Script
  2. Generate inverse operation for each forward change
  3. Add rollback validation comments
  4. Include manual rollback steps if auto-rollback unsafe

Decision: If simple schema change without data → STOP at T1; otherwise proceed to T2.


T2: Data Migrations with Rollback (≤6k tokens)

Extended migrations with data transformations:

  1. Data Migration Patterns accessed 2025-10-26T03:51:54-04:00

Backfill Existing Data
python # Alembic: Backfill default values for new column def upgrade(): op.add_column('users', sa.Column('status', sa.String(20), nullable=True)) # Backfill existing rows op.execute("UPDATE users SET status = 'active' WHERE status IS NULL") # Make NOT NULL after backfill op.alter_column('users', 'status', nullable=False)

Data Transformation
python # Alembic: Split full_name into first_name and last_name def upgrade(): op.add_column('users', sa.Column('first_name', sa.String(100))) op.add_column('users', sa.Column('last_name', sa.String(100))) # Transform data connection = op.get_bind() users = connection.execute("SELECT id, full_name FROM users").fetchall() for user_id, full_name in users: parts = full_name.split(' ', 1) first = parts[0] last = parts[1] if len(parts) > 1 else '' connection.execute( "UPDATE users SET first_name = %s, last_name = %s WHERE id = %s", (first, last, user_id) ) op.drop_column('users', 'full_name')

  1. Rollback Safety Patterns
  2. Idempotency: Ensure migrations can run multiple times safely
  3. Checkpointing: Add validation queries before destructive operations
  4. Backup Triggers: Create temporary backup tables for data migrations
  5. Dry-Run Mode: Include commented-out SELECT statements to preview changes

  6. Validation Tests accessed 2025-10-26T03:51:54-04:00
    sql -- Post-migration validation SELECT COUNT(*) FROM users WHERE email IS NULL; -- Should be 0 SELECT COUNT(*) FROM users WHERE status NOT IN ('active', 'inactive'); -- Should be 0

  7. Database-Specific Considerations

PostgreSQL accessed 2025-10-26T03:51:54-04:00
- Use ALTER TABLE ... SET NOT NULL with CHECK constraint first
- Prefer CONCURRENTLY for index creation (zero-downtime)
- Use pg_stat_progress_create_index to monitor long operations

MySQL accessed 2025-10-26T03:51:54-04:00
- Check online DDL support: ALGORITHM=INPLACE, LOCK=NONE
- Avoid ALTER TABLE that requires table copy (pre-8.0)
- Use pt-online-schema-change for large tables (Percona Toolkit)

SQL Server
- Use WITH (ONLINE = ON) for index operations
- Leverage SSMS execution plan analysis
- Consider SCHEMA_ONLY copies for large data migrations


T3: Zero-Downtime Patterns (≤12k tokens)

Advanced patterns for production systems:

  1. Expand/Contract Pattern accessed 2025-10-26T03:51:54-04:00

Phase 1: Expand (Add new schema)
python # Migration 001: Add new column, keep old column def upgrade(): op.add_column('users', sa.Column('email_new', sa.String(255))) # Trigger to sync old → new during transition op.execute(""" CREATE TRIGGER sync_email_new BEFORE UPDATE ON users FOR EACH ROW BEGIN SET NEW.email_new = NEW.email; END; """)

Phase 2: Migrate Data
python # Migration 002: Backfill new column def upgrade(): op.execute("UPDATE users SET email_new = email WHERE email_new IS NULL")

Phase 3: Contract (Remove old schema)
python # Migration 003: Drop old column (after application updated) def upgrade(): op.execute("DROP TRIGGER IF EXISTS sync_email_new") op.drop_column('users', 'email') op.alter_column('users', 'email_new', new_column_name='email')

  1. Online Index Creation accessed 2025-10-26T03:51:54-04:00

PostgreSQL CONCURRENTLY
```sql
-- Flyway: V5__add_email_index.sql
CREATE INDEX CONCURRENTLY idx_users_email ON users(email);

-- Validation
SELECT schemaname, tablename, indexname, indexdef
FROM pg_indexes
WHERE indexname = 'idx_users_email';
```

MySQL Online DDL
sql -- Flyway: V6__add_composite_index.sql ALTER TABLE users ADD INDEX idx_email_status (email, status) ALGORITHM=INPLACE, LOCK=NONE;

  1. Shadow Table Pattern (for complex transformations)
    python # Migration 010: Create shadow table with new schema def upgrade(): op.create_table( 'users_new', sa.Column('id', sa.Integer, primary_key=True), sa.Column('email', sa.String(255), nullable=False, index=True), sa.Column('status', sa.String(20), nullable=False) ) # Stream data from old → new table op.execute(""" INSERT INTO users_new (id, email, status) SELECT id, email, COALESCE(status, 'active') FROM users """) # Atomic rename (downtime: milliseconds) op.rename_table('users', 'users_old') op.rename_table('users_new', 'users')

  2. Deployment Guide Template
    ```markdown
    ## Deployment Steps

### Pre-Migration
1. Verify database backup completed (last 24h)
2. Check application connection pool settings (timeout ≥ 30s)
3. Review query performance baseline (pg_stat_statements)

### Migration Execution
1. Run migration in transaction (if supported)
2. Monitor lock waits: SELECT * FROM pg_locks WHERE NOT granted
3. Validate row counts: SELECT COUNT(*) FROM users

### Post-Migration
1. Run ANALYZE to update statistics
2. Verify application logs (no constraint violations)
3. Monitor query performance (compare to baseline)

### Rollback Procedure (if needed)
1. Stop application traffic (or use feature flag)
2. Run rollback script: flyway undo or alembic downgrade -1
3. Verify data integrity: SELECT * FROM users LIMIT 10
4. Restore from backup if rollback fails
```

  1. Blue-Green Database Migrations accessed 2025-10-26T03:51:54-04:00
  2. Duplicate database instance (Blue = old schema, Green = new schema)
  3. Run migrations on Green instance
  4. Dual-write pattern during transition (application writes to both)
  5. Cutover: Update connection string to Green
  6. Validation period: Keep Blue online for 24-48h

Decision Rules

Migration Tool Selection:
- Liquibase: Best for multi-database support, XML/YAML declarative changes, complex rollback
- Flyway: Best for SQL-first teams, simple versioning, Java/Spring ecosystems
- Alembic: Best for Python applications using SQLAlchemy, programmatic migrations

Migration Strategy by Downtime Allowance:
- downtime_allowed = true: Use direct ALTER TABLE, faster execution, simpler scripts
- downtime_allowed = false: Use expand/contract, CONCURRENTLY, shadow tables, longer timeline

Database-Specific Patterns:
- PostgreSQL: Prefer CONCURRENTLY for indexes, use CHECK constraints before NOT NULL
- MySQL: Validate ALGORITHM=INPLACE support, use pt-online-schema-change for InnoDB
- SQL Server: Use ONLINE=ON, consider columnstore indexes for analytics workloads
- Oracle: Use Oracle Data Redefinition (DBMS_REDEFINITION) for zero-downtime

Abort Conditions:
- Invalid tool/database combination → error "Tool X does not support database Y"
- Destructive operation without rollback → error "Cannot generate safe rollback for DROP TABLE"
- Zero-downtime requested for non-supported operation → error "Zero-downtime not possible for operation X"

Data Preservation Checks:
- Dropping columns: Warn if column contains non-NULL data
- Changing types: Validate data fits in new type (VARCHAR(50) → VARCHAR(20))
- Adding NOT NULL: Require default value or backfill strategy


Output Contract

Schema (JSON):

{
  "migration_tool": "liquibase | flyway | alembic",
  "database": "postgresql | mysql | sqlserver | oracle",
  "migration_type": "schema | data | hybrid",
  "downtime_allowed": "boolean",
  "migration_script": {
    "filename": "string (e.g., V5__add_email_column.sql)",
    "content": "string (tool-specific migration code)"
  },
  "rollback_script": {
    "filename": "string (e.g., U5__undo_email_column.sql)",
    "content": "string (inverse migration code)",
    "manual_steps": ["string (if auto-rollback unsafe)"]
  },
  "validation_tests": [
    {
      "description": "string",
      "query": "string (SQL validation query)",
      "expected_result": "string"
    }
  ],
  "deployment_guide": {
    "pre_migration_steps": ["string"],
    "execution_steps": ["string"],
    "post_migration_steps": ["string"],
    "rollback_procedure": ["string"],
    "estimated_duration": "string (e.g., '5 minutes', '2 hours')"
  },
  "warnings": ["string (potential issues or breaking changes)"],
  "timestamp": "ISO-8601 string (NOW_ET)"
}

Required Fields:
- migration_tool, database, migration_type, downtime_allowed, migration_script, rollback_script, validation_tests, deployment_guide, timestamp

Safety Guarantees:
- All DDL changes must have explicit rollback (or manual rollback steps)
- Data migrations must include row count validation
- Zero-downtime migrations must specify lock duration estimates


Examples

Example 1: Simple Column Addition (Alembic + PostgreSQL)

"""Add email column to users table with NOT NULL constraint

Revision ID: a1b2c3d4e5f6
Revises: previous_revision
Create Date: 2025-10-26 03:51:54.000000

"""
from alembic import op
import sqlalchemy as sa

def upgrade():
    # Add column as nullable first
    op.add_column('users', sa.Column('email', sa.String(255), nullable=True))

    # Backfill with placeholder (application will update)
    op.execute("UPDATE users SET email = CONCAT('user', id, '@example.com') WHERE email IS NULL")

    # Add NOT NULL constraint
    op.alter_column('users', 'email', nullable=False)

    # Add index for performance
    op.create_index('idx_users_email', 'users', ['email'], unique=True)

def downgrade():
    op.drop_index('idx_users_email', table_name='users')
    op.drop_column('users', 'email')

Quality Gates

Token Budgets:
- T1: ≤2k tokens (simple schema change, basic rollback)
- T2: ≤6k tokens (data migration, validation tests, database-specific optimizations)
- T3: ≤12k tokens (zero-downtime patterns, deployment guide, multi-phase migrations)

Safety:
- No plaintext credentials in migration scripts (use environment variables)
- All destructive operations require explicit confirmation comments
- Rollback scripts tested against sample data

Auditability:
- Migration IDs/versions follow tool conventions (Flyway: V1__description.sql, Alembic: revision IDs)
- All migrations include author, timestamp, and description
- Database-specific syntax validated against official documentation

Determinism:
- Same inputs → identical migration scripts
- Idempotent migrations (can run multiple times safely)
- Predictable rollback behavior

Performance:
- Estimate lock duration for DDL operations
- Include EXPLAIN ANALYZE for data migrations affecting >10k rows
- Recommend batch size for large table transformations (e.g., 1000 rows/batch)


Resources

Official Documentation (accessed 2025-10-26T03:51:54-04:00):
1. Liquibase Change Types - DDL/DML operations
2. Flyway SQL Migrations - Versioned migrations
3. Alembic Operations Reference - Python migration API
4. PostgreSQL ALTER TABLE - DDL syntax
5. MySQL Online DDL - Zero-downtime operations
6. SQL Server Online Index Operations - Online DDL

Migration Patterns:
- Expand/Contract Pattern - Zero-downtime schema evolution
- Blue-Green Deployments - Database migration strategies
- Database Refactoring - Catalog of database refactoring patterns

Best Practices:
- Flyway Best Practices - Migration versioning and naming
- Alembic Tutorial - Auto-generate vs manual migrations
- PostgreSQL Wiki: Don't Do This - Anti-patterns to avoid

Tool Comparisons:
- Liquibase vs Flyway - Feature comparison
- Schema Migration Tools Comparison - Multi-tool benchmarks

# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents:

Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.