
# Database Schema Designer

# Install this skill:
npx skills add williamzujkowski/cognitive-toolworks --skill "Database Schema Designer"

This command installs a single skill from a multi-skill repository.

# Description

Design normalized database schemas with ERDs, migration plans, and indexing strategies for relational and document databases

# SKILL.md


name: Database Schema Designer
slug: database-schema-designer
description: Design normalized database schemas with ERDs, migration plans, and indexing strategies for relational and document databases
capabilities:
- Entity-relationship modeling and normalization (1NF-3NF, BCNF)
- Schema migration planning with rollback strategies
- Index design and query optimization patterns
- Constraint modeling (FK, unique, check, triggers)
- Multi-database support (PostgreSQL, MySQL, MongoDB, DynamoDB)
inputs:
- business_requirements: Domain entities, relationships, and access patterns
- existing_schema: Optional current schema for evolution/refactoring
- database_type: Target database system (postgres, mysql, mongo, etc.)
- scale_requirements: Expected data volume and query patterns
outputs:
- erd_diagram: Mermaid ER diagram with entities and relationships
- ddl_scripts: CREATE TABLE statements with constraints and indexes
- migration_plan: Ordered migration steps with rollback procedures
- optimization_notes: Indexing strategy and query pattern recommendations
keywords:
- database
- schema
- ERD
- normalization
- migration
- SQL
- DDL
- indexes
- constraints
- data-modeling
version: 1.0.0
owner: cloud.gov OCS
license: Apache-2.0
security: Safe for design work; no credentials or production data
links:
- https://www.postgresql.org/docs/current/ddl.html
- https://dev.mysql.com/doc/refman/8.0/en/data-types.html
- https://docs.mongodb.com/manual/data-modeling/


Purpose & When-To-Use

Use this skill when you need to:
- Design a new database schema from business requirements
- Refactor an existing schema for better normalization or performance
- Plan a migration strategy between schema versions
- Optimize database structure for specific access patterns
- Choose appropriate data types, indexes, and constraints

Skip this skill if you only need query optimization (use database-optimization-analyzer) or simple CRUD operations.

Pre-Checks

  1. Verify time context: Compute NOW_ET = 2025-10-26T12:00:00-04:00 (accessed via NIST/time.gov semantics)
  2. Validate inputs:
     - Business requirements describe entities, relationships, and cardinality
     - Database type is specified (default: PostgreSQL)
     - Scale requirements include estimated row counts and query frequency
  3. Check that database documentation is current (accessed NOW_ET):
     - PostgreSQL 16+ docs (https://www.postgresql.org/docs/current/, accessed 2025-10-26)
     - MySQL 8.0+ docs (https://dev.mysql.com/doc/, accessed 2025-10-26)
     - MongoDB 7.0+ docs (https://docs.mongodb.com/, accessed 2025-10-26)

Procedure

T1: Fast Path (≤2k tokens) - Simple Schema Design

For straightforward domains with 3-8 entities and clear relationships:

  1. Entity extraction (100 tokens):
     - Identify nouns from requirements → entities
     - List attributes with data types
     - Note primary keys (natural vs surrogate)
  2. Relationship modeling (200 tokens):
     - Identify entity relationships (1:1, 1:N, N:M)
     - Resolve N:M with junction tables
     - Add foreign key constraints
  3. Quick normalization (150 tokens; see the sketch after this list):
     - Check 1NF: atomic values, no repeating groups
     - Check 2NF: no partial dependencies
     - Check 3NF: no transitive dependencies
  4. Basic indexes (100 tokens):
     - Primary key indexes (automatic)
     - Foreign key indexes (recommended)
     - Common query column indexes
  5. Output (≤1.5k tokens):
     - Mermaid ERD
     - DDL CREATE TABLE statements
     - Basic index creation statements
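
For illustration only, a minimal sketch of the normalization check in step 3 on a hypothetical customers/invoices domain (invented table and column names, PostgreSQL syntax): a flat table carrying a transitive dependency is split into two 3NF tables.

-- Before (commented out): customer_city depends on customer_id, not on the invoice key,
-- so the flat table violates 3NF.
-- CREATE TABLE customer_invoices (
--   invoice_id SERIAL PRIMARY KEY,
--   customer_id INTEGER NOT NULL,
--   customer_email VARCHAR(255) NOT NULL,
--   customer_city VARCHAR(100),
--   amount DECIMAL(10,2) NOT NULL
-- );
-- After: customer attributes move to their own table; invoices reference them by key.
CREATE TABLE customers (
  customer_id SERIAL PRIMARY KEY,
  email VARCHAR(255) UNIQUE NOT NULL,
  city VARCHAR(100)
);
CREATE TABLE invoices (
  invoice_id SERIAL PRIMARY KEY,
  customer_id INTEGER NOT NULL REFERENCES customers(customer_id) ON DELETE RESTRICT,
  amount DECIMAL(10,2) NOT NULL CHECK (amount >= 0)
);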

T2: Standard Path (≤6k tokens) - Production-Ready Schema

For complex domains requiring optimization and migration planning:

  1. All T1 steps (550 tokens)
  2. Advanced normalization (300 tokens):
     - Evaluate BCNF for complex functional dependencies
     - Consider controlled denormalization for read performance
     - Document normalization decisions and trade-offs
  3. Constraint design (400 tokens; see the sketch after this list):
     - Check constraints for data validation
     - Unique constraints for business rules
     - Triggers for complex integrity rules
     - Cascade rules for foreign keys (ON DELETE, ON UPDATE)
  4. Index optimization (500 tokens):
     - Composite indexes for multi-column queries
     - Covering indexes for SELECT performance
     - Partial indexes for filtered queries
     - Full-text search indexes if applicable
  5. Migration planning (800 tokens):
     - Version N → N+1 migration steps
     - Data backfill scripts for new columns
     - Rollback procedures for each step
     - Zero-downtime migration strategy (if required)
  6. Documentation (≤3.5k tokens):
     - ERD with cardinality notation
     - Complete DDL with comments
     - Migration plan with ordering
     - Index justification and query patterns
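
The sketch below illustrates steps 3 and 4 with hedged PostgreSQL DDL on a hypothetical subscriptions table (invented names, reusing the hypothetical customers table from the T1 sketch): a check constraint, a unique constraint encoding a business rule, explicit cascade rules, and a composite partial index.

CREATE TABLE subscriptions (
  subscription_id BIGSERIAL PRIMARY KEY,
  customer_id INTEGER NOT NULL REFERENCES customers(customer_id) ON DELETE CASCADE ON UPDATE CASCADE,
  plan VARCHAR(50) NOT NULL,
  status VARCHAR(20) NOT NULL CHECK (status IN ('active', 'paused', 'cancelled')),  -- check constraint for validation
  starts_on DATE NOT NULL,
  ends_on DATE,
  CHECK (ends_on IS NULL OR ends_on > starts_on),
  UNIQUE (customer_id, plan, starts_on)  -- unique constraint encoding a business rule
);
-- Composite partial index for the common "active subscriptions for a customer" query.
CREATE INDEX idx_subscriptions_customer_active
  ON subscriptions (customer_id, starts_on DESC)
  WHERE status = 'active';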

T3: Deep Dive (≤12k tokens) - Enterprise Schema with Partitioning

For large-scale systems requiring partitioning, sharding, or cross-database design:

  1. All T2 steps (≤6k tokens)
  2. Scalability design (1.5k tokens; see the partitioning sketch after this list):
     - Table partitioning strategy (range, list, hash)
     - Sharding key selection and distribution
     - Archive table design for historical data
     - Read replicas and query routing
  3. Performance analysis (1k tokens):
     - Query pattern analysis and index coverage
     - EXPLAIN plan review for common queries
     - Cardinality estimation and statistics
     - Partition pruning verification
  4. Data lifecycle (800 tokens):
     - Retention policies and TTL implementation
     - Archive and purge procedures
     - GDPR/compliance considerations (anonymization, deletion)
  5. Comprehensive documentation (≤2.7k tokens):
     - Full ERD with physical and logical views
     - DDL with partitioning and sharding
     - Complete migration plan with testing steps
     - Performance baseline and monitoring queries
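
A minimal range-partitioning sketch for step 2, assuming PostgreSQL declarative partitioning and a hypothetical events table (invented names; monthly partitions are one possible layout, not a prescription):

CREATE TABLE events (
  event_id BIGINT GENERATED ALWAYS AS IDENTITY,
  occurred_at TIMESTAMPTZ NOT NULL,
  payload JSONB,
  PRIMARY KEY (event_id, occurred_at)  -- the partition key must be part of the primary key
) PARTITION BY RANGE (occurred_at);

CREATE TABLE events_2025_10 PARTITION OF events
  FOR VALUES FROM ('2025-10-01') TO ('2025-11-01');
CREATE TABLE events_2025_11 PARTITION OF events
  FOR VALUES FROM ('2025-11-01') TO ('2025-12-01');

-- An index created on the parent is created on each partition automatically.
CREATE INDEX idx_events_occurred_at ON events (occurred_at);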

Decision Rules

When to escalate complexity tier:
- T1 → T2: More than 10 entities, OR migration from existing schema, OR explicit performance requirements
- T2 → T3: More than 50 tables, OR partitioning needed, OR multi-region deployment, OR >100M rows expected

When to recommend denormalization:
- Read:write ratio > 100:1 AND query joins >3 tables
- Real-time analytics dashboards requiring <100ms response
- Document explicitly: "Controlled denormalization for performance: [justification]"
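
One common way to apply controlled denormalization under these conditions is a precomputed read model. The sketch below is a PostgreSQL materialized view over the orders and order_items tables from the Examples section (invented view name); the normalized tables remain the source of truth.

CREATE MATERIALIZED VIEW order_totals_daily AS
SELECT
  date_trunc('day', o.created_at)       AS order_day,
  COUNT(*)                              AS order_count,
  SUM(oi.quantity * oi.price_snapshot)  AS revenue
FROM orders o
JOIN order_items oi ON oi.order_id = o.order_id
GROUP BY 1;

-- A unique index is required for REFRESH ... CONCURRENTLY.
CREATE UNIQUE INDEX idx_order_totals_daily_day ON order_totals_daily (order_day);
REFRESH MATERIALIZED VIEW CONCURRENTLY order_totals_daily;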

When to abort:
- Requirements lack entity definitions or relationships
- Database type unsupported (emit TODO: "Add support for [database]")
- Conflicting constraints detected (e.g., circular foreign keys)
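
For the circular foreign key case, a hedged illustration (hypothetical teams/players tables): the DDL is valid, but no row can be inserted into either table while both constraints are immediate, which is the signal to flag the conflict and abort rather than design around it.

CREATE TABLE teams   (team_id SERIAL PRIMARY KEY, captain_id INTEGER NOT NULL);
CREATE TABLE players (player_id SERIAL PRIMARY KEY, team_id INTEGER NOT NULL);
-- Adding both constraints creates a circular dependency between the tables.
ALTER TABLE teams   ADD CONSTRAINT fk_teams_captain FOREIGN KEY (captain_id) REFERENCES players (player_id);
ALTER TABLE players ADD CONSTRAINT fk_players_team  FOREIGN KEY (team_id)    REFERENCES teams (team_id);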

Output Contract

Required fields:

erd_diagram: string  # Mermaid ER diagram syntax
ddl_scripts: string  # Complete DDL (CREATE TABLE, indexes, constraints)
migration_plan: array  # Ordered steps with up/down scripts
optimization_notes: string  # Index strategy and query patterns

Optional fields:

normalization_analysis: string  # 1NF-3NF evaluation
partitioning_strategy: string  # If T3 used
test_data_generator: string  # Sample INSERT statements

Format: JSON or YAML document, optionally with embedded SQL code blocks

Examples

-- E-commerce schema: User, Product, Order, OrderItem (T1 example)
CREATE TABLE users (
  user_id SERIAL PRIMARY KEY,
  email VARCHAR(255) UNIQUE NOT NULL,
  created_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE TABLE products (
  product_id SERIAL PRIMARY KEY,
  name VARCHAR(255) NOT NULL,
  price DECIMAL(10,2) NOT NULL CHECK (price >= 0),
  stock INTEGER DEFAULT 0 CHECK (stock >= 0)
);
CREATE TABLE orders (
  order_id SERIAL PRIMARY KEY,
  user_id INTEGER NOT NULL REFERENCES users(user_id) ON DELETE RESTRICT,
  total DECIMAL(10,2) NOT NULL,
  created_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE TABLE order_items (
  order_id INTEGER REFERENCES orders(order_id) ON DELETE CASCADE,
  product_id INTEGER REFERENCES products(product_id) ON DELETE RESTRICT,
  quantity INTEGER NOT NULL CHECK (quantity > 0),
  price_snapshot DECIMAL(10,2) NOT NULL,
  PRIMARY KEY (order_id, product_id)
);
CREATE INDEX idx_orders_user ON orders(user_id, created_at DESC);
CREATE INDEX idx_products_price ON products(price) WHERE stock > 0;
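
A companion migration sketch for the schema above (a hypothetical v2 change adding an order status column; illustrative up/down scripts, not generated output):

-- Up:
ALTER TABLE orders ADD COLUMN status VARCHAR(20) NOT NULL DEFAULT 'pending';
UPDATE orders SET status = 'completed' WHERE created_at < NOW();  -- illustrative backfill only
CREATE INDEX idx_orders_status ON orders (status, created_at DESC);
-- Down (rollback):
DROP INDEX idx_orders_status;
ALTER TABLE orders DROP COLUMN status;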

Quality Gates

Token budgets:
- T1: ≤2k tokens (simple 3-8 entity schemas)
- T2: ≤6k tokens (production schemas with migrations)
- T3: ≤12k tokens (enterprise with partitioning/sharding)

Safety checks:
- [ ] No hardcoded credentials or sensitive data in examples
- [ ] All foreign keys have ON DELETE/ON UPDATE clauses specified
- [ ] Indexes justified by query patterns (not speculative)
- [ ] Migration steps are reversible (rollback provided)

Validation:
- [ ] DDL is syntactically valid for target database
- [ ] ERD entities match DDL tables 1:1
- [ ] All 3NF violations documented with justification
- [ ] Examples use sample/synthetic data only

Determinism:
- Use SERIAL/BIGSERIAL for PostgreSQL auto-increment
- Use AUTO_INCREMENT for MySQL
- Document any database-specific features used
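
A small illustration of the auto-increment rules above, showing the same hypothetical tags table per target database:

-- PostgreSQL:
CREATE TABLE tags (tag_id BIGSERIAL PRIMARY KEY, label VARCHAR(100) UNIQUE NOT NULL);
-- MySQL 8.0:
-- CREATE TABLE tags (tag_id BIGINT AUTO_INCREMENT PRIMARY KEY, label VARCHAR(100) UNIQUE NOT NULL) ENGINE=InnoDB;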

Resources

Official Documentation (accessed 2025-10-26):
- PostgreSQL DDL: https://www.postgresql.org/docs/current/ddl.html
- PostgreSQL Indexes: https://www.postgresql.org/docs/current/indexes.html
- MySQL Data Types: https://dev.mysql.com/doc/refman/8.0/en/data-types.html
- MongoDB Data Modeling: https://docs.mongodb.com/manual/data-modeling/

Best Practices (accessed 2025-10-26):
- Database Normalization Guide: https://www.sqlshack.com/database-normalization-process/
- Index Design Patterns: https://use-the-index-luke.com/

Tools:
- ERD visualization: Mermaid (https://mermaid.js.org/syntax/entityRelationshipDiagram.html)
- Schema diff tools: migra, sqldiff, liquibase
- Migration frameworks: Flyway, Alembic, Liquibase

Related Skills:
- database-migration-generator - Generate migration scripts from schema changes
- database-optimization-analyzer - Analyze and optimize existing schemas
- data-pipeline-designer - Design ETL pipelines for data movement

# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents.

Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.