Implement data quality validation with Great Expectations, dbt tests, and data contracts. Use when building data quality pipelines, implementing validation rules, or establishing data contracts.
Exploratory data analysis using ydata-profiling. Use when users upload .csv/.xlsx/.json/.parquet files or request "explore data", "analyze dataset", "EDA", "profile data". Generates interactive...
Exploratory data analysis using ydata-profiling. Use when users upload .csv/.xlsx/.json/.parquet files or request "explore data", "analyze dataset", "EDA", "profile data". Generates interactive...
Expert in data pipelines, ETL processes, and data infrastructure
Snowflake and DBT data engineering workflows covering warehouse sizing, query result caching optimization, STAR schema design, and DBT data quality testing patterns. Use when working with...
Expert in high-performance CSV processing, parsing, and data cleaning using Python, DuckDB, and command-line tools. Use when working with CSV files, cleaning data, transforming datasets, or...
Implement analytics, data analysis, and visualization best practices using Python, Jupyter, and modern data tools.
Automatically discover data pipeline and ETL skills when working with ETL. Activates for data development tasks.
Expert in business intelligence, SQL, data visualization, and translating data into actionable business insights.
Expert in getting reliable, typed outputs from LLMs. Covers JSON mode, function calling, Instructor library, Outlines for constrained generation, Pydantic validation, and response format...
Discover patterns, distributions, and relationships in data through visualization, summary statistics, and hypothesis generation for exploratory data analysis, data profiling, and initial insights
Strategic test data generation, management, and privacy compliance. Use when creating test data, handling PII, ensuring GDPR/CCPA compliance, or scaling data generation for realistic testing scenarios.
Query and download public cancer imaging data from NCI Imaging Data Commons using idc-index. Use for accessing large-scale radiology (CT, MR, PET) and pathology datasets for AI training or...
Implement GDPR-compliant data handling with consent management, data subject rights, and privacy by design. Use when building systems that process EU personal data, implementing privacy controls,...
Implement GDPR-compliant data handling with consent management, data subject rights, and privacy by design. Use when building systems that process EU personal data, implementing privacy controls,...
Implement GDPR-compliant data handling with consent management, data subject rights, and privacy by design. Use when building systems that process EU personal data, implementing privacy controls,...
Create safe, reversible database migration scripts with rollback capabilities, data validation, and zero-downtime deployments. Use when changing database schemas, migrating data between systems,...
Create or update database seed scripts for development and testing environments. Use when setting up test data, initializing development databases, creating demo environments, resetting to known...
Handles customer data responsibly by answering questions ABOUT data without ever seeing the data directly. Use when querying Redis, databases, logs, or any source containing customer information...
Search code by AST structure using ast-grep. Find semantic patterns like function calls, imports, class definitions instead of text patterns. Triggers on: find all calls to X, search for pattern,...