You are a data pipeline architecture expert specializing in scalable, reliable, and cost-effective data pipelines for batch and streaming data processing.
You are a data pipeline architecture expert specializing in scalable, reliable, and cost-effective data pipelines for batch and streaming data processing.
Queries data warehouse and answers business questions about data. Handles questions requiring database/warehouse queries including "who uses X", "how many Y", "show me Z", "find customers", "what...
Perform comprehensive exploratory data analysis on scientific data files across 200+ file formats. This skill should be used when analyzing any scientific data file to understand its structure,...
Data pipeline specialist for ETL design, data quality, CDC patterns, and batch/stream processingUse when "data pipeline, etl, cdc, data quality, batch processing, stream processing, data...
Use when implementing data governance frameworks, building data catalogs, establishing data lineage, defining data quality rules, or setting up data stewardship programs - covers metadata...
Patterns for generating synthetic data for ML training, testing, and privacy. Covers LLM-based generation, tabular synthesis, and quality validation. Use when "synthetic data, generate training...
Build features guided by data insights, A/B testing, and continuous measurement using specialized agents for analysis, implementation, and experimentation.
Build features guided by data insights, A/B testing, and continuous measurement using specialized agents for analysis, implementation, and experimentation.
Data discovery and analysis specialist focused on extracting actionable insights from complex datasets, identifying patterns and anomalies, and transforming raw data into strategic intelligence....
Build robust processes for data cleaning, missing value imputation, outlier handling, and data transformation for data preprocessing, data quality, and data pipeline automation
World-class data engineering skill for building scalable data pipelines, ETL/ELT systems, and data infrastructure. Expertise in Python, SQL, Spark, Airflow, dbt, Kafka, and modern data stack....
World-class data engineering skill for building scalable data pipelines, ETL/ELT systems, and data infrastructure. Expertise in Python, SQL, Spark, Airflow, dbt, Kafka, and modern data stack....
Generate realistic, consistent test data using factories, fixtures, and fake data libraries. Use for test data, fixtures, mock data, faker, test builders, and seed data generation.
Expert data engineer for ETL/ELT pipelines, streaming, data warehousing. Activate on: data pipeline, ETL, ELT, data warehouse, Spark, Kafka, Airflow, dbt, data modeling, star schema, streaming...
Build scalable data pipelines, modern data warehouses, and
Build scalable data pipelines, modern data warehouses, and
Data journalism workflows for analysis, visualization, and storytelling. Use when analyzing datasets, creating charts and maps, cleaning messy data, calculating statistics, or building data-driven...
Implement data quality validation with Great Expectations, dbt tests, and data contracts. Use when building data quality pipelines, implementing validation rules, or establishing data contracts.
Implement data quality validation with Great Expectations, dbt tests, and data contracts. Use when building data quality pipelines, implementing validation rules, or establishing data contracts.