Use when building any system that involves AI/model calls - integrates with brainstorming, planning, and TDD to ensure model agency over hardcoded rules
Use when asked to compare multiple ML models, perform cross-validation, evaluate metrics, or select the best model for a classification/regression task.
Bayesian modeling with PyMC. Build hierarchical models, MCMC (NUTS), variational inference, LOO/WAIC comparison, posterior checks, for probabilistic programming and inference.
Bayesian modeling with PyMC. Build hierarchical models, MCMC (NUTS), variational inference, LOO/WAIC comparison, posterior checks, for probabilistic programming and inference.
Bayesian modeling with PyMC. Build hierarchical models, MCMC (NUTS), variational inference, LOO/WAIC comparison, posterior checks, for probabilistic programming and inference.
Evaluate and compare ML model performance with rigorous testing methodologies
This skill should be used when users want to train or fine-tune language models using TRL (Transformer Reinforcement Learning) on Hugging Face Jobs infrastructure. Covers SFT, DPO, GRPO and reward...
Expert guidance on validating, optimizing, and ensuring quality of Mapbox styles through validation, accessibility checks, and optimization. Use when preparing styles for production, debugging...
Expert guidance on validating, optimizing, and ensuring quality of Mapbox styles through validation, accessibility checks, and optimization. Use when preparing styles for production, debugging...
Latest AI models reference - Claude, OpenAI, Gemini, Eleven Labs, Replicate
Common style patterns, layer configurations, and recipes for typical mapping scenarios including restaurant finders, real estate, data visualization, navigation, and more. Use when implementing...
Common style patterns, layer configurations, and recipes for typical mapping scenarios including restaurant finders, real estate, data visualization, navigation, and more. Use when implementing...
Use this skill when users need to stress test their business model, identify scale limitations, find bottlenecks, determine if they're trading time for money, or evaluate unit economics. Activates...
Evaluates code generation models across HumanEval, MBPP, MultiPL-E, and 15+ benchmarks with pass@k metrics. Use when benchmarking code models, comparing coding abilities, testing multi-language...
Design and implement agent-based models (ABM) for simulating complex systems with emergent behavior from individual agent interactions. Use when "agent-based, multi-agent, emergent behavior, swarm...
Design data models with Pydantic schemas, comprehensive validation rules,
Build revenue projection models with driver-based forecasting, scenario analysis, and pricing optimization
Monitor model performance, detect data drift, concept drift, and anomalies in production using Prometheus, Grafana, and MLflow
Build binary and multiclass classification models using logistic regression, decision trees, and ensemble methods for categorical prediction and classification
Automatic model selection based on task type. Routes planning to Opus, coding to Sonnet, simple tasks to Haiku. Optimizes cost and quality automatically.