npx skills add 404kidwiz/claude-supercode-skills --skill "mlops-engineer"
Install specific skill from multi-skill repository
# Description
Expert in Machine Learning Operations bridging data science and DevOps. Use when building ML pipelines, model versioning, feature stores, or production ML serving. Triggers include "MLOps", "ML pipeline", "model deployment", "feature store", "model versioning", "ML monitoring", "Kubeflow", "MLflow".
# SKILL.md
name: mlops-engineer
description: Expert in Machine Learning Operations bridging data science and DevOps. Use when building ML pipelines, model versioning, feature stores, or production ML serving. Triggers include "MLOps", "ML pipeline", "model deployment", "feature store", "model versioning", "ML monitoring", "Kubeflow", "MLflow".
MLOps Engineer
Purpose
Provides expertise in Machine Learning Operations, bridging data science and DevOps practices. Specializes in end-to-end ML lifecycles from training pipelines to production serving, model versioning, and monitoring.
When to Use
- Building ML training and serving pipelines
- Implementing model versioning and registry
- Setting up feature stores
- Deploying models to production
- Monitoring model performance and drift
- Automating ML workflows (CI/CD for ML)
- Implementing A/B testing for models
- Managing experiment tracking
Quick Start
Invoke this skill when:
- Building ML pipelines and workflows
- Deploying models to production
- Setting up model versioning and registry
- Implementing feature stores
- Monitoring production ML systems
Do NOT invoke when:
- Model development and training → use /ml-engineer
- Data pipeline ETL → use /data-engineer
- Kubernetes infrastructure → use /kubernetes-specialist
- General CI/CD without ML → use /devops-engineer
Decision Framework
ML Lifecycle Stage?
├── Experimentation
│   └── MLflow/Weights & Biases for tracking
├── Training Pipeline
│   └── Kubeflow/Airflow/Vertex AI
├── Model Registry
│   └── MLflow Registry/Vertex Model Registry
├── Serving
│   ├── Batch → Spark/Dataflow
│   └── Real-time → TF Serving/Seldon/KServe
└── Monitoring
    └── Evidently/Fiddler/custom metrics
Core Workflows
1. ML Pipeline Setup
- Define pipeline stages (data prep, training, eval)
- Choose orchestrator (Kubeflow, Airflow, Vertex)
- Containerize each pipeline step
- Implement artifact storage
- Add experiment tracking
- Configure automated retraining triggers
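The stage structure above can be sketched without committing to an orchestrator. The example below is a minimal, standard-library illustration only: `Stage`, `Pipeline`, and the toy stage functions are hypothetical names, the in-memory `artifacts` dict stands in for real artifact storage, and the `log` list stands in for an experiment tracker. A real setup would express the same stages in Kubeflow, Airflow, or Vertex AI.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# Hypothetical names throughout: Stage, Pipeline, and the toy stage functions
# are illustrative and are not part of any orchestrator's API.


@dataclass
class Stage:
    name: str
    run: Callable[[Dict[str, Any]], Dict[str, Any]]  # reads artifacts, returns new ones


@dataclass
class Pipeline:
    stages: List[Stage]
    artifacts: Dict[str, Any] = field(default_factory=dict)  # stand-in for artifact storage
    log: List[str] = field(default_factory=list)  # stand-in for experiment tracking

    def execute(self) -> Dict[str, Any]:
        # Run stages in order; each stage sees everything produced before it.
        for stage in self.stages:
            outputs = stage.run(self.artifacts)
            self.artifacts.update(outputs)
            self.log.append(f"{stage.name}: produced {sorted(outputs)}")
        return self.artifacts


def data_prep(_: Dict[str, Any]) -> Dict[str, Any]:
    # Toy dataset following y = 2 * x, split into train and test.
    points = [(x, 2.0 * x) for x in (1.0, 2.0, 3.0, 4.0)]
    return {"train": points[:3], "test": points[3:]}


def train(artifacts: Dict[str, Any]) -> Dict[str, Any]:
    # Least-squares slope for a line through the origin.
    num = sum(x * y for x, y in artifacts["train"])
    den = sum(x * x for x, _ in artifacts["train"])
    return {"model": {"slope": num / den}}


def evaluate(artifacts: Dict[str, Any]) -> Dict[str, Any]:
    # Mean absolute error of the fitted slope on the held-out points.
    slope = artifacts["model"]["slope"]
    errors = [abs(slope * x - y) for x, y in artifacts["test"]]
    return {"mae": sum(errors) / len(errors)}


pipeline = Pipeline(
    [Stage("data_prep", data_prep), Stage("train", train), Stage("eval", evaluate)]
)
result = pipeline.execute()
```

Keeping each stage as a function from artifacts to artifacts is what makes it straightforward to containerize steps individually later, since every step declares its inputs and outputs through the shared store.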
2. Model Deployment
- Register model in model registry
- Build serving container
- Deploy to serving infrastructure
- Configure autoscaling
- Implement canary/shadow deployment
- Set up monitoring and alerts
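For the canary step, one common approach is to split traffic deterministically by hashing a request or user identifier, so the same caller always reaches the same model version. The sketch below assumes this hash-based scheme; `route_request` and the `req-…` identifiers are hypothetical, and serving platforms such as Seldon or KServe provide their own traffic-splitting configuration that would replace this logic in practice.

```python
import hashlib


def route_request(request_id: str, canary_percent: int) -> str:
    """Return "canary" or "stable" for a request, deterministically.

    Hypothetical helper: the same request_id always maps to the same bucket,
    so a caller does not flip between model versions during a rollout.
    """
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # bucket in [0, 99]
    return "canary" if bucket < canary_percent else "stable"


# Simulate a 10% canary over synthetic request ids.
routes = [route_request(f"req-{i}", canary_percent=10) for i in range(10_000)]
canary_share = routes.count("canary") / len(routes)
```

Raising `canary_percent` in steps only moves additional buckets to the canary; callers already routed to it stay there, which keeps per-caller behavior stable while the rollout widens.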
3. Model Monitoring
- Define key metrics (latency, throughput, accuracy)
- Implement data drift detection
- Set up prediction monitoring
- Create alerting thresholds
- Build dashboards for visibility
- Automate retraining triggers
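As one possible way to implement the drift-detection step, the sketch below computes the Population Stability Index (PSI) between a reference sample (for example, training data) and a recent production sample for a single numeric feature. The choice of PSI, the `psi` helper, the bin count, and the `DRIFT_THRESHOLD` value of 0.2 are all assumptions for illustration (0.2 is a frequently quoted rule of thumb, not something the workflow above prescribes). Tools such as Evidently offer ready-made drift reports.

```python
import math
from typing import List, Sequence

DRIFT_THRESHOLD = 0.2  # assumed rule-of-thumb alert level; tune per feature


def psi(expected: Sequence[float], actual: Sequence[float], bins: int = 10) -> float:
    """Population Stability Index of `actual` relative to `expected`."""
    lo, hi = min(expected), max(expected)
    if hi == lo:
        raise ValueError("reference sample has no spread")
    width = (hi - lo) / bins

    def fractions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values into edge bins
            counts[idx] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


reference = [i / 100 for i in range(1000)]  # roughly uniform on [0, 10)
shifted = [v + 5 for v in reference]        # same shape, shifted upward

no_drift = psi(reference, reference)
drift = psi(reference, shifted)
needs_retraining = drift > DRIFT_THRESHOLD
```

A flag like `needs_retraining` is the kind of signal that could feed the alerting thresholds and automated retraining triggers listed above.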
Best Practices
- Version everything: code, data, models, configs
- Use feature stores for consistency between training and serving
- Implement CI/CD specifically designed for ML workflows
- Monitor data drift and model performance continuously
- Use canary deployments for model rollouts
- Keep training and serving environments consistent
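To make "version everything" concrete, one option is to derive a single fingerprint from the code revision, the training data, and the config, and attach it to every trained model. The sketch below assumes that approach; `fingerprint`, the revision string, and the config keys are hypothetical, and dedicated tools (a model registry, data versioning systems) cover this more thoroughly.

```python
import hashlib
import json


def fingerprint(code_version: str, data_bytes: bytes, config: dict) -> str:
    """Short, reproducible id for a (code, data, config) combination.

    Hypothetical helper: any change to one of the three inputs yields a
    different id, while dict key order in `config` does not matter.
    """
    h = hashlib.sha256()
    h.update(code_version.encode("utf-8"))
    h.update(b"\0")  # separator so adjacent fields cannot blur together
    h.update(hashlib.sha256(data_bytes).digest())
    h.update(b"\0")
    h.update(json.dumps(config, sort_keys=True).encode("utf-8"))
    return h.hexdigest()[:16]


# Illustrative inputs only.
run_id = fingerprint(
    code_version="git:abc1234",
    data_bytes=b"feature_a,label\n1.0,0\n2.0,1\n",
    config={"learning_rate": 0.01, "epochs": 5},
)
```

Storing this id alongside the model artifact makes it possible to check later whether two models were trained from identical inputs.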
Anti-Patterns
| Anti-Pattern | Problem | Correct Approach |
|---|---|---|
| Manual deployments | Error-prone, slow | Automated ML CI/CD |
| Training-serving skew | Prediction errors | Feature stores |
| No model versioning | Can't reproduce or rollback | Model registry |
| Ignoring data drift | Silent degradation | Continuous monitoring |
| Notebook-to-production | Unmaintainable | Proper pipeline code |
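The "No model versioning" row is easiest to see with a toy registry. The sketch below is an in-memory illustration of register, promote, and rollback; `ModelRegistry` and the artifact URIs are hypothetical and do not reflect the MLflow or Vertex registry APIs, which add persistence, stages, and access control.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ModelRegistry:
    """Hypothetical in-memory registry, for illustration only."""

    versions: Dict[int, str] = field(default_factory=dict)  # version -> artifact URI
    history: List[int] = field(default_factory=list)  # production versions, oldest first

    def register(self, artifact_uri: str) -> int:
        # Versions are assigned sequentially and never reused.
        version = len(self.versions) + 1
        self.versions[version] = artifact_uri
        return version

    def promote(self, version: int) -> None:
        if version not in self.versions:
            raise KeyError(f"unknown version {version}")
        self.history.append(version)

    def rollback(self) -> int:
        # Drop the current production version and return to the previous one.
        if len(self.history) < 2:
            raise RuntimeError("no earlier production version to roll back to")
        self.history.pop()
        return self.history[-1]

    @property
    def production(self) -> Optional[int]:
        return self.history[-1] if self.history else None


registry = ModelRegistry()
v1 = registry.register("models/example/1")  # illustrative URIs
v2 = registry.register("models/example/2")
registry.promote(v1)
registry.promote(v2)
restored = registry.rollback()  # v2 misbehaves, so return to v1
```

Because every promoted version keeps its artifact URI, rolling back is a pointer change rather than a retraining job.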
# Supported AI Coding Agents
This skill is compatible with the SKILL.md standard and works with all major AI coding agents.