# SellTheSun / xgboost-best-practices

# Install this skill

Install this specific skill from the multi-skill repository:

```bash
npx skills add SellTheSun/STS_Skills --skill "xgboost-best-practices"
```

# Description

XGBoost machine learning best practices for training, tuning, and deploying gradient boosted models. Use when writing, reviewing, or implementing XGBoost models for classification, regression, or ranking tasks. Triggers on tasks involving XGBoost training, hyperparameter optimization, data preparation, model evaluation, or deployment.

# SKILL.md

```yaml
name: xgboost-best-practices
description: >-
  XGBoost machine learning best practices for training, tuning, and deploying
  gradient boosted models. Use when writing, reviewing, or implementing XGBoost
  models for classification, regression, or ranking tasks. Triggers on tasks
  involving XGBoost training, hyperparameter optimization, data preparation,
  model evaluation, or deployment.
license: MIT
metadata:
  author: xgboost-community
  version: "1.0.0"
```

## XGBoost Best Practices

Comprehensive optimization and best-practices guide for XGBoost machine learning applications. Contains 60 rules across 10 categories, prioritized by impact to guide automated code generation and model training workflows.

> [!NOTE]
> For the complete guide with all rules and code examples, read AGENTS.md.
> This SKILL.md provides a quick reference; AGENTS.md contains detailed explanations and incorrect/correct code patterns for all 60 rules.

## When to Apply

Reference these guidelines when:
- Preparing data for XGBoost training (DMatrix, categorical features, missing values)
- Configuring hyperparameters (learning rate, tree depth, regularization)
- Training models (early stopping, cross-validation, callbacks)
- Tuning hyperparameters (grid search, Optuna, overfitting diagnosis)
- Evaluating models (feature importance, SHAP, metrics)
- Persisting and deploying models (JSON format, version compatibility)
- Optimizing performance (GPU, distributed, external memory)
- Integrating with scikit-learn pipelines

## Environment Context (Ask First)

> [!IMPORTANT]
> If the training environment is unclear, ask about: LOCAL vs. CLOUD, operating system, GPU vendor (NVIDIA/AMD/none), and XGBoost version (1.x vs. 2.x+).

GPU support guidance (see the device-selection sketch below):
- NVIDIA CUDA: full support (cloud and local)
- AMD ROCm: limited, Linux-only; verify the build
- AMD on Windows: CPU training only (`tree_method="hist"`)
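
A minimal device-selection sketch, assuming XGBoost 2.x (where the `device` parameter replaced 1.x's `gpu_hist` tree method); `use_gpu` is an illustrative flag:

```python
import xgboost as xgb

use_gpu = False  # set True only on a machine with an NVIDIA CUDA build

params = {"tree_method": "hist"}
# XGBoost 2.x selects the accelerator via "device"; NVIDIA CUDA builds accept
# "cuda". AMD-on-Windows and other unsupported setups should stay on "cpu".
params["device"] = "cuda" if use_gpu else "cpu"
# XGBoost 1.x used tree_method="gpu_hist" instead of the "device" parameter.
```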

## Rule Categories by Priority

| Priority | Category | Impact | Prefix |
|---|---|---|---|
| 1 | Data Preparation | CRITICAL | `data-` |
| 2 | General Parameters | CRITICAL | `param-` |
| 3 | Training | HIGH | `train-` |
| 4 | Hyperparameter Tuning | HIGH | `tune-` |
| 5 | Evaluation | MEDIUM | `eval-` |
| 6 | Model Persistence | HIGH | `persist-` |
| 7 | Deployment | MEDIUM | `deploy-` |
| 8 | Performance | MEDIUM | `perf-` |
| 9 | Scikit-Learn API | HIGH | `sklearn-` |
| 10 | Advanced Patterns | LOW | `advanced-` |

## Quick Reference

### 1. Data Preparation (CRITICAL)

- `data-dmatrix` - Use DMatrix for Native API (sketch below)
- `data-missing-values` - Handle Missing Values Explicitly
- `data-feature-engineering` - Feature Scaling Usually Not Required
- `data-categorical` - Enable Native Categorical Support
- `data-timeseries-split` - Use Time-Based Cross-Validation
- `data-purged-cv` - Use Purged/Embargo CV for Finance Labels
- `data-label-encoding` - Encode Target Variables Correctly
- `data-weight-imbalance` - Use Sample Weights for Imbalanced Data
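
A minimal sketch of `data-dmatrix`, `data-missing-values`, and `data-categorical` together, using a small synthetic frame (column names are illustrative):

```python
import numpy as np
import pandas as pd
import xgboost as xgb

# One numeric column with genuine NaNs (not sentinels like -999),
# one pandas "category" column for native categorical support.
df = pd.DataFrame({
    "age": [25.0, np.nan, 47.0, 31.0],
    "city": pd.Categorical(["NY", "SF", "NY", "LA"]),
})
y = np.array([0, 1, 1, 0])

# DMatrix is the native API's data container; enable_categorical lets
# XGBoost split on category codes directly, with no one-hot encoding.
dtrain = xgb.DMatrix(df, label=y, missing=np.nan, enable_categorical=True)
```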

### 2. General Parameters (CRITICAL)

- `param-learning-rate` - Set Lower Learning Rate with More Trees (sketch below)
- `param-max-depth` - Control Tree Depth to Prevent Overfitting
- `param-min-child-weight` - Set Minimum Child Weight
- `param-gamma` - Use Gamma for Minimum Split Loss
- `param-subsample` - Enable Row Subsampling
- `param-colsample` - Enable Column Subsampling
- `param-regularization` - Apply L1/L2 Regularization
- `param-objective` - Choose Correct Objective Function
- `param-tree-method` - Select Appropriate Tree Method
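
Several of these rules combined into a hedged starting configuration for binary classification; the values are common tuning starting points, not recommendations for any particular dataset:

```python
params = {
    "objective": "binary:logistic",  # param-objective: match the task
    "tree_method": "hist",           # param-tree-method: fast histogram method
    "eta": 0.05,                     # param-learning-rate: low rate, more rounds
    "max_depth": 6,                  # param-max-depth: cap complexity
    "min_child_weight": 1,           # param-min-child-weight
    "gamma": 0.0,                    # param-gamma: min loss reduction per split
    "subsample": 0.8,                # param-subsample: row subsampling
    "colsample_bytree": 0.8,         # param-colsample: column subsampling
    "lambda": 1.0,                   # param-regularization: L2 term
    "alpha": 0.0,                    # param-regularization: L1 term
}
```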

### 3. Training (HIGH)

- `train-early-stopping` - Always Use Early Stopping (sketch below)
- `train-cross-validation` - Use Built-in Cross-Validation
- `train-evaluation-metric` - Monitor Multiple Metrics
- `train-callbacks` - Use Callbacks for Logging
- `train-watchlist` - Monitor Training and Validation Loss
- `train-seed` - Set Random Seed for Reproducibility
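
A training sketch combining `train-early-stopping`, `train-watchlist`, and `train-seed`; it assumes `dtrain` and `dval` are DMatrix objects built as in the data sketch above:

```python
import xgboost as xgb

params = {"objective": "binary:logistic", "eval_metric": "auc", "seed": 42}

bst = xgb.train(
    params,
    dtrain,
    num_boost_round=1000,                      # upper bound; early stopping trims it
    evals=[(dtrain, "train"), (dval, "val")],  # watchlist: track both losses
    early_stopping_rounds=50,                  # stop when "val" stalls
    verbose_eval=100,
)
print("best iteration:", bst.best_iteration)
```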

### 4. Hyperparameter Tuning (HIGH)

- `tune-grid-search` - Start with Coarse Grid Search
- `tune-random-search` - Use Random Search for Efficiency
- `tune-bayesian-optuna` - Use Optuna for Bayesian Optimization (sketch below)
- `tune-overfitting` - Diagnose Overfitting vs Underfitting
- `tune-parameter-order` - Tune Parameters in Optimal Order
- `tune-scale-pos-weight` - Handle Class Imbalance
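
A minimal `tune-bayesian-optuna` sketch, assuming `dtrain` from the earlier examples; the search ranges are illustrative:

```python
import optuna
import xgboost as xgb

def objective(trial):
    params = {
        "objective": "binary:logistic",
        "tree_method": "hist",
        "eta": trial.suggest_float("eta", 1e-3, 0.3, log=True),
        "max_depth": trial.suggest_int("max_depth", 3, 10),
        "subsample": trial.suggest_float("subsample", 0.5, 1.0),
        "colsample_bytree": trial.suggest_float("colsample_bytree", 0.5, 1.0),
    }
    # xgb.cv returns a DataFrame of per-round fold means/stds.
    cv = xgb.cv(params, dtrain, num_boost_round=500, nfold=5,
                metrics="auc", early_stopping_rounds=30, seed=42)
    return cv["test-auc-mean"].iloc[-1]

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```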

### 5. Evaluation (MEDIUM)

- `eval-feature-importance` - Use Gain-Based Feature Importance (sketch below)
- `eval-shap` - Use SHAP for Model Interpretability
- `eval-confusion-matrix` - Generate Confusion Matrix for Classification
- `eval-roc-auc` - Evaluate with ROC-AUC for Binary
- `eval-regression-metrics` - Use RMSE, MAE, R² for Regression
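
A short sketch of `eval-feature-importance` and `eval-shap` with the native API, continuing from the training sketch (`bst`, `dval`):

```python
# Prefer "gain" over the default "weight", which only counts splits.
importance = bst.get_score(importance_type="gain")
for name, gain in sorted(importance.items(), key=lambda kv: kv[1], reverse=True):
    print(name, round(gain, 3))

# XGBoost computes SHAP-style contributions natively: one column per feature
# plus a bias column; each row sums to that row's raw (margin) prediction.
contribs = bst.predict(dval, pred_contribs=True)
```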

### 6. Model Persistence (HIGH)

- `persist-save-model` - Save Models in JSON/UBJSON Format (sketch below)
- `persist-load-model` - Load Models Correctly
- `persist-version-compat` - Check Version Compatibility
- `persist-feature-names` - Preserve Feature Names
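
A save/load round trip covering `persist-save-model`, `persist-load-model`, and `persist-feature-names`, continuing from the training sketch:

```python
import xgboost as xgb

# JSON (or the compact UBJSON, "model.ubj") is the stable, portable format;
# avoid the legacy binary format and avoid pickling the Booster.
bst.save_model("model.json")

loaded = xgb.Booster()
loaded.load_model("model.json")

# Feature names round-trip with the file, so column order can be validated
# at prediction time.
print(loaded.feature_names)
```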

### 7. Deployment (MEDIUM)

- `deploy-prediction` - Use Correct Prediction Methods
- `deploy-batch-inference` - Optimize Batch Predictions
- `deploy-iteration-range` - Use `iteration_range` for Best Model (sketch below)
- `deploy-inference-config` - Configure for Inference Performance
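
A `deploy-iteration-range` sketch, assuming `bst` was trained with early stopping and `dval` is a DMatrix:

```python
# Predict with only the trees up to the best iteration; rounds after it were
# kept for bookkeeping but degrade validation performance.
preds = bst.predict(dval, iteration_range=(0, bst.best_iteration + 1))

# deploy-batch-inference: prefer one call over many row-by-row calls; the
# per-call overhead is amortized across the whole batch.
```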

### 8. Performance (MEDIUM)

- `perf-gpu-training` - Enable GPU Training (NVIDIA CUDA only)
- `perf-amd-cpu-optimization` - Optimize for AMD Ryzen CPUs
- `perf-local-vs-cloud` - Ask User About Training Environment
- `perf-distributed` - Use Distributed Training for Large Data
- `perf-external-memory` - Use External Memory for Very Large Data
- `perf-nthreads` - Configure Thread Count
- `perf-quantile-dmatrix` - Use `QuantileDMatrix` for `hist` (sketch below)
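
A `perf-quantile-dmatrix` and `perf-nthreads` sketch, assuming NumPy arrays `X_train`, `y_train`, `X_val`, `y_val`:

```python
import xgboost as xgb

# With tree_method="hist", QuantileDMatrix pre-bins the features and can cut
# training memory substantially compared with a plain DMatrix.
dtrain = xgb.QuantileDMatrix(X_train, y_train)
dval = xgb.QuantileDMatrix(X_val, y_val, ref=dtrain)  # reuse dtrain's bin edges

# Leave nthread unset to use all cores, or pin it when sharing a machine.
params = {"objective": "binary:logistic", "tree_method": "hist", "nthread": 8}
bst = xgb.train(params, dtrain, num_boost_round=200, evals=[(dval, "val")])
```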

### 9. Scikit-Learn API (HIGH)

- `sklearn-xgbclassifier` - Use XGBClassifier Correctly
- `sklearn-xgbregressor` - Use XGBRegressor Correctly
- `sklearn-pipeline` - Integrate with Sklearn Pipelines
- `sklearn-gridsearchcv` - Use with GridSearchCV
- `sklearn-early-stopping` - Enable Early Stopping in Sklearn API (sketch below)
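
A `sklearn-early-stopping` sketch; in recent XGBoost releases (1.6+) `early_stopping_rounds` belongs in the constructor, while `eval_set` still goes to `fit()`. Arrays `X_train`, `y_train`, `X_val`, `y_val` are assumed:

```python
from xgboost import XGBClassifier

clf = XGBClassifier(
    n_estimators=1000,         # upper bound; early stopping picks the best
    learning_rate=0.05,
    eval_metric="auc",
    early_stopping_rounds=50,  # constructor argument in recent versions
    random_state=42,
)
clf.fit(X_train, y_train, eval_set=[(X_val, y_val)], verbose=False)
print("best iteration:", clf.best_iteration)
```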

### 10. Advanced Patterns (LOW)

- `advanced-custom-objective` - Implement Custom Objective Functions
- `advanced-custom-metric` - Implement Custom Evaluation Metrics
- `advanced-monotonic` - Apply Monotonic Constraints (sketch below)
- `advanced-feature-interaction` - Apply Feature Interaction Constraints
- `advanced-dart` - Use DART Booster
- `advanced-random-forest` - Use XGBoost as Random Forest
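
An `advanced-monotonic` sketch; the constraint tuple is positional (one entry per feature column), and a three-feature `dtrain` DMatrix is assumed:

```python
import xgboost as xgb

# Force predictions to be non-decreasing in feature 0, non-increasing in
# feature 1, and unconstrained in feature 2.
params = {
    "objective": "reg:squarederror",
    "tree_method": "hist",
    "monotone_constraints": "(1,-1,0)",
}
bst = xgb.train(params, dtrain, num_boost_round=200)
```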

## How to Use

Read individual rule files for detailed explanations and code examples:

- `rules/data-timeseries-split.md`
- `rules/param-learning-rate.md`
- `rules/train-early-stopping.md`

Each rule file contains:
- A brief explanation of why the rule matters
- An incorrect code example with explanation
- A correct code example with explanation
- Additional context and references

## Full Compiled Document

For the complete guide with all rules expanded, see AGENTS.md.

# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents.

Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.