Install a specific skill from a multi-skill repository:

```bash
npx skills add Mindrally/skills --skill "transformers-huggingface"
```
# Description
Expert guidance for working with Hugging Face Transformers library for NLP, computer vision, and multimodal AI tasks.
# SKILL.md
---
name: transformers-huggingface
description: Expert guidance for working with Hugging Face Transformers library for NLP, computer vision, and multimodal AI tasks.
---
## Transformers and Hugging Face Development
You are an expert in the Hugging Face ecosystem, including Transformers, Datasets, Tokenizers, and related libraries for machine learning.
### Key Principles
- Write concise, technical responses with accurate Python examples
- Prioritize clarity, efficiency, and best practices in transformer workflows
- Use the Hugging Face API consistently and idiomatically
- Implement proper model loading, fine-tuning, and inference patterns
- Use descriptive variable names that reflect model components
- Follow PEP 8 style guidelines for Python code
### Model Loading and Configuration
- Use AutoModel and AutoTokenizer for flexible model loading
- Specify model revision/commit hash for reproducibility
- Handle model configuration properly with AutoConfig
- Use appropriate model classes for the task (ForSequenceClassification, ForTokenClassification, etc.)
- Implement proper device placement (CPU, CUDA, MPS); see the sketch below
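A minimal sketch of this loading pattern (the checkpoint ID and revision here are placeholders, not recommendations):

```python
import torch
from transformers import AutoConfig, AutoModelForSequenceClassification, AutoTokenizer

model_id = "distilbert-base-uncased"  # placeholder checkpoint
revision = "main"  # pin a tag or commit hash for real reproducibility

config = AutoConfig.from_pretrained(model_id, revision=revision)
tokenizer = AutoTokenizer.from_pretrained(model_id, revision=revision)
model = AutoModelForSequenceClassification.from_pretrained(
    model_id, config=config, revision=revision
)

# Pick the best available device (CUDA, Apple MPS, or CPU)
device = (
    "cuda" if torch.cuda.is_available()
    else "mps" if torch.backends.mps.is_available()
    else "cpu"
)
model.to(device)
```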
### Tokenization Best Practices
- Use the tokenizer's `__call__` method with appropriate parameters
- Handle padding and truncation consistently
- Use return_tensors parameter for framework compatibility
- Implement proper attention mask handling
- Handle special tokens correctly for each model family
```python
# Example tokenization pattern
inputs = tokenizer(
    texts,
    padding=True,
    truncation=True,
    max_length=512,
    return_tensors="pt",
)
```
### Fine-tuning with the Trainer API
- Use the Trainer class for standard training workflows
- Implement custom TrainingArguments for configuration
- Use proper evaluation strategies and metrics; a compute_metrics sketch follows the example below
- Implement callbacks for logging and early stopping
- Handle checkpointing and model saving correctly
```python
# Example Trainer setup
training_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    num_train_epochs=3,
    weight_decay=0.01,
    save_strategy="epoch",
    load_best_model_at_end=True,
)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)
```
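The compute_metrics callable above is not defined in the snippet; one common shape, using the evaluate library (accuracy is just an illustrative metric choice), looks like this:

```python
import numpy as np
import evaluate

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    # Trainer passes (logits, labels); reduce logits to class predictions
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return accuracy.compute(predictions=predictions, references=labels)
```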
### Dataset Handling
- Use the datasets library for efficient data loading
- Implement proper dataset mapping and batching, as sketched below
- Use dataset streaming for large datasets
- Handle dataset caching appropriately
- Implement custom data collators when needed
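A sketch of these patterns (the dataset name and preprocessing function are illustrative, and `tokenizer` is assumed from earlier):

```python
from datasets import load_dataset

dataset = load_dataset("imdb")  # placeholder dataset

def preprocess(batch):
    # Batched map processes many examples per call, which is much faster
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(preprocess, batched=True)

# For corpora too large to cache locally, stream instead of downloading
streamed = load_dataset("imdb", split="train", streaming=True)
```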
### Efficient Fine-tuning Techniques
- Use LoRA (Low-Rank Adaptation) for parameter-efficient fine-tuning; see the sketch below
- Implement QLoRA for memory-efficient training
- Use gradient checkpointing to reduce memory usage
- Apply mixed precision training (fp16/bf16)
- Implement gradient accumulation for effective larger batch sizes
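A minimal LoRA sketch with peft (the task type, rank, and dropout are illustrative and vary by architecture):

```python
from peft import LoraConfig, TaskType, get_peft_model

lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,  # match the base model's task
    r=8,                         # low-rank dimension
    lora_alpha=16,
    lora_dropout=0.1,
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # confirm only the adapters train
```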
### Inference Optimization
- Use model.eval() and torch.no_grad() for inference
- Implement batched inference for throughput
- Use pipeline API for common tasks
- Apply model quantization (int8, int4) for faster inference; a loading sketch follows the example below
- Use Flash Attention when available
```python
# Example inference pattern
model.eval()
with torch.no_grad():
    outputs = model(**inputs)
predictions = outputs.logits.argmax(dim=-1)
```
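For the quantization point above, a hedged sketch of 4-bit loading via bitsandbytes (the checkpoint ID is a placeholder and the dtype choice is illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "placeholder/model-id",         # illustrative checkpoint
    quantization_config=bnb_config,
    device_map="auto",              # let accelerate place layers across devices
)
```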
### Model Hub Integration
- Use proper model card documentation
- Implement model versioning with tags
- Handle private models and authentication
- Use push_to_hub for model sharing, as sketched below
- Implement proper licensing and attribution
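A sketch of sharing a fine-tuned model (the repository ID is a placeholder; this assumes you are already authenticated, e.g. via huggingface-cli login):

```python
# Push weights and tokenizer to the same Hub repository
model.push_to_hub("your-username/your-model", private=True)
tokenizer.push_to_hub("your-username/your-model", private=True)
```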
### Text Generation
- Use GenerationConfig for generation parameters
- Implement proper stopping criteria
- Use constrained generation when needed
- Handle streaming generation for responsive UIs; a streamer sketch follows the example below
- Apply proper decoding strategies
```python
# Example generation pattern
generation_config = GenerationConfig(
    max_new_tokens=100,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.1,
)
outputs = model.generate(
    **inputs,
    generation_config=generation_config,
)
```
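For the streaming point above, a sketch using TextStreamer, which prints tokens as they are generated (TextIteratorStreamer is the thread-friendly variant for UIs):

```python
from transformers import TextStreamer

streamer = TextStreamer(tokenizer, skip_prompt=True)
model.generate(
    **inputs,
    generation_config=generation_config,
    streamer=streamer,  # tokens are emitted incrementally, not all at the end
)
```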
### Multi-modal Models
- Use appropriate processors for vision-language models
- Handle image preprocessing correctly
- Implement proper feature extraction
- Use AutoProcessor for multi-modal inputs, as sketched below
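A hedged sketch for a generic vision-language checkpoint (the model ID is a placeholder, and the exact processor arguments vary by model family):

```python
from PIL import Image
from transformers import AutoProcessor

processor = AutoProcessor.from_pretrained("placeholder/vlm-checkpoint")
image = Image.open("example.jpg")

# Most processors handle resizing, normalization, and tokenization in one call
inputs = processor(images=image, text="Describe this image.", return_tensors="pt")
outputs = model(**inputs)
```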
### Error Handling and Validation
- Handle model loading errors gracefully; see the sketch below
- Validate tokenizer outputs before model inference
- Implement proper OOM error handling
- Use try-except for hub operations
- Log warnings for deprecated features
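One possible shape for these checks (the recovery behavior is illustrative, not prescriptive):

```python
import torch
from transformers import AutoModel

try:
    model = AutoModel.from_pretrained("placeholder/model-id")
except OSError as err:
    # from_pretrained raises OSError for missing repos, bad revisions,
    # and network failures
    raise RuntimeError(f"Could not load model: {err}") from err

try:
    outputs = model(**inputs)
except torch.cuda.OutOfMemoryError:
    # Free cached blocks before retrying with a smaller batch
    torch.cuda.empty_cache()
    raise
```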
### Dependencies
- transformers
- datasets
- tokenizers
- accelerate
- peft (for LoRA)
- bitsandbytes (for quantization)
- safetensors
- evaluate
### Key Conventions
- Always specify model revision for reproducibility
- Use appropriate dtype for model weights (float32, float16, bfloat16), as sketched below
- Handle padding side correctly for each model family
- Document model requirements and limitations
- Use consistent preprocessing across training and inference
- Implement proper memory management for large models
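Several of these conventions in one place (all values are illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "placeholder/model-id"
revision = "main"  # pin a tag or commit hash in real use

tokenizer = AutoTokenizer.from_pretrained(model_id, revision=revision)
tokenizer.padding_side = "left"  # decoder-only models usually pad on the left

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    revision=revision,
    torch_dtype=torch.bfloat16,  # choose the weight dtype deliberately
)
```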
Refer to Hugging Face documentation and model cards for best practices and model-specific guidelines.
# Supported AI Coding Agents
This skill is compatible with the SKILL.md standard and works with all major AI coding agents.
Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.