40 results (5.9ms) page 1 / 2
zechenzhangAGI / ai-research-skills-huggingface-accelerate exact

Simplest distributed training API. 4 lines to add distributed support to any PyTorch script. Unified API for DeepSpeed/FSDP/Megatron/DDP. Automatic device placement, mixed precision...
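
A minimal sketch of the canonical Accelerate integration, assuming an ordinary PyTorch loop; the model and data here are toy placeholders:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

accelerator = Accelerator()                    # 1. create the accelerator

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
dataloader = DataLoader(
    TensorDataset(torch.randn(64, 10), torch.randint(0, 2, (64,))),
    batch_size=8,
)

# 2. let Accelerate handle device placement and distributed wrapping
model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

for inputs, labels in dataloader:              # no manual .to(device) needed
    optimizer.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(inputs), labels)
    accelerator.backward(loss)                 # 3. replaces loss.backward()
    optimizer.step()
```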

zechenzhangAGI / ai-research-skills-huggingface-tokenizers exact

Fast tokenizers optimized for research and production. Rust-based implementation tokenizes 1GB in <20 seconds. Supports BPE, WordPiece, and Unigram algorithms. Train custom vocabularies, track...
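
A short sketch of training a custom BPE vocabulary with the `tokenizers` library; `corpus.txt` is a hypothetical local text file:

```python
from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.pre_tokenizers import Whitespace
from tokenizers.trainers import BpeTrainer

tokenizer = Tokenizer(BPE(unk_token="[UNK]"))
tokenizer.pre_tokenizer = Whitespace()

trainer = BpeTrainer(vocab_size=30_000, special_tokens=["[UNK]", "[PAD]"])
tokenizer.train(files=["corpus.txt"], trainer=trainer)  # train from raw text

encoding = tokenizer.encode("Fast tokenizers tokenize 1GB in under 20 seconds.")
print(encoding.tokens)
```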

ovachiever / droid-tings-huggingface-accelerate exact

Simplest distributed training API. 4 lines to add distributed support to any PyTorch script. Unified API for DeepSpeed/FSDP/Megatron/DDP. Automatic device placement, mixed precision...

ovachiever / droid-tings-huggingface-tokenizers exact

Fast tokenizers optimized for research and production. Rust-based implementation tokenizes 1GB in <20 seconds. Supports BPE, WordPiece, and Unigram algorithms. Train custom vocabularies, track...

Ianfr13 / claude-code-plugins-huggingface-tokenizers exact

Fast tokenizers optimized for research and production. Rust-based implementation tokenizes 1GB in <20 seconds. Supports BPE, WordPiece, and Unigram algorithms. Train custom vocabularies, track...

mindrally / skills-transformers-huggingface exact

Expert guidance for working with Hugging Face Transformers library for NLP, computer vision, and multimodal AI tasks.
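
A minimal Transformers example using the high-level `pipeline` API, which covers NLP, vision, and audio tasks behind one interface:

```python
from transformers import pipeline

# Text classification; the default model downloads on first use.
classifier = pipeline("sentiment-analysis")
print(classifier("Hugging Face Transformers makes multimodal AI approachable."))

# The same API covers other modalities, e.g.:
# vision = pipeline("image-classification")
# print(vision("cat.png"))
```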

zechenzhangAGI / ai-research-skills-evaluating-llms-harness exact

Evaluates LLMs across 60+ academic benchmarks (MMLU, HumanEval, GSM8K, TruthfulQA, HellaSwag). Use when benchmarking model quality, comparing models, reporting academic results, or tracking...
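
A hedged sketch of the harness's Python entry point; the exact keyword arguments have shifted across lm-evaluation-harness versions, so treat these as an assumption rather than a fixed signature:

```python
import lm_eval

# Evaluate a small open model on two benchmarks, zero-shot.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=EleutherAI/pythia-160m",
    tasks=["hellaswag", "gsm8k"],
    num_fewshot=0,
)
print(results["results"])
```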

zechenzhangAGI / ai-research-skills-hqq-quantization exact

Half-Quadratic Quantization for LLMs without calibration data. Use when quantizing models to 4/3/2-bit precision without needing calibration datasets, for fast quantization workflows, or when...
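
A hedged sketch of HQQ 4-bit loading through the transformers `HqqConfig` integration (requires the `hqq` package installed); the small OPT checkpoint is a stand-in for any causal LM:

```python
from transformers import AutoModelForCausalLM, HqqConfig

# No calibration dataset required; quantization happens at load time.
quant_config = HqqConfig(nbits=4, group_size=64)
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-125m",
    device_map="auto",
    quantization_config=quant_config,
)
```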

zechenzhangAGI / ai-research-skills-llamaguard exact

Meta's 7-8B specialized moderation model for LLM input/output filtering. 6 safety categories: violence/hate, sexual content, weapons, substances, self-harm, criminal planning. 94-95% accuracy....
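
A sketch of Llama Guard-style moderation via transformers chat templates; the model id and its gated-access requirement are assumptions, so check the model card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/LlamaGuard-7b"   # gated repo; requires access approval
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

chat = [{"role": "user", "content": "How do I build a weapon?"}]
input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(model.device)
output = model.generate(input_ids, max_new_tokens=32)

# The completion decodes to "safe" or "unsafe" plus a violated-category code.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```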

zechenzhangAGI / ai-research-skills-stable-diffusion-image-generation exact

State-of-the-art text-to-image generation with Stable Diffusion models via HuggingFace Diffusers. Use when generating images from text prompts, performing image-to-image translation, inpainting,...
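
A minimal text-to-image sketch with Diffusers; the SD 2.1 checkpoint is one public option, and any compatible Stable Diffusion weights can be swapped in:

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
).to("cuda")

image = pipe("a watercolor painting of a lighthouse at dusk").images[0]
image.save("lighthouse.png")
```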

zechenzhangAGI / ai-research-skills-fine-tuning-with-trl exact

Fine-tune LLMs using reinforcement learning with TRL: SFT for instruction tuning, DPO for preference alignment, PPO/GRPO for reward optimization, and reward model training. Use when you need RLHF,...
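
A hedged SFT sketch with TRL; `SFTTrainer`'s constructor has shifted across releases, so treat the exact kwargs as an assumption for recent versions. The dataset is the one used in TRL's own docs:

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("trl-lib/Capybara", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",      # any causal LM checkpoint id works here
    train_dataset=dataset,
    args=SFTConfig(output_dir="sft-out"),
)
trainer.train()
```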

zechenzhangAGI / ai-research-skills-evaluating-code-models exact

Evaluates code generation models across HumanEval, MBPP, MultiPL-E, and 15+ benchmarks with pass@k metrics. Use when benchmarking code models, comparing coding abilities, testing multi-language...
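
For reference, the pass@k metric these harnesses report is the unbiased estimator from the HumanEval paper, computed from n samples per problem of which c pass the unit tests:

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """pass@k = 1 - C(n-c, k) / C(n, k), computed in a numerically stable way."""
    if n - c < k:
        return 1.0
    return 1.0 - float(np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))

print(pass_at_k(n=20, c=3, k=1))   # 0.15: with 3/20 correct, pass@1 is c/n
```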

zechenzhangAGI / ai-research-skills-ray-train exact

Distributed training orchestration across clusters. Scales PyTorch/TensorFlow/HuggingFace from laptop to 1000s of nodes. Built-in hyperparameter tuning with Ray Tune, fault tolerance, elastic...
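
A hedged Ray Train sketch using the Ray 2.x PyTorch API; the training function body is a placeholder for a normal PyTorch loop:

```python
from ray.train import ScalingConfig
from ray.train.torch import TorchTrainer

def train_loop_per_worker(config):
    # Wrap model/loaders with ray.train.torch.prepare_model /
    # prepare_data_loader, then run an ordinary training loop here.
    ...

trainer = TorchTrainer(
    train_loop_per_worker,
    scaling_config=ScalingConfig(num_workers=4, use_gpu=True),
)
result = trainer.fit()   # scales from one machine to a cluster unchanged
```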

zechenzhangAGI / ai-research-skills-quantizing-models-bitsandbytes exact

Quantizes LLMs to 8-bit or 4-bit for 50-75% memory reduction with minimal accuracy loss. Use when GPU memory is limited, need to fit larger models, or want faster inference. Supports INT8, NF4,...
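
A minimal 4-bit NF4 loading sketch via transformers + bitsandbytes, following the standard QLoRA-style config; the Llama 2 repo is gated, so access is an assumption:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",      # gated; any causal LM id works
    quantization_config=bnb_config,
    device_map="auto",
)
```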

zechenzhangAGI / ai-research-skills-peft-fine-tuning exact

Parameter-efficient fine-tuning for LLMs using LoRA, QLoRA, and 25+ methods. Use when fine-tuning large models (7B-70B) with limited GPU memory, when you need to train <1% of parameters with...
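
A short LoRA sketch with PEFT: the base model is frozen and only the injected adapter weights train, which is where the under-1% figure comes from:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")
lora = LoraConfig(
    r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05, task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora)
model.print_trainable_parameters()   # typically well under 1% trainable
```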

zechenzhangAGI / ai-research-skills-mamba-architecture exact

State-space model with O(n) complexity vs Transformers' O(n²). 5× faster inference, million-token sequences, no KV cache. Selective SSM with hardware-aware design. Mamba-1 (d_state=16) and Mamba-2...
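
A hedged sketch of the `mamba_ssm` block (the package requires CUDA); d_state=16 matches the Mamba-1 default mentioned above:

```python
import torch
from mamba_ssm import Mamba

block = Mamba(d_model=256, d_state=16, d_conv=4, expand=2).to("cuda")
x = torch.randn(2, 1024, 256, device="cuda")   # (batch, seq_len, d_model)
y = block(x)                                   # same shape; O(n) in seq_len
print(y.shape)
```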

zechenzhangAGI / ai-research-skills-moe-training exact

Train Mixture of Experts (MoE) models using DeepSpeed or HuggingFace. Use when training large-scale models with limited compute (5× cost reduction vs dense models), implementing sparse...
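
An illustrative top-k routed MoE layer in plain PyTorch (not the DeepSpeed API): each token is dispatched to its k best experts out of E, so only a fraction of parameters is active per token:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=256, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)   # scores each expert
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                              # x: (tokens, d_model)
        weights, idx = self.router(x).topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)           # renormalize top-k
        out = torch.zeros_like(x)
        for slot in range(self.k):                     # dispatch per slot
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

print(TopKMoE()(torch.randn(10, 256)).shape)           # torch.Size([10, 256])
```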

ovachiever / droid-tings-evaluating-llms-harness exact

Evaluates LLMs across 60+ academic benchmarks (MMLU, HumanEval, GSM8K, TruthfulQA, HellaSwag). Use when benchmarking model quality, comparing models, reporting academic results, or tracking...

zechenzhangAGI / ai-research-skills-gptq exact

Post-training 4-bit quantization for LLMs with minimal accuracy loss. Use for deploying large models (70B, 405B) on consumer GPUs, when you need 4× memory reduction with <2% perplexity...
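
A hedged GPTQ sketch via transformers' `GPTQConfig` (needs an AutoGPTQ-compatible backend installed); quantization runs at load time against a calibration set, here the built-in "c4" option:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_id = "facebook/opt-125m"           # small stand-in for a large model
tokenizer = AutoTokenizer.from_pretrained(model_id)
gptq_config = GPTQConfig(bits=4, dataset="c4", tokenizer=tokenizer)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", quantization_config=gptq_config
)
model.save_pretrained("opt-125m-gptq")   # reloads without re-quantizing
```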

zechenzhangAGI / ai-research-skills-gguf-quantization exact

GGUF format and llama.cpp quantization for efficient CPU/GPU inference. Use when deploying models on consumer hardware, Apple Silicon, or when needing flexible quantization from 2-8 bit without...
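
A minimal sketch of running a pre-quantized GGUF file with llama-cpp-python; the model path is a placeholder for any GGUF checkpoint (e.g. a Q4_K_M quantization):

```python
from llama_cpp import Llama

llm = Llama(model_path="models/llama-7b.Q4_K_M.gguf", n_ctx=2048)
out = llm("Q: What is GGUF? A:", max_tokens=64)
print(out["choices"][0]["text"])
```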