Evaluates code generation models across HumanEval, MBPP, MultiPL-E, and 15+ benchmarks with pass@k metrics. Use when...
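The pass@k numbers these benchmarks report are typically computed with the unbiased estimator from the HumanEval evaluation procedure: generate n samples per problem, count the c that pass the unit tests, and estimate the probability that at least one of k random draws passes. A minimal sketch:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: 1 - C(n-c, k) / C(n, k),
    where n = samples generated, c = samples that passed."""
    if n - c < k:
        # Every size-k draw must contain at least one passing sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 10 samples, 1 passed -> pass@1 estimate is c/n = 0.1.
print(pass_at_k(10, 1, 1))
```

Scores are then averaged across all problems in the benchmark.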
Half-Quadratic Quantization for LLMs without calibration data. Use when quantizing models to 4/3/2-bit precision...
Post-training 4-bit quantization for LLMs with minimal accuracy loss. Use for deploying large models (70B, 405B) on...
GGUF format and llama.cpp quantization for efficient CPU/GPU inference. Use when deploying models on consumer...
Optimizes transformer attention with Flash Attention for 2-4x speedup and 10-20x memory reduction. Use when...
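Flash Attention itself is a fused GPU kernel, but the numerical trick behind its memory reduction (an online softmax over key/value tiles, so the full attention matrix is never materialized) can be sketched in NumPy:

```python
import numpy as np

def naive_attention(q, k, v):
    # Materializes the full [n, n] score matrix: O(n^2) memory.
    s = q @ k.T / np.sqrt(q.shape[-1])
    p = np.exp(s - s.max(axis=-1, keepdims=True))
    p /= p.sum(axis=-1, keepdims=True)
    return p @ v

def tiled_attention(q, k, v, tile=16):
    # Flash-Attention-style streaming softmax: visit keys/values in tiles,
    # keeping only O(tile) scores per query row in memory at a time.
    n, d = q.shape
    m = np.full(n, -np.inf)          # running row max
    l = np.zeros(n)                  # running softmax denominator
    acc = np.zeros((n, d))           # running unnormalized output
    for j in range(0, k.shape[0], tile):
        kj, vj = k[j:j + tile], v[j:j + tile]
        s = q @ kj.T / np.sqrt(d)
        m_new = np.maximum(m, s.max(axis=-1))
        scale = np.exp(m - m_new)    # rescale earlier partial sums
        p = np.exp(s - m_new[:, None])
        l = l * scale + p.sum(axis=-1)
        acc = acc * scale[:, None] + p @ vj
        m = m_new
    return acc / l[:, None]

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((64, 32)) for _ in range(3))
print(np.allclose(tiled_attention(q, k, v), naive_attention(q, k, v)))  # True
```

The real kernel additionally fuses these tiles into SRAM-resident blocks; this sketch only demonstrates that tiled online softmax is exact, not fast.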
Quantizes LLMs to 8-bit or 4-bit for 50-75% memory reduction with minimal accuracy loss. Use when GPU memory is...
Activation-aware weight quantization for 4-bit LLM compression with 3x speedup and minimal accuracy loss. Use when...
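The quantization entries above share one core operation: mapping float weights to a low-bit integer grid plus a per-channel scale. A minimal round-to-nearest sketch (real libraries such as AWQ and bitsandbytes add activation statistics, bit-packing, and calibrated clipping on top of this):

```python
import numpy as np

def quantize_per_channel(w, bits=8):
    # Symmetric round-to-nearest quantization, one scale per output row.
    qmax = 2 ** (bits - 1) - 1                     # 127 for int8, 7 for int4
    scale = np.abs(w).max(axis=1, keepdims=True) / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 512)).astype(np.float32)

for bits in (8, 4):
    q, scale = quantize_per_channel(w, bits)
    err = np.abs(w - dequantize(q, scale)).mean()
    print(f"int{bits}: mean abs reconstruction error {err:.4f}")
```

Dropping from 8 to 4 bits coarsens the grid, which is why the 4-bit methods above need extra machinery (activation awareness, half-quadratic solves, grouped scales) to recover accuracy.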
Multi-cloud orchestration for ML workloads with automatic cost optimization. Use when you need to run training or...
Serverless GPU cloud platform for running ML workloads. Use when you need on-demand GPU access without...
Reserved and on-demand GPU cloud instances for ML training and inference. Use when you need dedicated GPU instances...
Distributed training orchestration across clusters. Scales PyTorch/TensorFlow/HuggingFace from laptop to 1000s of...
High-level PyTorch framework with Trainer class, automatic distributed training (DDP/FSDP/DeepSpeed), callbacks...
Expert guidance for Fully Sharded Data Parallel training with PyTorch FSDP - parameter sharding, mixed precision,...
Trains large language models (2B-462B parameters) using NVIDIA Megatron-Core with advanced parallelism strategies....
Expert guidance for distributed training with DeepSpeed - ZeRO optimization stages, pipeline parallelism,...
Simplest distributed training API. 4 lines to add distributed support to any PyTorch script. Unified API for...
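All of the data-parallel tools above (DDP, FSDP, DeepSpeed ZeRO's data-parallel dimension, Accelerate) rest on the same identity: with equal-sized shards, averaging per-rank gradients reproduces the full-batch gradient exactly. A toy check with a mean-squared-error loss:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((128, 8))
y = rng.standard_normal(128)
w = rng.standard_normal(8)

def grad(xb, yb, w):
    # Gradient of 0.5 * mean((x @ w - y)^2) with respect to w.
    return xb.T @ (xb @ w - yb) / len(xb)

# Simulated all-reduce: mean of 4 per-rank gradients over equal shards.
shard_grads = [grad(xs, ys, w)
               for xs, ys in zip(np.split(x, 4), np.split(y, 4))]
avg = np.mean(shard_grads, axis=0)
print(np.allclose(avg, grad(x, y, w)))  # True: matches the full-batch gradient
```

The frameworks differ mainly in what else they shard (parameters, optimizer state, activations) and how they overlap the all-reduce with computation, not in this underlying math.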
NVIDIA's runtime safety framework for LLM applications. Features jailbreak detection, input/output validation,...
Meta's 7-8B specialized moderation model for LLM input/output filtering. 6 safety categories - violence/hate, sexual...
Anthropic's method for training harmless AI through self-improvement. Two-phase approach - supervised learning with...
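The supervised phase of Constitutional AI follows a critique-then-revise loop driven by written principles. A runnable skeleton with a hypothetical stubbed model call (`model`, `PRINCIPLE`, and the prompt wording here are illustrative placeholders, not Anthropic's actual prompts):

```python
PRINCIPLE = "Choose the response that is most harmless and helpful."

def model(prompt: str) -> str:
    # Hypothetical stub standing in for an LLM call, so the loop runs.
    return "[revised] " + prompt.splitlines()[-1]

def critique_and_revise(prompt: str, response: str, rounds: int = 2) -> str:
    # Each round: critique the response against a principle, then revise.
    for _ in range(rounds):
        critique = model(f"Critique per the principle: {PRINCIPLE}\n{response}")
        response = model(f"Revise using this critique:\n{critique}\n{response}")
    return response

# The resulting (prompt, final revision) pairs become fine-tuning data.
revised = critique_and_revise("How do I pick a lock?", "Initial draft answer")
print(revised)
```

The second phase replaces human preference labels with model-generated comparisons judged against the same constitution (RLAIF).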
Scalable data processing for ML workloads. Streaming execution across CPU/GPU, supports Parquet/CSV/JSON/images....
GPU-accelerated data curation for LLM training. Supports text/image/video/audio. Features fuzzy deduplication (16×...
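Fuzzy deduplication in curation pipelines like the one above is typically built on MinHash signatures: near-duplicate documents share most signature slots, so pairwise Jaccard similarity can be estimated without comparing raw text. A small self-contained sketch (production systems add LSH banding and GPU batching on top):

```python
import hashlib

def minhash(text: str, num_perm: int = 64):
    # Signature over word 3-gram shingles: for each of num_perm salted
    # hash functions, keep the minimum hash over all shingles.
    words = text.lower().split()
    shingles = {" ".join(words[i:i + 3]) for i in range(len(words) - 2)}
    return [
        min(int.from_bytes(hashlib.sha1(f"{seed}:{s}".encode()).digest()[:8],
                           "big")
            for s in shingles)
        for seed in range(num_perm)
    ]

def similarity(a, b):
    # Fraction of matching slots estimates Jaccard similarity.
    return sum(x == y for x, y in zip(a, b)) / len(a)

doc1 = "the quick brown fox jumps over the lazy dog near the river bank"
doc2 = "the quick brown fox jumps over the lazy dog near the river bend"
doc3 = "completely unrelated text about quantization of language models today"
print(similarity(minhash(doc1), minhash(doc2)))  # high: near-duplicates
print(similarity(minhash(doc1), minhash(doc3)))  # low: disjoint shingles
```

Documents whose estimated similarity exceeds a threshold (e.g. 0.8) are clustered and all but one representative dropped.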
Provides guidance for mechanistic interpretability research using TransformerLens to inspect and manipulate...
Provides guidance for training and analyzing Sparse Autoencoders (SAEs) using SAELens to decompose neural network...
Provides guidance for performing causal interventions on PyTorch models using pyvene's declarative intervention...
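The three interpretability entries above all hinge on one primitive: overwrite an internal activation during the forward pass and observe how the output moves (activation patching / causal intervention). The bare mechanics on a toy NumPy MLP, with all names local to the sketch (pyvene and TransformerLens apply the same idea declaratively to real transformer activations):

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 8))
W2 = rng.standard_normal((8, 2))

def forward(x, patch=None):
    # patch = (unit_index, value): overwrite one hidden activation,
    # the core move behind activation patching.
    h = np.maximum(x @ W1, 0.0)          # ReLU hidden layer
    if patch is not None:
        i, value = patch
        h = h.copy()
        h[i] = value
    return h @ W2

x = rng.standard_normal(4)
h = np.maximum(x @ W1, 0.0)
base = forward(x)

# Overwriting unit 3 with its own value is a no-op...
same = forward(x, patch=(3, h[3]))
# ...while shifting it by +1 moves the output by exactly W2[3],
# isolating that unit's causal contribution.
shifted = forward(x, patch=(3, h[3] + 1.0))
print(shifted - base)
```

Causal tracing experiments do the same thing with activations taken from a second ("clean") input instead of an arbitrary shift.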