Runs LLM inference on CPU, Apple Silicon, and consumer GPUs without requiring NVIDIA hardware. Use for edge deployment,...
Expert guidance for fine-tuning LLMs with LLaMA-Factory - WebUI no-code, 100+ models, 2/3/4/5/6/8-bit QLoRA,...
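The low-bit options above refer to storing the frozen base weights in a compressed format and dequantizing on the fly during the forward pass. A minimal sketch of absmax int4 quantization in plain Python; QLoRA's actual NF4 format uses a normal-distribution codebook rather than this uniform grid, so this shows the idea, not the exact scheme:

```python
def quantize_absmax_int4(ws):
    # Absmax 4-bit quantization: scale so the largest-magnitude weight
    # maps to 7, then round every weight to an integer in [-7, 7].
    scale = max(abs(w) for w in ws) / 7.0
    q = [round(w / scale) for w in ws]
    deq = [qi * scale for qi in q]   # dequantized values used at compute time
    return q, deq

q, deq = quantize_absmax_int4([0.7, -0.35, 0.1])
# q == [7, -4, 1]
```

The training-time trick is that only small LoRA adapter matrices stay in full precision and receive gradients; the quantized base weights are read-only.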
Meta's 7-8B specialized moderation model for LLM input/output filtering. 6 safety categories - violence/hate, sexual...
Data framework for building LLM applications with RAG. Specializes in document ingestion (300+ connectors),...
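The retrieval step such a RAG framework automates can be sketched in a few lines: chunk documents, embed them, and return the chunks most similar to the query. The bag-of-words "embedding" below is a toy stand-in, and none of these function names are LlamaIndex's API:

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy embedding: a bag-of-words term-count vector.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=2):
    # Score every chunk against the query and keep the top-k.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

chunks = [
    "LlamaIndex ingests documents through connectors.",
    "Retrieval finds the chunks most similar to the query.",
    "The retrieved context is prepended to the LLM prompt.",
]
print(retrieve("which chunks match the query", chunks, k=1))
```

In a real pipeline the embedding is a learned dense vector and the search runs against a vector store, but the chunk-embed-rank shape is the same.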
Large Language and Vision Assistant. Enables visual instruction tuning and image-based conversations. Combines CLIP...
Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and...
Evaluates LLMs across 60+ academic benchmarks (MMLU, HumanEval, GSM8K, TruthfulQA, HellaSwag). Use when benchmarking...
Extend context windows of transformer models using RoPE, YaRN, ALiBi, and position interpolation techniques. Use...
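Position interpolation, the simplest of these techniques, rescales position indices so that positions beyond the trained context map back into the trained range before the RoPE angles are computed. A small sketch under that definition (dimension and base are illustrative defaults):

```python
def rope_angles(pos, dim=8, base=10000.0, scale=1.0):
    # Position interpolation: divide the position index by `scale`,
    # so position 8192 with scale 4 reuses the angles trained for 2048.
    p = pos / scale
    return [p / base ** (2 * i / dim) for i in range(dim // 2)]

# Model trained on 2048 positions, run at 8192 with interpolation factor 4:
assert rope_angles(8192, scale=4.0) == rope_angles(2048)
```

YaRN refines this by scaling different frequency bands differently, but the rescale-into-trained-range idea is the common core.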
State-space model with O(n) complexity vs Transformers' O(n²). 5× faster inference, million-token sequences, no KV...
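The O(n) claim follows from the state-space recurrence itself: one fixed-size state update per token, instead of attention's pairwise scores. A minimal single-channel linear scan (scalar parameters chosen for illustration; Mamba's A, B, C are learned and input-dependent):

```python
def ssm_scan(x, a=0.9, b=0.1, c=1.0):
    # One pass over the sequence: O(n) time, O(1) state.
    h, ys = 0.0, []
    for xt in x:
        h = a * h + b * xt   # h_t = A h_{t-1} + B x_t
        ys.append(c * h)     # y_t = C h_t
    return ys

ys = ssm_scan([1.0, 0.0, 0.0])
# impulse response decays geometrically: ys ≈ [0.1, 0.09, 0.081]
```

Because the state is fixed-size, there is no KV cache to grow with sequence length, which is where the million-token claim comes from.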
Multi-channel demand generation, paid media optimization, SEO strategy, and partnership programs for Series A+...
Mass spectrometry analysis. Process mzML/MGF/MSP, spectral similarity (cosine, modified cosine), metadata...
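The cosine similarity named above reduces to matching peaks within an m/z tolerance and taking a normalized dot product of the matched intensities. A greedy plain-Python sketch of that idea (matchms's own implementations also handle modified cosine and optimal peak assignment):

```python
import math

def cosine_score(spec_a, spec_b, tol=0.01):
    # Greedily pair peaks whose m/z values agree within `tol`,
    # then compute a normalized dot product of matched intensities.
    matches, used = [], set()
    for mz_a, ia in spec_a:
        for j, (mz_b, ib) in enumerate(spec_b):
            if j not in used and abs(mz_a - mz_b) <= tol:
                matches.append(ia * ib)
                used.add(j)
                break
    na = math.sqrt(sum(i * i for _, i in spec_a))
    nb = math.sqrt(sum(i * i for _, i in spec_b))
    return sum(matches) / (na * nb) if na and nb else 0.0

peaks = [(100.0, 0.7), (150.0, 1.0)]
print(cosine_score(peaks, peaks))  # identical spectra score ≈ 1.0
```

Modified cosine additionally allows peak pairs shifted by the precursor mass difference, which is what makes it useful for comparing related compounds.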
Guide for creating high-quality MCP (Model Context Protocol) servers that enable LLMs to interact with external...
This skill should be used when the user asks to "add MCP server", "integrate MCP", "configure MCP in plugin", "use...
Medicinal chemistry filters. Apply drug-likeness rules (Lipinski, Veber), PAINS filters, structural alerts,...
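Lipinski's rule of five, the first filter named, is simple enough to state directly: molecular weight at most 500 Da, logP at most 5, no more than 5 hydrogen-bond donors, and no more than 10 acceptors. A sketch, with the descriptor values supplied by the caller (in practice a toolkit like RDKit computes them):

```python
def passes_lipinski(mw, logp, h_donors, h_acceptors):
    # Lipinski's rule of five: MW <= 500 Da, logP <= 5,
    # <= 5 H-bond donors, <= 10 H-bond acceptors.
    return mw <= 500 and logp <= 5 and h_donors <= 5 and h_acceptors <= 10

# Aspirin (approximate descriptor values): MW 180.16, logP ~1.2, 1 donor, 4 acceptors
print(passes_lipinski(180.16, 1.2, 1, 4))  # True
```

Veber's rules add rotatable-bond and polar-surface-area limits on top of this, and PAINS filters work differently, by substructure matching rather than descriptor thresholds.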
Trains large language models (2B-462B parameters) using NVIDIA Megatron-Core with advanced parallelism strategies....
Track ML experiments, manage model registry with versioning, deploy models to production, and reproduce experiments...
Run Python code in the cloud with serverless containers, GPUs, and autoscaling. Use when deploying ML models,...
Merge multiple fine-tuned models using mergekit to combine capabilities without retraining. Use when creating...
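The simplest merge method is a weighted average of parameter tensors from models sharing an architecture. The dict-of-floats below stands in for real weight tensors; this illustrates what a linear merge computes, not mergekit's API:

```python
def linear_merge(state_dicts, weights):
    # Weighted average of each named parameter across checkpoints.
    total = sum(weights)
    return {
        name: sum(w * sd[name] for sd, w in zip(state_dicts, weights)) / total
        for name in state_dicts[0]
    }

m1 = {"layer.weight": 1.0}   # e.g. a code-tuned checkpoint
m2 = {"layer.weight": 3.0}   # e.g. a chat-tuned checkpoint
print(linear_merge([m1, m2], weights=[0.5, 0.5]))  # {'layer.weight': 2.0}
```

Methods like SLERP and TIES refine this, interpolating along the sphere or resolving sign conflicts between task vectors, but all operate purely on weights, with no gradient steps.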
Reduce LLM size and accelerate inference using pruning techniques like Wanda and SparseGPT. Use when compressing...
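Wanda's core idea fits in a few lines: score each weight by |w| times the norm of its input feature's activations, then zero the lowest-scoring fraction of each row, with no retraining. A plain-Python sketch over nested lists standing in for tensors:

```python
import math

def wanda_prune(weights, act_norms, sparsity=0.5):
    # Wanda score: |w_ij| * ||x_j||, weight magnitude times the activation
    # norm of its input feature; prune the lowest-scoring fraction per row.
    pruned = []
    for row in weights:
        scores = [abs(w) * n for w, n in zip(row, act_norms)]
        k = int(len(row) * sparsity)
        cutoff = sorted(scores)[k - 1] if k else -math.inf
        pruned.append([0.0 if s <= cutoff else w for w, s in zip(row, scores)])
    return pruned

W = [[0.5, -2.0, 0.1, 1.0]]
norms = [1.0, 0.1, 4.0, 1.0]   # per-input-feature activation norms
print(wanda_prune(W, norms))   # [[0.5, 0.0, 0.0, 1.0]]
```

Note that the large weight -2.0 is pruned because its inputs are nearly silent, while the small weight 0.1 survives only if its activations are strong enough; here its score of 0.4 still falls below the cutoff. SparseGPT goes further and adjusts the surviving weights to compensate for those removed.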
Train Mixture of Experts (MoE) models using DeepSpeed or HuggingFace. Use when training large-scale models with...
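The routing step that defines an MoE layer can be sketched independently of any framework: softmax over gate logits, send the token to the top-k experts, and mix their outputs by normalized gate weight. Experts and gates here are toy scalar functions, not real networks:

```python
import math

def moe_forward(x, experts, gates, k=2):
    # Top-k gating: pick the k highest-scoring experts for this input,
    # renormalize their softmax weights, and mix their outputs.
    logits = [g(x) for g in gates]
    topk = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    exps = [math.exp(logits[i]) for i in topk]
    z = sum(exps)
    return sum((e / z) * experts[i](x) for i, e in zip(topk, exps))

experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: -x]
gates   = [lambda x: 1.0,  lambda x: 0.0,   lambda x: -5.0]
y = moe_forward(1.0, experts, gates, k=2)
# routes to experts 0 and 1 (both output 2.0 here), so y ≈ 2.0
```

Because only k of the experts run per token, parameter count scales with the number of experts while per-token compute does not, which is the appeal of large-scale MoE training.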
Molecular featurization for ML (100+ featurizers). ECFP, MACCS, descriptors, pretrained models (ChemBERTa), convert...
Educational GPT implementation in ~300 lines. Reproduces GPT-2 (124M) on OpenWebText. Clean, hackable code for...
GPU-accelerated data curation for LLM training. Supports text/image/video/audio. Features fuzzy deduplication (16×...
NVIDIA's runtime safety framework for LLM applications. Features jailbreak detection, input/output validation,...