14574 results (117.2ms) page 36 / 729
zechenzhangAGI / ai-research-skills-speculative-decoding exact

Accelerate LLM inference using speculative decoding, Medusa multiple heads, and lookahead decoding techniques. Use when optimizing inference speed (1.5-3.6× speedup), reducing latency for...

zechenzhangAGI / ai-research-skills-outlines exact

Guarantee valid JSON/XML/code structure during generation, use Pydantic models for type-safe outputs, support local models (Transformers, vLLM), and maximize inference speed with Outlines -...

zechenzhangAGI / ai-research-skills-skypilot-multi-cloud-orchestration exact

Multi-cloud orchestration for ML workloads with automatic cost optimization. Use when you need to run training or batch jobs across multiple clouds, leverage spot instances with auto-recovery, or...

zechenzhangAGI / ai-research-skills-optimizing-attention-flash exact

Optimizes transformer attention with Flash Attention for 2-4x speedup and 10-20x memory reduction. Use when training/running transformers with long sequences (>512 tokens), encountering GPU memory...

oaustegard / claude-skills-check-tools exact

Validates development tool installations across Python, Node.js, Java, Go, Rust, C/C++, Git, and system utilities. Use when verifying environments or troubleshooting dependencies.

zechenzhangAGI / ai-research-skills-axolotl exact

Expert guidance for fine-tuning LLMs with Axolotl - YAML configs, 100+ models, LoRA/QLoRA, DPO/KTO/ORPO/GRPO, multimodal support

stotihv / skills-worker exact

Execute beads autonomously within a track. Handles bead-to-bead context persistence via Agent Mail, uses preferred tools from AGENTS.md, and reports progress to orchestrator.

zechenzhangAGI / ai-research-skills-nemo-guardrails exact

NVIDIA's runtime safety framework for LLM applications. Features jailbreak detection, input/output validation, fact-checking, hallucination detection, PII filtering, toxicity detection. Uses...

zechenzhangAGI / ai-research-skills-mamba-architecture exact

State-space model with O(n) complexity vs Transformers' O(n²). 5× faster inference, million-token sequences, no KV cache. Selective SSM with hardware-aware design. Mamba-1 (d_state=16) and Mamba-2...

zechenzhangAGI / ai-research-skills-nemo-curator exact

GPU-accelerated data curation for LLM training. Supports text/image/video/audio. Features fuzzy deduplication (16× faster), quality filtering (30+ heuristics), semantic deduplication, PII...

zechenzhangAGI / ai-research-skills-torchforge-rl-training exact

Provides guidance for PyTorch-native agentic RL using torchforge, Meta's library separating infra from algorithms. Use when you want clean RL abstractions, easy algorithm experimentation, or...

stotihv / skills-issue-resolution exact

Systematically diagnose and fix bugs through triage, reproduction, root cause analysis, and verified fixes. Use when resolving bugs, errors, failing tests, or investigating unexpected behavior.

zechenzhangAGI / ai-research-skills-evaluating-llms-harness exact

Evaluates LLMs across 60+ academic benchmarks (MMLU, HumanEval, GSM8K, TruthfulQA, HellaSwag). Use when benchmarking model quality, comparing models, reporting academic results, or tracking...

zechenzhangAGI / ai-research-skills-model-merging exact

Merge multiple fine-tuned models using mergekit to combine capabilities without retraining. Use when creating specialized models by blending domain-specific expertise (math + coding + chat),...

blindlove200 / sub-agents-skills-sub-agents exact

Execute external CLI AIs as isolated sub-agents for task delegation, parallel processing, and context separation. Use when delegating complex multi-step tasks, running parallel investigations,...

Interstellar-code / claud-skills-template-skill exact

Replace with description of the skill and when Claude should use it.

oaustegard / claude-skills-inspecting-skills exact

Discovers and indexes Python code in skills, enabling cross-skill imports. Use when importing functions from other skills or analyzing skill codebases.

frank-syncmarket / skills-code-roaster exact

🔥 用 Gordon Ramsay 风格毒舌吐槽代码质量,生成搞笑且实用的代码审查报告

zechenzhangAGI / ai-research-skills-modal-serverless-gpu exact

Serverless GPU cloud platform for running ML workloads. Use when you need on-demand GPU access without infrastructure management, deploying ML models as APIs, or running batch jobs with automatic scaling.

zechenzhangAGI / ai-research-skills-sglang exact

Fast structured generation and serving for LLMs with RadixAttention prefix caching. Use for JSON/regex outputs, constrained decoding, agentic workflows with tool calls, or when you need 5× faster...