Apply universal naming principles: avoid Manager/Helper/Util, use intention-revealing names, domain language, verb+noun functions. Use when naming variables, functions, classes, reviewing names,...
Activation-aware weight quantization for 4-bit LLM compression with 3x speedup and minimal accuracy loss. Use when deploying large models (7B-70B) on limited GPU memory, when you need faster...
Model Context Protocol (MCP) server patterns for building integrations with Claude Code. Triggers on: mcp server, model context protocol, tool handler, mcp resource, mcp tool.
GPU-optimized OCR using Surya. Use when: (1) Extracting text from images/screenshots, (2) Processing PDFs with embedded images, (3) Multi-language document OCR, (4) Layout analysis and table...
GPU-optimized OCR using Surya. Use when: (1) Extracting text from images/screenshots, (2) Processing PDFs with embedded images, (3) Multi-language document OCR, (4) Layout analysis and table...
Generate a high-level TLA+ model from source code (C, C++, Rust, etc.). Analyzes code to understand its purpose, creates abstractions, writes TLA+ specification, and proposes invariants and...
Analyze codebase with tokei (fast line counts by language) and difft (semantic AST-aware diffs). Get quick project overview without manual counting. Triggers on: how big is codebase, count lines...
Use when you need to empirically test whether hypothesized symmetries actually hold in your data or model. Invoke when user mentions testing invariance, validating equivariance, checking if...
Accelerate LLM inference using speculative decoding, Medusa multiple heads, and lookahead decoding techniques. Use when optimizing inference speed (1.5-3.6Γ speedup), reducing latency for...
Accelerate LLM inference using speculative decoding, Medusa multiple heads, and lookahead decoding techniques. Use when optimizing inference speed (1.5-3.6Γ speedup), reducing latency for...
Use when starting technical work requiring structured approach - writing tests before code (TDD), planning data exploration (EDA), designing statistical analysis, clarifying modeling objectives...
Extend context windows of transformer models using RoPE, YaRN, ALiBi, and position interpolation techniques. Use when processing long documents (32k-128k+ tokens), extending pre-trained models...
Extend context windows of transformer models using RoPE, YaRN, ALiBi, and position interpolation techniques. Use when processing long documents (32k-128k+ tokens), extending pre-trained models...
Extend context windows of transformer models using RoPE, YaRN, ALiBi, and position interpolation techniques. Use when processing long documents (32k-128k+ tokens), extending pre-trained models...
Design or redesign an org structure and operating model by producing an Organizational Design Pack (design brief, current-state map, operating-model decision, target org blueprint, transition...
Use this skill when building AI features, integrating LLMs, implementing RAG, working with embeddings, deploying ML models, or doing data science. Activates on mentions of OpenAI, Anthropic,...
Guide for Convex backend development fundamentals including function types (queries, mutations, actions), layered architecture, HTTP actions, and the core mental model. Use when building Convex...
Implement and maintain integrations with Nebius Token Factory (tokenfactory.nebius.com) OpenAI-compatible API, including auth + base URL config, chat/completions/embeddings/images, model...
Comprehensive healthcare AI toolkit for developing, testing, and deploying machine learning models with clinical data. This skill should be used when working with electronic health records (EHR),...
Comprehensive healthcare AI toolkit for developing, testing, and deploying machine learning models with clinical data. This skill should be used when working with electronic health records (EHR),...