Expert guidance for Fully Sharded Data Parallel training with PyTorch FSDP - parameter sharding, mixed precision, CPU offloading, FSDP2
Quantizes LLMs to 8-bit or 4-bit for 50-75% memory reduction with minimal accuracy loss. Use when GPU memory is limited, need to fit larger models, or want faster inference. Supports INT8, NF4,...
Analyze and implement purposeful UI animations for Next.js + Tailwind + React projects. Use when user asks to add animations, enhance UI motion, animate pages/components, or improve visual...
Process and analyze massive documents (PDF, Word, HTML, JSON, Markdown) that exceed context limits. Use when a document is too large to fit in context, when you need to search large files, find...
Run accessibility and visual design review on components. Use when reviewing UI code for WCAG compliance and design issues.
Analyzes and assists with Fujitsu mainframe systems including FACOM, PRIMERGY, BS2000/OSD, OSIV/MSP, OSIV/XSP, NetCOBOL, PowerCOBOL, and Fujitsu JCL. Extracts business logic from Fujitsu COBOL...
Conducts comprehensive code quality reviews including code smells detection, maintainability assessment, complexity analysis, design pattern evaluation, naming conventions, code duplication,...
Half-Quadratic Quantization for LLMs without calibration data. Use when quantizing models to 4/3/2-bit precision without needing calibration datasets, for fast quantization workflows, or when...
Use when user wants to execute long-running tasks that require multiple sessions to complete. This skill manages task decomposition, progress tracking, and autonomous execution using Claude Code...
Jeffrey Emanuel's comprehensive markdown planning methodology for software projects. The 85%+ time-on-planning approach that makes agentic coding work at scale. Includes exact prompts used.
Agentic MCP - Three-layer progressive disclosure for MCP servers with Socket daemon. Use when the user needs to interact with MCP servers, query available tools, call MCP tools, or manage the MCP...
GitHub CLI - manage repositories, issues, pull requests, actions, releases, and more from the command line.
Creates high-quality technical documentation including API documentation, user guides, tutorials, architecture documents, README files, release notes, and technical specifications. Produces clear,...
Google Cloud Platform CLI - manage GCP resources including Compute Engine, Cloud Run, GKE, Cloud Functions, Storage, BigQuery, and more.
Framework for state-of-the-art sentence, text, and image embeddings. Provides 5000+ pre-trained models for semantic similarity, clustering, and retrieval. Supports multilingual, domain-specific,...
深度调研的多Agent编排工作流:把一个调研目标拆成可并行子目标,用 Claude Code 非交互模式(`claude -p`)运行子进程;联网与采集优先使用已安装的 skills,其次使用 MCP 工具;用脚本聚合子结果并分章精修,最终交付"成品报告文件路径 +...
Brief description of what this skill does and when to use it.
Optimizes transformer attention with Flash Attention for 2-4x speedup and 10-20x memory reduction. Use when training/running transformers with long sequences (>512 tokens), encountering GPU memory...
Educational GPT implementation in ~300 lines. Reproduces GPT-2 (124M) on OpenWebText. Clean, hackable code for learning transformers. By Andrej Karpathy. Perfect for understanding GPT architecture...
SSH remote access patterns and utilities. Connect to servers, manage keys, tunnels, and transfers.