Runs LLM inference on CPU, Apple Silicon, and consumer GPUs without NVIDIA hardware. Use for edge deployment, M1/M2/M3 Macs, AMD/Intel GPUs, or when CUDA is unavailable. Supports GGUF quantization...
Expert guidance for Fully Sharded Data Parallel training with PyTorch FSDP: parameter sharding, mixed precision, CPU offloading, and FSDP2.
Quantizes LLMs to 8-bit or 4-bit for 50-75% memory reduction with minimal accuracy loss. Use when GPU memory is limited, when you need to fit larger models, or when you want faster inference. Supports INT8, NF4,...
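Intelligent changelog generator: automatically analyzes Git commit history and generates a standards-compliant CHANGELOG.md. Supports semantic versioning, multiple output formats, incremental updates, and GitHub/GitLab integration.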
Run accessibility and visual design review on components. Use when reviewing UI code for WCAG compliance and design issues.
Distributed training orchestration across clusters. Scales PyTorch/TensorFlow/HuggingFace from laptop to 1000s of nodes. Built-in hyperparameter tuning with Ray Tune, fault tolerance, elastic...
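Multi-agent orchestration workflow for deep research: splits a research goal into parallelizable sub-goals and runs them as subprocesses in Claude Code non-interactive mode (`claude -p`); prefers installed skills for web access and data collection, falling back to MCP tools; aggregates the sub-results with scripts and refines them chapter by chapter, finally delivering "final report file path +...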
Control WezTerm terminal emulator via CLI. Manage panes, tabs, workspaces, and execute commands in running terminals.
Framework for state-of-the-art sentence, text, and image embeddings. Provides 5000+ pre-trained models for semantic similarity, clustering, and retrieval. Supports multilingual, domain-specific,...
Comprehensive backend development workflow that orchestrates expert analysis, architecture design, implementation, and deployment using the integrated toolset. Handles everything from API design...
Guides infrastructure and platform migration including cloud-to-cloud migration (AWS to GCP, Azure to AWS), Kubernetes cluster migration, CI/CD platform changes, monitoring stack migration,...
Run knip to find and remove unused files, dependencies, and exports. Use for cleaning up dead code and unused dependencies.
Extract structured data from LLM responses with Pydantic validation, retry failed extractions automatically, parse complex JSON with type safety, and stream partial results with Instructor -...
Provides guidance for mechanistic interpretability research using TransformerLens to inspect and manipulate transformer internals via HookPoints and activation caching. Use when...
Process and analyze massive documents (PDF, Word, HTML, JSON, Markdown) that exceed context limits. Use when a document is too large to fit in context, when you need to search large files, find...
Conducts comprehensive backend code reviews including API design (REST/GraphQL/gRPC), database patterns, authentication/authorization, caching strategies, message queues, microservices...
Guides systematic code refactoring to improve code quality, maintainability, and design. Identifies code smells, applies refactoring patterns, ensures test coverage, and follows safe refactoring...
Optimizes transformer attention with Flash Attention for 2-4x speedup and 10-20x memory reduction. Use when training/running transformers with long sequences (>512 tokens), encountering GPU memory...
Analyzes legacy RPG (Report Program Generator) programs from AS/400 and IBM i systems for migration to modern Java applications. Extracts business logic from RPG III/IV/ILE source code, identifies...