Optimizes transformer attention with Flash Attention for 2-4x speedup and 10-20x memory reduction. Use when training/running transformers with long sequences (>512 tokens), encountering GPU memory...
Reduce LLM size and accelerate inference using pruning techniques like Wanda and SparseGPT. Use when compressing models without retraining, achieving 50% sparsity with minimal accuracy loss, or...
GitHub CLI - manage repositories, issues, pull requests, actions, releases, and more from the command line.
Analyzes legacy COBOL programs and JCL jobs to assist with migration to modern Java applications. Extracts business logic, identifies dependencies, generates migration reports, and creates Java...
Distributed training orchestration across clusters. Scales PyTorch/TensorFlow/HuggingFace from laptop to 1000s of nodes. Built-in hyperparameter tuning with Ray Tune, fault tolerance, elastic...
Provides comprehensive Google Cloud Platform (GCP) guidance including Compute Engine, Cloud Storage, Cloud SQL, BigQuery, GKE (Google Kubernetes Engine), Cloud Functions, Cloud Run, VPC...
Best practices and workflows for the Spotify Web API. Use when building apps that interact with Spotify to: (1) Handle rate limits and 429 errors, (2) Implement efficient polling strategies, (3)...
Evaluates LLMs across 100+ benchmarks from 18+ harnesses (MMLU, HumanEval, GSM8K, safety, VLM) with multi-backend execution. Use when needing scalable evaluation on local Docker, Slurm HPC, or...
Run knip to find and remove unused files, dependencies, and exports. Use for cleaning up dead code and unused dependencies.
Fine-tune LLMs using reinforcement learning with TRL - SFT for instruction tuning, DPO for preference alignment, PPO/GRPO for reward optimization, and reward model training. Use when need RLHF,...
|
High-performance RLHF framework with Ray+vLLM acceleration. Use for PPO, GRPO, RLOO, DPO training of large models (7B-70B+). Built on Ray, vLLM, ZeRO-3. 2× faster than DeepSpeedChat with...
Analyzes software bugs including root cause identification, severity assessment, impact analysis, reproduction steps validation, and fix recommendations. Performs bug triage, categorization,...
Guides comprehensive requirements gathering and analysis including stakeholder interviews, user story creation, use case documentation, acceptance criteria, requirements prioritization, and...
Agentic MCP - Three-layer progressive disclosure for MCP servers with Socket daemon. Use when the user needs to interact with MCP servers, query available tools, call MCP tools, or manage the MCP...
深度调研的多Agent编排工作流:把一个调研目标拆成可并行子目标,用 Claude Code 非交互模式(`claude -p`)运行子进程;联网与采集优先使用已安装的 skills,其次使用 MCP 工具;用脚本聚合子结果并分章精修,最终交付"成品报告文件路径 +...
Use Bun instead of Node.js, npm, pnpm, or vite. Provides command mappings, Bun-specific APIs, and development patterns.
SSH remote access patterns and utilities. Connect to servers, manage keys, tunnels, and transfers.
Conducts comprehensive architecture design reviews including system design validation, architecture pattern assessment, quality attributes evaluation, technology stack review, and scalability...
Postgres performance optimization and best practices from Supabase. Use this skill when writing, reviewing, or optimizing Postgres queries, schema designs, or database configurations.