Runs LLM inference on CPU, Apple Silicon, and consumer GPUs without NVIDIA hardware. Use for edge deployment, M1/M2/M3 Macs, AMD/Intel GPUs, or when CUDA is unavailable. Supports GGUF quantization...
Half-Quadratic Quantization for LLMs without calibration data. Use when quantizing models to 4/3/2-bit precision without needing calibration datasets, for fast quantization workflows, or when...
Quantizes LLMs to 8-bit or 4-bit for 50-75% memory reduction with minimal accuracy loss. Use when GPU memory is limited, need to fit larger models, or want faster inference. Supports INT8, NF4,...
Simplest distributed training API. 4 lines to add distributed support to any PyTorch script. Unified API for DeepSpeed/FSDP/Megatron/DDP. Automatic device placement, mixed precision...
Activation-aware weight quantization for 4-bit LLM compression with 3x speedup and minimal accuracy loss. Use when deploying large models (7B-70B) on limited GPU memory, when you need faster...
Open-source embedding database for AI applications. Store embeddings and metadata, perform vector and full-text search, filter by metadata. Simple 4-function API. Scales from notebooks to...
Post-training 4-bit quantization for LLMs with minimal accuracy loss. Use for deploying large models (70B, 405B) on consumer GPUs, when you need 4Γ memory reduction with <2% perplexity...
Implement Algorand ARC standards in smart contracts and clients. Use when needing to understand ARC-4 ABI encoding or method signatures, working with application specifications (ARC-32 or ARC-56),...
Simple Preference Optimization for LLM alignment. Reference-free alternative to DPO with better performance (+6.4 points on AlpacaEval 2.0). No reference model needed, more efficient than DPO. Use...
Translate EPUB (.epub) ebooks to another language by unpacking the EPUB container, extracting XHTML block-level fragments into JSONL translation units, translating them via the OpenAI Responses...
Expert guidance for fine-tuning LLMs with LLaMA-Factory - WebUI no-code, 100+ models, 2/3/4/5/6/8-bit QLoRA, multimodal support
This skill should be used when the user asks to "diagnose context problems", "fix lost-in-middle issues", "debug agent failures", "understand context poisoning", or mentions context degradation,...
Comprehensive guide to using LLMs throughout the game development lifecycle - from design to implementation to testingUse when "ai game development, llm game dev, claude game, gpt game, ai coding...
Expert guidance for Helm chart development, testing, security, OCI registries, and Helm 4 features. Use when creating charts, implementing tests, signing charts, working with OCI registries, or...
Comprehensive guide for Zod 4 schema validation library. This skill should be used when migrating from Zod 3, learning Zod 4 idioms, or building new validation schemas. Covers breaking changes,...
# Liberty House Club **A Parallel Binance Chain to Enable Smart Contracts** _NOTE: This document is under development. Please check regularly for updates!_ ## Table of Contents -...
Svelte 5 syntax reference. Use when writing ANY Svelte component. Svelte 5 uses runes ($state, $derived, $effect, $props) instead of Svelte 4 patterns. Training data is heavily Svelte 4βthis skill...
Skip to content github / docs Code Issues 80 Pull requests 35 Discussions Actions Projects 2 Security Insights Merge branch 'main' into 1862-Add-Travis-CI-migration-table ...
Fast workflow for small changes, bug fixes, and UI tweaks that don't require full feature development. Uses sub-agent orchestration with model selection (Sonnet 4.5 orchestrator, Haiku 4.5...
Orchestrates BMAD workflows for structured AI-driven development. Use when initializing BMAD in projects, checking workflow status, or routing between 4 phases (Analysis, Planning, Solutioning,...