Runs LLM inference on CPU, Apple Silicon, and consumer GPUs without NVIDIA hardware. Use for edge deployment, M1/M2/M3 Macs, AMD/Intel GPUs, or when CUDA is unavailable. Supports GGUF quantization...
Provides comprehensive Microsoft Azure guidance including Azure Virtual Machines, Azure Storage (Blob, Files, Disks), Azure SQL Database, Azure App Service, Azure Functions, AKS (Azure Kubernetes...
Language-independent tokenizer treating text as raw Unicode. Supports BPE and Unigram algorithms. Fast (50k sentences/sec), lightweight (6MB memory), deterministic vocabulary. Used by T5, ALBERT,...
Guides comprehensive application migration projects including legacy system modernization, cloud migration, technology stack upgrades, database migration, and architecture transformation. Covers...
Iterative UI/UX polishing workflow for web applications. The exact prompt and methodology for achieving Stripe-level visual polish through multiple passes.
Create interactive, production-ready UI mockups and prototypes using NuxtJS 4 (Vue) or Next.js (React), TypeScript, and TailwindCSS v4. Use when building web mockups, prototypes, landing pages,...
Parameter-efficient fine-tuning for LLMs using LoRA, QLoRA, and 25+ methods. Use when fine-tuning large models (7B-70B) with limited GPU memory, when you need to train <1% of parameters with...
High-performance RLHF framework with Ray+vLLM acceleration. Use for PPO, GRPO, RLOO, DPO training of large models (7B-70B+). Built on Ray, vLLM, ZeRO-3. 2Γ faster than DeepSpeedChat with...
Deploy and manage Vercel projects, domains, environment variables, and serverless functions using the `vercel` CLI.
|
Generate a complete set of favicons from a source image and update HTML. Use when setting up favicons for a web project.
Manage Supabase projects, databases, migrations, Edge Functions, and storage using the `supabase` CLI.
Expert backend development guidance covering Node.js, Python, Java, Go, API design (REST/GraphQL/gRPC), database patterns, authentication, caching, message queues, microservices, and testing....
Track ML experiments, manage model registry with versioning, deploy models to production, and reproduce experiments with MLflow - framework-agnostic ML lifecycle platform
Multi-cloud orchestration for ML workloads with automatic cost optimization. Use when you need to run training or batch jobs across multiple clouds, leverage spot instances with auto-recovery, or...
Provides comprehensive Google Cloud Platform (GCP) guidance including Compute Engine, Cloud Storage, Cloud SQL, BigQuery, GKE (Google Kubernetes Engine), Cloud Functions, Cloud Run, VPC...
Provides comprehensive DevOps guidance including CI/CD pipeline design, infrastructure as code (Terraform, CloudFormation, Bicep), container orchestration (Docker, Kubernetes), deployment...
Reduce LLM size and accelerate inference using pruning techniques like Wanda and SparseGPT. Use when compressing models without retraining, achieving 50% sparsity with minimal accuracy loss, or...
Provides comprehensive AWS (Amazon Web Services) guidance including EC2, S3, RDS, Lambda, ECS/EKS, CloudFormation, API Gateway, CloudFront, cloud migration from on-premise/GCP/Azure, security...
Designs comprehensive integration testing strategies including API testing, database testing, microservices testing, end-to-end testing, and test automation frameworks. Produces test plans, test...