7111 results (49.2ms) page 35 / 356
whop 0.00
raintree-technology / claude-starter-whop exact

Whop platform expert for digital products, memberships, and community monetization. Covers memberships API, payments, courses, forums, webhooks, OAuth apps, and checkout integration. Build SaaS,...

gptq 0.00
zechenzhangAGI / ai-research-skills-gptq exact

Post-training 4-bit quantization for LLMs with minimal accuracy loss. Use for deploying large models (70B, 405B) on consumer GPUs, when you need 4Γ— memory reduction with <2% perplexity...

zechenzhangAGI / ai-research-skills-sentencepiece exact

Language-independent tokenizer treating text as raw Unicode. Supports BPE and Unigram algorithms. Fast (50k sentences/sec), lightweight (6MB memory), deterministic vocabulary. Used by T5, ALBERT,...

zechenzhangAGI / ai-research-skills-qdrant-vector-search exact

High-performance vector similarity search engine for RAG and semantic search. Use when building production RAG systems requiring fast nearest neighbor search, hybrid search with filtering, or...

zechenzhangAGI / ai-research-skills-slime-rl-training exact

Provides guidance for LLM post-training with RL using slime, a Megatron+SGLang framework. Use when training GLM models, implementing custom data generation workflows, or needing tight Megatron-LM...

zechenzhangAGI / ai-research-skills-torchforge-rl-training exact

Provides guidance for PyTorch-native agentic RL using torchforge, Meta's library separating infra from algorithms. Use when you want clean RL abstractions, easy algorithm experimentation, or...

zechenzhangAGI / ai-research-skills-model-merging exact

Merge multiple fine-tuned models using mergekit to combine capabilities without retraining. Use when creating specialized models by blending domain-specific expertise (math + coding + chat),...

zechenzhangAGI / ai-research-skills-moe-training exact

Train Mixture of Experts (MoE) models using DeepSpeed or HuggingFace. Use when training large-scale models with limited compute (5Γ— cost reduction vs dense models), implementing sparse...

zechenzhangAGI / ai-research-skills-unsloth exact

Expert guidance for fast fine-tuning with Unsloth - 2-5x faster training, 50-80% less memory, LoRA/QLoRA optimization

zechenzhangAGI / ai-research-skills-quantizing-models-bitsandbytes exact

Quantizes LLMs to 8-bit or 4-bit for 50-75% memory reduction with minimal accuracy loss. Use when GPU memory is limited, need to fit larger models, or want faster inference. Supports INT8, NF4,...

aptos 0.00
raintree-technology / claude-starter-aptos exact

Aptos blockchain and Move language expert. Covers Move programming (abilities, generics, resources), Aptos framework modules, smart contract development, token standards (Coin, Fungible Asset,...

zechenzhangAGI / ai-research-skills-nnsight-remote-interpretability exact

Provides guidance for interpreting and manipulating neural network internals using nnsight with optional NDIF remote execution. Use when needing to run interpretability experiments on massive...

zechenzhangAGI / ai-research-skills-axolotl exact

Expert guidance for fine-tuning LLMs with Axolotl - YAML configs, 100+ models, LoRA/QLoRA, DPO/KTO/ORPO/GRPO, multimodal support

zechenzhangAGI / ai-research-skills-optimizing-attention-flash exact

Optimizes transformer attention with Flash Attention for 2-4x speedup and 10-20x memory reduction. Use when training/running transformers with long sequences (>512 tokens), encountering GPU memory...

zechenzhangAGI / ai-research-skills-model-pruning exact

Reduce LLM size and accelerate inference using pruning techniques like Wanda and SparseGPT. Use when compressing models without retraining, achieving 50% sparsity with minimal accuracy loss, or...

zechenzhangAGI / ai-research-skills-pinecone exact

Managed vector database for production AI applications. Fully managed, auto-scaling, with hybrid search (dense + sparse), metadata filtering, and namespaces. Low latency (<100ms p95). Use for...

zechenzhangAGI / ai-research-skills-nemo-guardrails exact

NVIDIA's runtime safety framework for LLM applications. Features jailbreak detection, input/output validation, fact-checking, hallucination detection, PII filtering, toxicity detection. Uses...

zechenzhangAGI / ai-research-skills-serving-llms-vllm exact

Serves LLMs with high throughput using vLLM's PagedAttention and continuous batching. Use when deploying production LLM APIs, optimizing inference latency/throughput, or serving models with...