2192 results (43.8ms) page 18 / 110
zechenzhangAGI / ai-research-skills-hqq-quantization exact

Half-Quadratic Quantization for LLMs without calibration data. Use when quantizing models to 4/3/2-bit precision without needing calibration datasets, for fast quantization workflows, or when...

Ronitnair / research-skills-medical-imaging-review exact

Write comprehensive literature reviews for medical imaging AI research. Use when writing survey papers, systematic reviews, or literature analyses on topics like segmentation, detection,...

Makiya1202 / ai-agents-skills-vercel exact

Deploy and configure applications on Vercel. Use when deploying Next.js apps, configuring serverless functions, setting up edge functions, or managing Vercel projects. Triggers on Vercel, deploy,...

zechenzhangAGI / ai-research-skills-serving-llms-vllm exact

Serves LLMs with high throughput using vLLM's PagedAttention and continuous batching. Use when deploying production LLM APIs, optimizing inference latency/throughput, or serving models with...

zechenzhangAGI / ai-research-skills-knowledge-distillation exact

Compress large language models using knowledge distillation from teacher to student models. Use when deploying smaller models with retained performance, transferring GPT-4 capabilities to...

jsh 0.00
ikajakam / gemini-skills-jsh exact

JavaScript security auditor using jsh_helper for reconnaissance

jaykaycodes / codebase-analyzer-mcp-add-mcp-tool exact

This skill guides adding new MCP tools to the codebase-analyzer server. Use when extending the analyzer with new capabilities like new analysis types, query tools, or integrations.

zechenzhangAGI / ai-research-skills-deepspeed exact

Expert guidance for distributed training with DeepSpeed - ZeRO optimization stages, pipeline parallelism, FP16/BF16/FP8, 1-bit Adam, sparse attention

zechenzhangAGI / ai-research-skills-llama-cpp exact

Runs LLM inference on CPU, Apple Silicon, and consumer GPUs without NVIDIA hardware. Use for edge deployment, M1/M2/M3 Macs, AMD/Intel GPUs, or when CUDA is unavailable. Supports GGUF quantization...

zechenzhangAGI / ai-research-skills-mlflow exact

Track ML experiments, manage model registry with versioning, deploy models to production, and reproduce experiments with MLflow - framework-agnostic ML lifecycle platform

zechenzhangAGI / ai-research-skills-unsloth exact

Expert guidance for fast fine-tuning with Unsloth - 2-5x faster training, 50-80% less memory, LoRA/QLoRA optimization

zechenzhangAGI / ai-research-skills-model-pruning exact

Reduce LLM size and accelerate inference using pruning techniques like Wanda and SparseGPT. Use when compressing models without retraining, achieving 50% sparsity with minimal accuracy loss, or...

zechenzhangAGI / ai-research-skills-outlines exact

Guarantee valid JSON/XML/code structure during generation, use Pydantic models for type-safe outputs, support local models (Transformers, vLLM), and maximize inference speed with Outlines -...

Makiya1202 / ai-agents-skills-railway exact

Deploy applications on Railway platform. Use when deploying containerized apps, setting up databases, configuring private networking, or managing Railway projects. Triggers on Railway,...

zechenzhangAGI / ai-research-skills-qdrant-vector-search exact

High-performance vector similarity search engine for RAG and semantic search. Use when building production RAG systems requiring fast nearest neighbor search, hybrid search with filtering, or...

zechenzhangAGI / ai-research-skills-tensorrt-llm exact

Optimizes LLM inference with NVIDIA TensorRT for maximum throughput and lowest latency. Use for production deployment on NVIDIA GPUs (A100/H100), when you need 10-100x faster inference than...

zechenzhangAGI / ai-research-skills-rwkv-architecture exact

RNN+Transformer hybrid with O(n) inference. Linear time, infinite context, no KV cache. Train like GPT (parallel), infer like RNN (sequential). Linux Foundation AI project. Production at Windows,...

Makiya1202 / ai-agents-skills-owasp-security exact

Implement secure coding practices following OWASP Top 10. Use when preventing security vulnerabilities, implementing authentication, securing APIs, or conducting security reviews. Triggers on...

zechenzhangAGI / ai-research-skills-llamaindex exact

Data framework for building LLM applications with RAG. Specializes in document ingestion (300+ connectors), indexing, and querying. Features vector indices, query engines, agents, and multi-modal...

figma 0.00
Makiya1202 / ai-agents-skills-figma exact

Integrate with Figma API for design automation and code generation. Use when extracting design tokens, generating React/CSS code from Figma components, syncing design systems, building Figma...