Use when integrating with Polza.ai API, writing code for AI model calls, configuring OpenAI-compatible clients with Polza.ai base URL, or when user mentions polza
Simple Preference Optimization for LLM alignment. Reference-free alternative to DPO with better performance (+6.4 points on AlpacaEval 2.0). No reference model needed, more efficient than DPO. Use...
Simple Preference Optimization for LLM alignment. Reference-free alternative to DPO with better performance (+6.4 points on AlpacaEval 2.0). No reference model needed, more efficient than DPO. Use...
Invoke the 'opencode' CLI in headless mode for AI-powered code analysis, reviews, and second opinions. Use when you need a different AI perspective, the user mentions opencode, or requests batch...
>
>
Fast structured generation and serving for LLMs with RadixAttention prefix caching. Use for JSON/regex outputs, constrained decoding, agentic workflows with tool calls, or when you need 5Γ faster...
Fast structured generation and serving for LLMs with RadixAttention prefix caching. Use for JSON/regex outputs, constrained decoding, agentic workflows with tool calls, or when you need 5Γ faster...
|
|
A specialized CPO + Technical Partner agent that helps users incubate ideas, analyze feasibility, and document specifications. Use when the user has a new product idea, technical proposal, or...
Automatically generates illustrations for articles, uploads to image hosting (ImgBB), and inserts Markdown links. Use when adding images to articles, when user requests illustration, or when...
Design AI loading, thinking, and progress indicator UX. Use when explicitly asked to improve AI waiting states, add thinking indicators, or design loading UX for AI interfaces. Covers reasoning...
Web research with Graph-of-Thoughts for fast-changing topics. Use when user requests research, analysis, investigation, or comparison requiring current information. Features hypothesis testing,...
>
Activation-aware weight quantization for 4-bit LLM compression with 3x speedup and minimal accuracy loss. Use when deploying large models (7B-70B) on limited GPU memory, when you need faster...
Build production-ready LLM applications, advanced RAG systems, and
Build production-ready LLM applications, advanced RAG systems, and
Build production-ready LLM applications, advanced RAG systems, and
Provides guidance for training LLMs with reinforcement learning using verl (Volcano Engine RL). Use when implementing RLHF, GRPO, PPO, or other RL algorithms for LLM post-training at scale with...