Testing and benchmarking LLM agents, including behavioral testing, capability assessment, reliability metrics, and production monitoring, where even top agents achieve less than 50% on real-world...
DeFi protocol specialist for AMMs, lending protocols, yield strategies, and economic security. Use when "defi, amm, liquidity pool, lending protocol, yield farming, oracle, liquidation, tokenomics,...
NextDNS Web UI configuration and management best practices. This skill should be used when configuring NextDNS profiles via the web interface (my.nextdns.io), including security features, privacy...
World-class voiceover expertise combining the narrative craft of documentary producers, the commercial precision of advertising agencies, and the accessibility of modern AI voice technology....
World-class alternative data and sentiment analysis for trading - social media, news, on-chain data, positioning. Extract alpha from information others miss. Use when "sentiment, alternative data,...
World-class character and art style consistency for AI-generated images and videos - ensures visual coherence across series, maintains character identity, and provides rigorous QA before...
Check Munich public transport (MVG) for disruptions, strikes, and service alerts. Use when user asks about Munich transport status, MVG problems, S-Bahn/U-Bahn delays, or needs commute planning in Munich.
Help founders raise capital and build investor relationships. Use when someone is preparing a pitch deck, deciding whether to raise venture capital, meeting with investors, or asking about...
SEO fundamentals, E-E-A-T, Core Web Vitals, and Google algorithm principles.
Expertise in building immersive VR/AR experiences using WebXR and spatial computing principles. Use when "vr development, ar development, webxr, virtual reality, augmented reality, mixed reality,...
Converts a markdown string or file into a high-quality PNG image using markdown-to-poster and Puppeteer.