Testing and benchmarking LLM agents including behavioral testing, capability assessment, reliability metrics, and production monitoring—where even top agents achieve less than 50% on real-world...
Browser automation powers web testing, scraping, and AI agent interactions. The difference between a flaky script and a reliable system comes down to understanding selectors, waiting strategies,...
Guided statistical analysis with test selection and reporting. Use when you need help choosing appropriate tests for your data, assumption checking, power analysis, and APA-formatted results. Best...
Advanced swarm orchestration patterns for research, development, testing, and complex distributed workflows
Browser automation for E2E testing. Use when testing user journeys, verifying UI behavior, or running end-to-end tests.
Use when you need to empirically test whether hypothesized symmetries actually hold in your data or model. Invoke when user mentions testing invariance, validating equivariance, checking if...
Generate speech from text using Google Gemini TTS models via scripts/. Use for text-to-speech, audio generation, voice synthesis, multi-speaker conversations, and creating audio content. Supports...
AI-powered Skill (SKILL.md) generator for Claude Code. React + Vite + Gemini API.
Generate images, videos, and audio with fal.ai serverless AI. Use when building AI image generation, video generation, image editing, or real-time AI features. Triggers on fal.ai, fal, AI image...
Test-Driven Development Iron Law. Write the test first. Watch it fail. Write minimal code to pass. No exceptions.
Comprehensive GitHub release orchestration with AI swarm coordination for automated versioning, testing, deployment, and rollback management
Use when you need to generate many creative options before systematically narrowing to the best choices. Invoke when exploring product ideas, solving open-ended problems, generating strategic...
Test-Driven Development (TDD) specialist enforcing write-tests-first methodology. MUST USE when: fixing bugs (버그 수정), implementing new features (기능 구현), refactoring code, '/fix-issue' invoked,...
Verify AI-generated code follows TDD discipline. Use to audit commits, check coverage quality, detect TDD anti-patterns, and generate compliance scorecards.
Vitest-specific testing utilities, mocking, and assertion patterns. Extends platform-testing with Vitest rules. Use when writing tests with Vitest.
Generate images with Google's Nano Banana Pro (Gemini 3 Pro Image). Use when generating AI images via Gemini API, creating professional visuals, or building image generation features. Triggers on...
Comprehensive Chrome DevTools automation for performance testing, Core Web Vitals measurement (INP, LCP, CLS), network monitoring, accessibility validation, responsive testing, and browser...