ramidamolis-alt / agent-skills-workflows-agent-evaluation

Testing and benchmarking LLM agents including behavioral testing, capability assessment, reliability metrics, and production monitoring—where even top agents achieve less than 50% on real-world...

404kidwiz / agent-skills-backup-agent-evaluation

Testing and benchmarking LLM agents including behavioral testing, capability assessment, reliability metrics, and production monitoring—where even top agents achieve less than 50% on real-world...

sickn33 / antigravity-awesome-skills-agent-evaluation

Testing and benchmarking LLM agents including behavioral testing, capability assessment, reliability metrics, and production monitoring—where even top agents achieve less than 50% on real-world...

Ianfr13 / claude-code-plugins-agent-evaluation

Testing and benchmarking LLM agents including behavioral testing, capability assessment, reliability metrics, and production monitoring—where even top agents achieve less than 50% on real-world...

halay08 / fullstack-agent-skills-agent-evaluation

Testing and benchmarking LLM agents including behavioral testing, capability assessment, reliability metrics, and production monitoring—where even top agents achieve less than 50% on real-world...

cleodin / antigravity-awesome-skills-agent-evaluation

Testing and benchmarking LLM agents including behavioral testing, capability assessment, reliability metrics, and production monitoring—where even top agents achieve less than 50% on real-world...

ncklrs / startup-os-skills-product-strategist

Expert product strategist for vision, strategy, and market positioning. Use when defining product vision, assessing product-market fit, sizing market opportunities (TAM/SAM/SOM), competitive...

ngxtm / devkit-broken-authentication-testing

This skill should be used when the user asks to "test for broken authentication vulnerabilities", "assess session management security", "perform credential stuffing tests", "evaluate password...

cleodin / antigravity-awesome-skills-broken-authentication-testing

This skill should be used when the user asks to "test for broken authentication vulnerabilities", "assess session management security", "perform credential stuffing tests", "evaluate password...

zebbern / claude-code-guide-broken-authentication-testing

This skill should be used when the user asks to "test for broken authentication vulnerabilities", "assess session management security", "perform credential stuffing tests", "evaluate password...

halay08 / fullstack-agent-skills-broken-authentication-testing

This skill should be used when the user asks to "test for broken authentication vulnerabilities", "assess session management security", "perform credential stuffing tests", "evaluate password...

404kidwiz / agent-skills-backup-broken-authentication-testing

This skill should be used when the user asks to "test for broken authentication vulnerabilities", "assess session management security", "perform credential stuffing tests", "evaluate password...

sickn33 / antigravity-awesome-skills-broken-authentication-testing

This skill should be used when the user asks to "test for broken authentication vulnerabilities", "assess session management security", "perform credential stuffing tests", "evaluate password...

ali5ter / claude-cli-ux-skill-cli-ux-tester

Expert UX evaluator for command-line interfaces, CLIs, terminal tools, shell scripts, and developer APIs. Use proactively when reviewing CLIs, testing command usability, evaluating error messages,...

onewave-ai / claude-skills-team-chemistry-evaluator

Analyze roster fit and personality dynamics. Leadership assessment, role clarity, locker room culture, trade/signing impact.

erichowens / some-claude-skills-photo-composition-critic

Expert photography composition critic grounded in graduate-level visual aesthetics education, computational aesthetics research (AVA, NIMA, LAION-Aesthetics, VisualQuality-R1), and professional...

schwepps / skills-solidity-auditor

Professional-grade Solidity smart contract security auditor. Performs comprehensive audits or targeted reviews (security vulnerabilities, gas optimization, storage optimization, code architecture,...

erichowens / some-claude-skills-product-appeal-analyzer

Evaluate product desirability, market positioning, and emotional resonance—the complement to friction analysis. Assess whether users will WANT a product (not just use it), identity fit, trust...