automindtechnologie-jpg / ultimate-skill-md-agent-evaluation exact

Testing and benchmarking LLM agents including behavioral testing, capability assessment, reliability metrics, and production monitoring—where even top agents achieve less than 50% on real-world...

cleodin / antigravity-awesome-skills-agent-evaluation exact

Testing and benchmarking LLM agents including behavioral testing, capability assessment, reliability metrics, and production monitoring—where even top agents achieve less than 50% on real-world...

Ianfr13 / claude-code-plugins-agent-evaluation exact

Testing and benchmarking LLM agents including behavioral testing, capability assessment, reliability metrics, and production monitoring—where even top agents achieve less than 50% on real-world...

sickn33 / antigravity-awesome-skills-agent-evaluation exact

Testing and benchmarking LLM agents including behavioral testing, capability assessment, reliability metrics, and production monitoring—where even top agents achieve less than 50% on real-world...

liqiongyu / lenny-skills-plus-ai-product-strategy exact

Create an AI Product Strategy Pack (thesis, prioritized use cases, system plan, eval + learning plan, agentic safety plan, roadmap). Use for AI product strategy, LLM/agent strategy, AI roadmap,...

cosmix / loom-model-evaluation exact

Evaluates machine learning models for performance, fairness, and reliability using appropriate metrics and validation techniques. Covers training debugging, hyperparameter tuning, and production...

transilienceai / communitytools-ai-threat-testing exact

Offensive AI security testing and exploitation framework. Systematically tests LLM applications for OWASP Top 10 vulnerabilities including prompt injection, model extraction, data poisoning, and...

xenitV1 / claude-code-maestro-clean-code exact

The Foundation Skill. LLM Firewall + 2025 Security + Cross-Skill Coordination. Use for ALL code output - prevents hallucinations, enforces security, ensures quality.

mrgoonie / claudekit-skills-repomix exact

Package entire code repositories into single AI-friendly files using Repomix. Capabilities include packing codebases with customizable include/exclude patterns, generating multiple output formats (XML,...

jackspace / claudeskillz-repomix exact

Package entire code repositories into single AI-friendly files using Repomix. Capabilities include packing codebases with customizable include/exclude patterns, generating multiple output formats (XML,...

zircote / claude-repomix exact

Package entire code repositories into single AI-friendly files using Repomix. Capabilities include packing codebases with customizable include/exclude patterns, generating multiple output formats (XML,...

jackspace / claudeskillz-tooluniverse exact

Use this skill when working with scientific research tools and workflows across bioinformatics, cheminformatics, genomics, structural biology, proteomics, and drug discovery. This skill provides...

ovachiever / droid-tings-tooluniverse exact

Use this skill when working with scientific research tools and workflows across bioinformatics, cheminformatics, genomics, structural biology, proteomics, and drug discovery. This skill provides...

binhmuc / autobot-review-repomix exact

Package entire code repositories into single AI-friendly files using Repomix. Capabilities include packing codebases with customizable include/exclude patterns, generating multiple output formats (XML,...

binjuhor / shadcn-lar-repomix exact

Package entire code repositories into single AI-friendly files using Repomix. Capabilities include packing codebases with customizable include/exclude patterns, generating multiple output formats (XML,...

ngxtm / devkit-repomix exact

Package entire code repositories into single AI-friendly files using Repomix. Capabilities include packing codebases with customizable include/exclude patterns, generating multiple output formats (XML,...

softaworks / agent-toolkit-gepetto exact

Creates detailed, sectionized implementation plans through research, stakeholder interviews, and multi-LLM review. Use when planning features that need thorough pre-implementation analysis.

markpitt / claude-skills-genaiscript exact

Comprehensive expertise for working with Microsoft's GenAIScript framework - a JavaScript/TypeScript-based system for building automatable LLM prompts and AI workflows. Use when creating,...

parcadei / continuous-claude-v3-agentica-prompts exact

Write reliable prompts for Agentica/REPL agents that avoid LLM instruction ambiguity.