Search: benchmarks | AgentSkillsRepo

llm_evaluation 0.00

vuralserhat86 / antigravity-agentic-skills-llm-evaluation exact

Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or...

★ 27 ai

e2e-testing 0.00

ramidamolis-alt / agent-skills-workflows-e2e-testing exact

End-to-End Testing Framework skill - Browser automation, API testing, performance benchmarking, test report generation, and chaos engineering basics. Use for comprehensive application testing.

★ 0 web

llm-evaluation 0.00

ovachiever / droid-tings-llm-evaluation exact

Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or...

★ 19 ai

llm-evaluation 0.00

halay08 / fullstack-agent-skills-llm-evaluation exact

Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or...

★ 0 ai

llm-evaluation 0.00

404kidwiz / agent-skills-backup-llm-evaluation exact

Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or...

★ 0 ai

llm-evaluation 0.00

rmyndharis / antigravity-skills-llm-evaluation exact

Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or...

★ 187 ai

m10-performance 0.00

actionbook / rust-skills-m10-performance exact

CRITICAL: Use for performance optimization. Triggers: performance, optimization, benchmark, profiling, flamegraph, criterion, slow, fast, allocation, cache, SIMD, make it faster, 性能优化, 基准测试

★ 596 tools

android-performance-profiler 0.00

HuxleyMc / android-skills-android-performance-profiler exact

Analyzes and optimizes Android app performance. Use when identifying UI jank, memory leaks, slow startup, high battery drain, or Compose recomposition issues. Covers profiling tools, benchmarks,...

★ 1 ai

agent skills android

profiling-optimization 0.00

aj-geddes / useful-ai-prompts-profiling-optimization exact

Profile application performance, identify bottlenecks, and optimize hot paths using CPU profiling, flame graphs, and benchmarking. Use when investigating performance issues or optimizing critical...

★ 55 development

skillsbench 0.00

benchflow-ai / skillsbench-skillsbench exact

SkillsBench contribution workflow. Use when: (1) Creating benchmark tasks, (2) Understanding repo structure, (3) Preparing PRs for task submission.

★ 276 productivity

performance-optimizer 0.00

ramidamolis-alt / agent-skills-workflows-performance-optimizer exact

Expert performance optimizer using ALL MCP servers. Uses MongoDB for metrics, UltraThink for analysis, Memory for benchmarks, and search MCPs for optimization techniques.

★ 0 tools

anysite-competitor-intelligence 0.00

anysiteio / agent-skills-anysite-competitor-intelligence exact

Competitive intelligence gathering using anysite MCP server across LinkedIn, social media, Y Combinator, and the web. Track competitor activities, analyze hiring patterns, monitor content...

★ 2 web

OCI Landing Zones Architecture 0.00

acedergren / oci-agent-skills-oci-landing-zones-architecture exact

Use when designing multi-tenant OCI environments, setting up production landing zones, implementing compartment hierarchies, or establishing governance foundations. Covers Landing Zone reference...

★ 1 devops

rocky-security-hardening 0.00

netsapiensis / claude-code-skills-rocky-security-hardening exact

Rocky Linux 8/9 security hardening including CIS benchmarks with OpenSCAP, SSH hardening, fail2ban, auditd rules, PAM configuration with authselect, and system-wide crypto policies. Use when...

★ 0 security

rep-performance-scorecard 0.00

onewave-ai / claude-skills-rep-performance-scorecard exact

Multi-dimensional rep evaluation: activity, conversion, velocity, deal size. Peer benchmarking and coaching priority identification.

★ 31 development

test-optimization 0.00

d-o-hub / rust-self-learning-memory-test-optimization exact

Advanced test optimization with cargo-nextest, property testing, and performance benchmarking. Use when optimizing test execution speed, implementing property-based tests, or analyzing test performance.

★ 4 development

agent-memory

competitive-intelligence-analyst 0.00

shipshitdev / library-competitive-intelligence-analyst exact

Use this skill when users need to analyze competitors, monitor market movements, benchmark features/pricing, identify market gaps, or understand competitive positioning. Activates for "what are...

★ 4 development

claude-code codex commands skills

performance-at-scale 0.00

Bbeierle12 / skill-mcp-claude-performance-at-scale exact

Spatial indexing and world streaming for Three.js building games with thousands of pieces. Use when optimizing building games, implementing spatial queries, chunk loading, or profiling...

★ 4 tools

ai-evals 0.00

liqiongyu / lenny-skills-plus-ai-evals exact

Create an AI Evals Pack (eval PRD, test set, rubric, judge plan, results + iteration loop). Use for LLM evaluation, benchmarks, rubrics, error analysis/open coding, and ship/no-ship quality gates...

★ 14 ai

agent-skills ai-agents automation claude

ux-writing 0.00

content-designer / ux-writing-skill exact

Create user-centered, accessible interface copy (microcopy) for digital products including buttons, labels, error messages, notifications, forms, onboarding, empty states, success messages, and...

★ 51 development
