40 results (15.3ms) page 1 / 2
richard-gyiko / which-llm-which-llm exact

Select optimal LLM(s) for a task based on skill requirements, budget, and constraints. Uses the `which-llm` CLI to query Artificial Analysis benchmarks enriched with capability data from models.dev.

RefoundAI / lenny-skills-ai-evals exact

Help users create and run AI evaluations. Use when someone is building evals for LLM products, measuring model quality, creating test cases, designing rubrics, or trying to systematically measure...

Arize-ai / phoenix-phoenix-evals exact

Build and run evaluators for AI/LLM applications using Phoenix.

liqiongyu / lenny-skills-plus-ai-evals exact

Create an AI Evals Pack (eval PRD, test set, rubric, judge plan, results + iteration loop). Use for LLM evaluation, benchmarks, rubrics, error analysis/open coding, and ship/no-ship quality gates...

adriancooney / evals-evals exact

Run and create evals for testing agent behavior. Use when the user wants to create or run an eval.

mikeyobrien / ralph-orchestrator-eval exact

EvalKit is a conversational evaluation framework for AI agents that guides you through creating robust evaluations using the Strands Evals SDK. Through natural conversation, you can plan...

existential-birds / beagle-llm-judge exact

LLM-as-judge methodology for comparing code implementations across repositories. Scores implementations on functionality, security, test quality, overengineering, and dead code using weighted...

semgrep / skills-llm-security exact

Security guidelines for LLM applications based on OWASP Top 10 for LLM 2025. Use when building LLM apps, reviewing AI security, implementing RAG systems, or asking about LLM vulnerabilities like...

existential-birds / beagle-llm-artifacts-detection exact

Detects common LLM coding agent artifacts in codebases. Identifies test quality issues, dead code, over-abstraction, and verbose LLM style patterns. Use when cleaning up AI-generated code or...

omer-metin / skills-for-antigravity-unity-llm-integration exact

Integrating local and cloud LLMs into Unity games for AI NPCs, dialogue, and intelligent behaviors. Use when "unity llm, llmunity, unity ai npc, unity local llm, unity sentis llm, unity chatgpt,...

omer-metin / skills-for-antigravity-godot-llm-integration exact

Integrating local LLMs into Godot games using NobodyWho and other Godot-native solutions. Use when "godot llm, nobodywho, godot ai npc, gdscript llm, godot local llm, godot chatgpt, godot 4 ai,...

omer-metin / skills-for-antigravity-llm-architect exact

LLM application architecture expert for RAG, prompting, agents, and production AI systems. Use when "rag system, prompt engineering, llm application, ai agent, structured output, chain of thought,...

hardw00t / ai-security-arsenal-llm-security exact

LLM and AI application security testing skill for prompt injection, jailbreaking, and AI system vulnerabilities. This skill should be used when testing AI/ML applications for security issues,...

halay08 / fullstack-agent-skills-llm-evaluation exact

Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or...

rmyndharis / antigravity-skills-llm-evaluation exact

Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or...

404kidwiz / agent-skills-backup-llm-evaluation exact

Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or...

ovachiever / droid-tings-llm-evaluation exact

Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or...

phrazzld / claude-config-llm-communication exact

Write effective LLM prompts, commands, and agent instructions. Goal-oriented over step-prescriptive. Role + Objective + Latitude pattern. Use when writing prompts, designing agents, building...

jamesrochabrun / skills-llm-router exact

This skill should be used when users want to route LLM requests to different AI providers (OpenAI, Grok/xAI, Groq, DeepSeek, OpenRouter) using SwiftOpenAI-CLI. Use this skill when users ask to...