ValorVie / custom-skills-eval-harness exact

A formal evaluation framework for Claude Code sessions, implementing eval-driven development (EDD) principles.

secucon / cc-sys-eval-harness exact

A formal evaluation framework for Claude Code sessions, implementing eval-driven development (EDD) principles.

UrlAudit / claude-toolbox-eval-harness exact

A formal evaluation framework for Claude Code sessions, implementing eval-driven development (EDD) principles.

zechenzhangAGI / ai-research-skills-evaluating-llms-harness exact

Evaluates LLMs across 60+ academic benchmarks (MMLU, HumanEval, GSM8K, TruthfulQA, HellaSwag). Use when benchmarking model quality, comparing models, reporting academic results, or tracking...

ovachiever / droid-tings-evaluating-llms-harness exact

Evaluates LLMs across 60+ academic benchmarks (MMLU, HumanEval, GSM8K, TruthfulQA, HellaSwag). Use when benchmarking model quality, comparing models, reporting academic results, or tracking...

SkillsCatalog / registry-skill-installer exact

Install Agent Skills to your AI coding agent. Supports Claude Code, Goose, OpenCode, Cursor, and other harnesses.

zechenzhangAGI / ai-research-skills-nemo-evaluator-sdk exact

Evaluates LLMs across 100+ benchmarks from 18+ harnesses (MMLU, HumanEval, GSM8K, safety, VLM) with multi-backend execution. Use when needing scalable evaluation on local Docker, Slurm HPC, or...

colebanman / grabbit-skills exact

Control the Grabbit CLI to record browser interactions (HAR) and generate API workflows. Use this skill when the user wants to: (1) Automate browser actions, (2) Capture web traffic for API...

analogjs / angular-skills-angular-testing exact

Write unit and integration tests for Angular v21+ applications using Vitest or Jasmine with TestBed, component harnesses, and modern testing patterns. Use for testing components with signals,...

parcadei / continuous-claude-v3-braintrust-analyze exact

Analyze Claude Code sessions via Braintrust

parcadei / continuous-claude-v3-pint-compute exact

Unit-aware computation with Pint - convert units, dimensional analysis, unit arithmetic

parcadei / continuous-claude-v3-qlty-check exact

Code quality checks, formatting, and metrics via qlty CLI

parcadei / continuous-claude-v3-nia-docs exact

Search library documentation and code examples via Nia

parcadei / continuous-claude-v3-ast-grep-find exact

AST-based code search and refactoring via ast-grep MCP