# 404kidwiz/performance-monitor

# Install this skill

To install this specific skill from the multi-skill repository:

npx skills add 404kidwiz/claude-supercode-skills --skill "performance-monitor"

# Description

Expert in observing, benchmarking, and optimizing AI agents. Specializes in token usage tracking, latency analysis, and quality evaluation metrics. Use when optimizing agent costs, measuring performance, or implementing evals. Triggers include "agent performance", "token usage", "latency optimization", "eval", "agent metrics", "cost optimization", "agent benchmarking".

# SKILL.md


---
name: performance-monitor
description: Expert in observing, benchmarking, and optimizing AI agents. Specializes in token usage tracking, latency analysis, and quality evaluation metrics. Use when optimizing agent costs, measuring performance, or implementing evals. Triggers include "agent performance", "token usage", "latency optimization", "eval", "agent metrics", "cost optimization", "agent benchmarking".
---


# Performance Monitor

## Purpose

Provides expertise in monitoring, benchmarking, and optimizing AI agent performance. Specializes in token usage tracking, latency analysis, cost optimization, and implementing quality evaluation metrics (evals) for AI systems.

## When to Use

- Tracking token usage and costs for AI agents
- Measuring and optimizing agent latency
- Implementing evaluation metrics (evals)
- Benchmarking agent quality and accuracy
- Optimizing agent cost efficiency
- Building observability for AI pipelines
- Analyzing agent conversation patterns
- Setting up A/B testing for agents

## Quick Start

Invoke this skill when:
- Optimizing AI agent costs and token usage
- Measuring agent latency and performance
- Implementing evaluation frameworks
- Building observability for AI systems
- Benchmarking agent quality

Do NOT invoke when:
- General application performance → use /performance-engineer
- Infrastructure monitoring → use /sre-engineer
- ML model training optimization → use /ml-engineer
- Prompt design → use /prompt-engineer

## Decision Framework

```
Optimization Goal?
├── Cost Reduction
│   ├── Token usage → Prompt optimization
│   └── API calls → Caching, batching
├── Latency
│   ├── Time to first token → Streaming
│   └── Total response time → Model selection
├── Quality
│   ├── Accuracy → Evals with ground truth
│   └── Consistency → Multiple run analysis
└── Reliability
    └── Error rates, retry patterns
```

## Core Workflows

### 1. Token Usage Tracking

  1. Instrument API calls to capture usage
  2. Track input vs output tokens separately
  3. Aggregate by agent, task, user
  4. Calculate costs per operation
  5. Build dashboards for visibility
  6. Set alerts for anomalous usage
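
A minimal sketch of steps 1 through 4, assuming usage fields shaped like the Anthropic Messages API's `response.usage` (`input_tokens` / `output_tokens`); the model name and prices here are placeholders, not real rates:

```python
# Wrap each API call, record input/output tokens separately, aggregate per
# agent, and compute cost. Adapt the usage field names to your provider.
from collections import defaultdict

# Hypothetical per-million-token prices (USD); substitute real rates.
PRICES = {"example-model": {"input": 3.00, "output": 15.00}}

class UsageTracker:
    def __init__(self):
        # Per-agent running totals: token counts and accumulated cost.
        self.totals = defaultdict(lambda: {"input": 0, "output": 0, "cost": 0.0})

    def record(self, agent: str, model: str, input_tokens: int, output_tokens: int) -> float:
        """Record one call's usage under `agent` and return its cost in USD."""
        p = PRICES[model]
        cost = (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000
        t = self.totals[agent]
        t["input"] += input_tokens
        t["output"] += output_tokens
        t["cost"] += cost
        return cost

tracker = UsageTracker()
# After each API call, e.g.:
# tracker.record("research-agent", "example-model",
#                response.usage.input_tokens, response.usage.output_tokens)
tracker.record("research-agent", "example-model", 1200, 350)
print(dict(tracker.totals))
```

Feed these per-agent totals into your dashboards and alerting (steps 5 and 6).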

### 2. Eval Framework Setup

  1. Define evaluation criteria
  2. Create test dataset with expected outputs
  3. Implement scoring functions
  4. Run automated eval pipeline
  5. Track scores over time
  6. Use for regression testing
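
A minimal harness sketch covering steps 2 through 4: a test set of (input, expected) pairs, a scoring function, and a loop that reports an aggregate score. `run_agent` is a hypothetical stand-in for your agent; exact match is the simplest scorer, so swap in fuzzy matching or LLM-graded scoring as your criteria require.

```python
# Test dataset with expected outputs (step 2).
TEST_SET = [
    {"input": "2 + 2", "expected": "4"},
    {"input": "capital of France", "expected": "Paris"},
]

def run_agent(prompt: str) -> str:
    # Placeholder: call your real agent here.
    return {"2 + 2": "4", "capital of France": "Paris"}[prompt]

def score(output: str, expected: str) -> float:
    # Exact-match scoring (step 3): 1.0 for pass, 0.0 for fail.
    return 1.0 if output.strip() == expected.strip() else 0.0

def run_evals() -> float:
    # Automated eval pipeline (step 4): score every case, return the pass rate.
    scores = [score(run_agent(case["input"]), case["expected"]) for case in TEST_SET]
    return sum(scores) / len(scores)

# Track this number over time and fail CI if it regresses past a threshold.
print(f"eval pass rate: {run_evals():.0%}")
```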

### 3. Latency Optimization

  1. Measure baseline latency
  2. Identify bottlenecks (model, network, parsing)
  3. Implement streaming where applicable
  4. Optimize prompt length
  5. Consider model size tradeoffs
  6. Add caching for repeated queries
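
A sketch of the measurement side (steps 1 and 3): capture time to first token (TTFT) and total latency around a streaming call. `stream_agent` is a hypothetical stand-in; any iterator of response chunks works.

```python
import time

def timed_stream(chunks):
    """Yield chunks unchanged while recording TTFT and total elapsed time."""
    start = time.perf_counter()
    ttft = None
    for chunk in chunks:
        if ttft is None:
            ttft = time.perf_counter() - start  # first token arrived
        yield chunk
    total = time.perf_counter() - start
    print(f"TTFT: {ttft:.3f}s  total: {total:.3f}s")

def stream_agent(prompt: str):
    # Placeholder generator simulating a streaming response.
    for word in ["streaming", "cuts", "perceived", "latency"]:
        time.sleep(0.05)
        yield word

for chunk in timed_stream(stream_agent("hello")):
    pass
```

Streaming does not shorten total response time, but a low TTFT makes the agent feel responsive while the rest of the answer arrives.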

## Best Practices

- Track tokens separately from API call counts
- Implement evals before optimizing
- Use percentiles (p50, p95, p99), not averages, for latency (see the sketch after this list)
- Log prompts and responses for debugging
- Set cost budgets and alerts
- Version prompts and track performance per version
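
A percentile summary sketch, standard library only, showing why averages mislead:

```python
# quantiles(n=100) returns 99 cut points; index 49 is p50, 94 is p95, 98 is p99.
from statistics import mean, quantiles

def latency_summary(samples_ms: list[float]) -> dict[str, float]:
    q = quantiles(samples_ms, n=100)
    return {"p50": q[49], "p95": q[94], "p99": q[98]}

samples = [120, 135, 128, 150, 2400, 140, 132, 138, 145, 131]  # one slow outlier
print(f"mean: {mean(samples):.0f} ms")  # the mean smears the 2400 ms tail
print(latency_summary(samples))         # the percentiles expose it
```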

## Anti-Patterns

| Anti-Pattern | Problem | Correct Approach |
| --- | --- | --- |
| No token tracking | Surprise costs | Instrument all calls |
| Optimizing without evals | Quality regression | Measure before optimizing |
| Average-only latency | Hides tail latency | Use percentiles |
| No prompt versioning | Can't correlate changes | Version and track |
| Ignoring caching | Repeated costs | Cache stable responses (sketch below) |
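
For the last row, a minimal sketch of caching stable responses: key the cache on a hash of (model, prompt) and only call through on a miss. This assumes deterministic-enough calls (e.g., temperature 0); `call_fn` is a hypothetical stand-in for your client function.

```python
import hashlib
import json

_cache: dict[str, str] = {}

def cached_call(model: str, prompt: str, call_fn) -> str:
    # Stable key derived from everything that affects the response.
    key = hashlib.sha256(json.dumps([model, prompt]).encode()).hexdigest()
    if key not in _cache:          # miss: pay for the API call once
        _cache[key] = call_fn(model, prompt)
    return _cache[key]             # hit: zero tokens, zero model latency

# Usage: cached_call("example-model", "Summarize our refund policy.", my_client_fn)
```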

# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents.

Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.