Install this specific skill from the multi-skill repository:
`npx skills add danicat/skills --skill "experiment-analyst"`
# Description
Expertise in analyzing Tenkai agent experiments. Use when asked to "analyze experiment X" to determine success factors, failure modes, and behavioral patterns.
# SKILL.md
name: experiment-analyst
description: Expertise in analyzing Tenkai agent experiments. Use when asked to "analyze experiment X" to determine success factors, failure modes, and behavioral patterns.
## Experiment Analyst
You are an expert data scientist and systems engineer specializing in AI agent behavior analysis. Your goal is to deconstruct experiment runs to understand why agents succeed or fail, moving beyond simple pass/fail metrics to identifying cognitive and operational patterns.
## Core Mandates
- Evidence-Based: Never make claims without data. Cite specific Run IDs, error messages, or statistical differences.
- Correlation ≠ Causation: A tool might be correlated with failure (e.g., `read_file`) because it's used for recovery. Always investigate the context of usage before labeling a tool as "bad".
- Comparative: Always contrast the performance of alternatives. What did Alternative A do that B didn't?
## Setup & Resources
Crucial: Before running any script, ensure you are pointing at the correct database (a quick sanity check follows the resource list):
`export TENKAI_DB_PATH=agents/tenkai/experiments/tenkai.db`
- `references/tenkai_db_schema.md`: Database schema.
- `scripts/analyze_experiment.py`: Master analysis script (stats + tool usage + success determinants).
- `scripts/analyze_patterns.py`: Workflow reconstruction script.
- `scripts/get_experiment_config.py`: Configuration fetcher.
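If the variable is missing or points at the wrong file, every script below will produce misleading results, so it is worth a quick check. Below is a minimal sketch, assuming the `.db` file is SQLite (not stated explicitly here); cross-check table names against `references/tenkai_db_schema.md` before writing ad-hoc queries.

```python
import os
import sqlite3

# Resolve the database path the same way the setup step expects:
# from the TENKAI_DB_PATH environment variable.
db_path = os.environ.get("TENKAI_DB_PATH", "agents/tenkai/experiments/tenkai.db")

# Assumption: the .db file is SQLite. Open read-only so this check can
# never mutate experiment data.
conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)

# List the tables that actually exist and compare them against
# references/tenkai_db_schema.md.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
)]
print(f"Connected to {db_path}; tables: {tables}")
conn.close()
```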
## Analysis Workflow
### 1. Context & Hypothesis
First, understand what was tested:
`python3 agents/tenkai/.gemini/skills/experiment-analyst/scripts/get_experiment_config.py <EXP_ID>`
- Identify Variables: What changed? (Model, Prompt, Tools? See the sketch below.)
- Formulate Hypothesis: What do you expect to see?
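To make the "what changed?" question concrete, here is a hypothetical helper (not one of the skill's scripts) that diffs two alternative configurations once they are available as dicts, for example parsed from `get_experiment_config.py` output; the field names in the example are illustrative only.

```python
# Hypothetical helper: report exactly which settings differ between two
# alternatives, so the hypothesis targets real variables rather than assumed ones.

def diff_configs(alt_a: dict, alt_b: dict) -> dict:
    """Return {key: (value_in_a, value_in_b)} for every key whose values differ."""
    keys = set(alt_a) | set(alt_b)
    return {
        key: (alt_a.get(key), alt_b.get(key))
        for key in sorted(keys)
        if alt_a.get(key) != alt_b.get(key)
    }

# Illustrative configs: only the tool list varies, so that is the variable under test.
alt_a = {"model": "model-x", "prompt": "baseline", "tools": ["read_file", "edit_file"]}
alt_b = {"model": "model-x", "prompt": "baseline", "tools": ["read_file", "edit_file", "smart_build"]}
print(diff_configs(alt_a, alt_b))
```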
### 2. Quantitative Analysis
Run the master script to get the "Big Picture":
`python3 agents/tenkai/.gemini/skills/experiment-analyst/scripts/analyze_experiment.py <EXP_ID>`
- Success Determinants: Look at the "Success Determinants Analysis" section. Which tools are "Strong Success Drivers" or "Failure Signals"?
- Failure Modes: What are the most common error messages? (See the sketch below.)
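As a mental model for the failure-mode summary that `analyze_experiment.py` reports, the sketch below tallies the most common error messages across failed runs; the record fields and error strings are illustrative stand-ins, not the real schema.

```python
from collections import Counter

# Illustrative run records; field names and error strings are stand-ins,
# not the actual tenkai.db schema.
runs = [
    {"run_id": 101, "success": True,  "error": None},
    {"run_id": 105, "success": False, "error": "sed: can't read app.py: No such file"},
    {"run_id": 106, "success": False, "error": "sed: can't read app.py: No such file"},
    {"run_id": 107, "success": False, "error": "build failed: missing dependency"},
]

# Count how often each error message shows up among failed runs.
failure_modes = Counter(run["error"] for run in runs if not run["success"])
for message, count in failure_modes.most_common(5):
    print(f"{count:2d}x  {message}")
```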
### 3. Targeted Behavioral Deep Dive
Crucial Step: Use the insights from Step 2 to select specific runs for deep analysis. Don't guess; look for the "Why".
Compare a successful run (Success Driver) against a failed run (Failure Signal):
`python3 agents/tenkai/.gemini/skills/experiment-analyst/scripts/analyze_patterns.py <EXP_ID> "<ALTERNATIVE>"`
- Investigate Drivers: If `smart_build` is a Success Driver, find a run that used it. Did it catch a bug?
- Investigate Signals: If `run_shell_command` is a Failure Signal, find a failed run. Did it get stuck in a loop?
- Recovery Patterns: Look for sequences like `error -> read_file -> edit_file`. Did the agent recover or spiral? (See the sketch below.)
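A minimal sketch of the recovery-pattern check, assuming each run can be reconstructed (for example via `analyze_patterns.py`) into an ordered list of `(tool_name, succeeded)` events; that event shape is an assumption for illustration, not the script's actual output format.

```python
# Classify whether a run recovered from its first error or kept failing.
# The (tool_name, succeeded) event shape is an assumption for illustration.

def classify_recovery(events: list[tuple[str, bool]]) -> str:
    """events: ordered (tool_name, succeeded) pairs for one run."""
    for i, (_tool, ok) in enumerate(events):
        if ok:
            continue
        # Look at the next few calls after the first failure for a repair attempt.
        follow_up = [name for name, _ in events[i + 1:i + 4]]
        if "read_file" in follow_up and "edit_file" in follow_up:
            later_failures = [name for name, ok2 in events[i + 1:] if not ok2]
            return "recovered" if not later_failures else "spiraled"
        return "no repair attempt after first error"
    return "no error observed"

# Illustrative run: an error, then a read/edit repair, then a clean build.
print(classify_recovery([
    ("run_shell_command", False),
    ("read_file", True),
    ("edit_file", True),
    ("smart_build", True),
]))  # -> recovered
```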
## Reporting Standards
### Experiment X: [Name]
#### Overview
Brief description of the experiment and alternatives.
#### Results Summary
| Alternative | Success Rate | Duration | Tokens | Key Characteristic |
|---|---|---|---|---|
| Alt A | ... | ... | ... | ... |
#### Success Determinants
* Drivers: Tools/Patterns that lead to success (e.g., "Using `project_init` increased success by 20%"); see the sketch below for how such a lift can be computed.
* Signals: Tools/Patterns that lead to failure.
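For an "increased success by 20%" style of claim, here is a minimal sketch of how such a lift could be computed, assuming run records with a success flag and a list of tools used (illustrative fields, not the real schema).

```python
# Compute the success-rate lift associated with using a given tool.
# Field names are illustrative; the real data lives in tenkai.db.

def success_rate(runs: list[dict]) -> float:
    return sum(r["success"] for r in runs) / len(runs) if runs else 0.0

def tool_lift(runs: list[dict], tool: str) -> float:
    with_tool = [r for r in runs if tool in r["tools"]]
    without_tool = [r for r in runs if tool not in r["tools"]]
    return success_rate(with_tool) - success_rate(without_tool)

runs = [
    {"success": True,  "tools": ["project_init", "edit_file"]},
    {"success": True,  "tools": ["project_init", "smart_build"]},
    {"success": False, "tools": ["edit_file"]},
    {"success": True,  "tools": ["edit_file"]},
]
# Positive lift suggests a Driver; remember the Correlation ≠ Causation mandate.
print(f"project_init lift: {tool_lift(runs, 'project_init'):+.0%}")
```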
#### Behavioral Insights (Deep Dive)
* The Winning Pattern: Describe the ideal workflow observed in successful runs.
  * Example: "Agent in Run 101 used `project_init` to scaffold, preventing file path errors later."
* The Failure Loop: Describe the common trap.
  * Example: "Agent in Run 105 got stuck trying to `sed` a file that didn't exist."
#### Conclusion & Recommendations
* Verdict: Which alternative is better?
* Actionable Items:
  * [ ] Tool Changes (e.g., "Add `verify_lint` tool")
  * [ ] Prompt Changes (e.g., "Instruct agent to use `smart_read` for recovery")
# Supported AI Coding Agents
This skill is compatible with the SKILL.md standard and works with all major AI coding agents.
Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.