Install via the skills CLI:

```bash
npx skills add vishnujayvel/transcription-analyzer
```

Or install the specific skill:

```bash
npx add-skill https://github.com/vishnujayvel/transcription-analyzer
```
# Description

Analyzes mock interview transcripts using a multi-agent architecture to produce confidence-scored, evidence-backed insights across 10 categories.
# SKILL.md

```yaml
---
name: transcription-analyzer
description: >
  Analyzes mock interview transcripts using a multi-agent architecture with 4 parallel
  analyst agents (Strengths, Mistakes, Behavioral, Factual) to produce confidence-scored
  insights across 10 categories. Features an anti-hallucination protocol requiring evidence
  citation for every claim. Use when reviewing mock interviews, wanting interview feedback
  analysis, or saying "analyze my transcript" or "mock review".
license: MIT
metadata:
  author: vishnu-jayavel
  version: "1.0"
  categories: interview-prep, analysis, multi-agent
---
```
## Transcription Analyzer

Analyze mock interview transcripts with comprehensive, confidence-scored analysis across 10 categories using a multi-agent architecture.

### Triggers
- "analyze my transcript"
- "transcription-analyzer"
- "mock review"
- "review my transcript"
### Anti-Hallucination Protocol (MANDATORY)
Every metric and insight MUST include confidence scoring and evidence citation.
#### Confidence Levels
| Level | Score | Criteria |
|---|---|---|
| HIGH | 90%+ | Direct quote from transcript, explicit statement |
| MEDIUM | 60-89% | Inferred from context, multiple supporting signals |
| LOW | 30-59% | Single weak signal, ambiguous evidence |
| NOT_FOUND | 0% | No evidence in transcript - explicitly state this |
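As a minimal Python sketch of the table above (treating any score below 30 as NOT_FOUND is an assumption; the table leaves 1-29% undefined):

```python
def confidence_level(score: float) -> str:
    """Bucket a 0-100 confidence score into the levels defined above."""
    if score >= 90:
        return "HIGH"
    if score >= 60:
        return "MEDIUM"
    if score >= 30:
        return "LOW"
    return "NOT_FOUND"  # assumption: anything below 30 is treated as no evidence
```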
#### Rules (Non-Negotiable)

- Never fabricate - if it's not in the transcript, output "Not found in transcript"
- Cite evidence - every claim needs a line number or direct quote
- Distinguish inference from fact - mark clearly as [INFERRED] vs [EXPLICIT]
- Aggregate confidence - overall score = weighted average of component scores
See references/confidence-scoring.md for detailed methodology.
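A hedged sketch of the aggregation rule; the skill does not prescribe specific weights, so the equal weights in the usage line are illustrative:

```python
def aggregate_confidence(components: list[tuple[float, float]]) -> float:
    """Overall score = weighted average of (score, weight) component pairs."""
    total_weight = sum(weight for _, weight in components)
    if total_weight == 0:
        return 0.0
    return sum(score * weight for score, weight in components) / total_weight


# Three findings at 95, 70, and 40 with equal weights -> ~68.3 (MEDIUM)
print(aggregate_confidence([(95, 1), (70, 1), (40, 1)]))
```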
### Step 1: Input Handling

- If ARGUMENTS are provided: use the given file path directly.
- If no ARGUMENTS: ask the user for the transcript file path.
### Step 2: File Validation

- Load the transcript file
- Validate that the file exists and contains content
- Count total lines for the delegation decision

Error handling:

If the file is not found:

```text
Could not find transcript at: [attempted_path]
Please check the file path is correct.
```

If the file is empty:

```text
The transcript file appears to be empty.
Please provide a transcript with interview content.
```
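A minimal Python sketch of this validation step, assuming a plain-text transcript file; the error messages mirror the ones above:

```python
from pathlib import Path


def load_transcript(path_str: str) -> list[str]:
    """Load and validate a transcript, returning its lines (Step 2 sketch)."""
    path = Path(path_str)
    if not path.is_file():
        raise FileNotFoundError(
            f"Could not find transcript at: {path}\n"
            "Please check the file path is correct."
        )
    text = path.read_text(encoding="utf-8")
    if not text.strip():
        raise ValueError(
            "The transcript file appears to be empty.\n"
            "Please provide a transcript with interview content."
        )
    return text.splitlines()  # the line count feeds the delegation decision
```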
### Step 3: Interview Start Detection

Scan the transcript for trigger phrases that indicate where the actual interview begins (skipping small talk):
| Trigger Phrase | Context |
|---|---|
| "go design" | System design prompt |
| "let's get started" | Formal interview start |
| "the problem is" | Coding problem introduction |
| "design a system" | System design prompt |
| "let's dive into" | Technical start |
| "first question" | Interview structure cue |
| "walk me through" | Technical prompt |
Record:
- The line number where the interview starts
- If no trigger is found: analyze from the beginning and flag timing with LOW confidence
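One way this scan could be implemented, as a hedged Python sketch (the first-match-wins rule and the HIGH/LOW confidence assignment are assumptions):

```python
START_TRIGGERS = [
    "go design", "let's get started", "the problem is",
    "design a system", "let's dive into", "first question",
    "walk me through",
]


def find_interview_start(lines: list[str]) -> tuple[int, str]:
    """Return (1-based line number, timing confidence) for the interview start."""
    for i, line in enumerate(lines, start=1):
        lowered = line.lower()
        if any(trigger in lowered for trigger in START_TRIGGERS):
            return i, "HIGH"
    return 1, "LOW"  # no trigger found: analyze from the beginning
```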
### Step 4: Interview Type Detection

Classify the interview type based on content signals:

#### System Design Signals
- "design a system", "scalability", "database schema"
- "high availability", "load balancer", "microservices"
- "CAP theorem", "partitioning", "replication"
#### Coding Signals
- "write a function", "time complexity", "space complexity"
- "test cases", "edge cases", "optimal solution"
- "brute force", "algorithm", "data structure"
#### Behavioral Signals
- "tell me about a time", "leadership", "conflict"
- "difficult situation", "disagree with", "mentor"
- "STAR format", "situation", "action", "result"
Output:
- Interview type with confidence level and evidence
- If unclear: "Unknown" with NOT_FOUND confidence
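A hedged Python sketch of signal-based classification; the signal lists are abridged from above, and the hit-count thresholds for HIGH vs MEDIUM confidence are illustrative assumptions:

```python
TYPE_SIGNALS = {
    "System Design": ["design a system", "scalability", "database schema",
                      "high availability", "load balancer", "cap theorem"],
    "Coding": ["write a function", "time complexity", "edge cases",
               "brute force", "algorithm", "data structure"],
    "Behavioral": ["tell me about a time", "leadership", "conflict",
                   "difficult situation", "star format"],
}


def detect_interview_type(transcript: str) -> tuple[str, str]:
    """Count signal hits per type; the highest count wins (sketch only)."""
    text = transcript.lower()
    hits = {t: sum(text.count(s) for s in signals)
            for t, signals in TYPE_SIGNALS.items()}
    best, count = max(hits.items(), key=lambda kv: kv[1])
    if count == 0:
        return "Unknown", "NOT_FOUND"
    return best, "HIGH" if count >= 3 else "MEDIUM"
```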
### Step 5: Optional Diagram Analysis (System Design Only)

If the interview type is "System Design", ask the user whether they have an architecture diagram to analyze alongside the transcript.

If a diagram is provided, analyze:
- Components identified (services, databases, caches, queues)
- Data flow clarity (request paths, async flows)
- Missing components vs. verbal description
- Naming quality
- Diagram Quality Score (1-10)
If no diagram is provided, note that none was supplied and recommend saving diagrams for future review.
### Step 6: Multi-Perspective Agent Analysis

IMPORTANT: Use parallel agents for comprehensive, bias-reduced analysis.

Launch 4 parallel agents to analyze the transcript from different perspectives; this prevents single-viewpoint blind spots. (A dispatch sketch follows the agent descriptions below.)
#### Agent 1: Strengths Analyst
Find everything positive:
- Explicit praise from interviewer ("good", "nice", "I like that")
- Demonstrated competencies
- Strong moments and recoveries
- Communication wins
Output: {"positives": [{"title": "", "evidence": "", "confidence": "", "category": ""}]}
#### Agent 2: Mistakes Analyst
Find errors and problems:
- Technical errors corrected by interviewer
- Conceptual misunderstandings
- Communication issues (filler words, long pauses)
- Missed opportunities
Severity levels: CRITICAL (interview-ending), HIGH, MEDIUM, LOW
Output: {"mistakes": [{"title": "", "severity": "", "evidence": "", "confidence": "", "category": ""}]}
#### Agent 3: Behavioral Analyst
Assess Staff+ signals:
- Leadership presence (drove vs followed conversation)
- Trade-off articulation (made decisions, defended them)
- Depth of technical discussion
- Response to pushback/challenges
- Communication maturity
Output: {"behavioral": {"leadership": {...}, "tradeoffs": {...}, "depth_areas": [], "pushback_handling": {...}}}
#### Agent 4: Factual Verifier
Check technical accuracy:
- CORRECT: Technically accurate
- WRONG: Incorrect (cite the correction from transcript)
- NEEDS_VERIFICATION: Cannot determine from transcript alone
Only mark WRONG if interviewer explicitly corrected it.
Output: {"claims": [{"claim": "", "classification": "", "correction": "", "confidence": ""}]}
#### Synthesis

After all 4 agents return, cross-validate (one possible heuristic is sketched after this list):
- If the Strengths Agent found a positive but the Mistakes Agent found a related error → note the recovery
- If the Behavioral Agent found leadership signals but the Factual Agent found errors → assess the net impact
- Resolve conflicts by citing evidence from both perspectives
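A possible cross-validation heuristic in Python; matching positives to mistakes by shared category is an assumption, since the skill leaves the matching strategy open:

```python
def cross_validate(positives: list[dict], mistakes: list[dict]) -> list[str]:
    """Flag possible recoveries: a positive and a mistake sharing a category."""
    mistake_categories = {m.get("category") for m in mistakes}
    notes = []
    for p in positives:
        category = p.get("category")
        if category and category in mistake_categories:
            notes.append(
                f"Possible recovery in '{category}': "
                f"'{p.get('title')}' appears alongside a related mistake"
            )
    return notes
```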
### Step 7: 10-Category Analysis

For each category, extract insights with confidence scoring and evidence citation.

#### Category 1: Scorecard
- Overall performance (1-10 scale) - Look for explicit feedback
- Level assessment (Junior/Mid/Senior/Staff+) - Look for explicit statements or infer
- Dimensions: Communication, Technical Depth, Structure, Leadership
- Readiness % calculated as:

```text
Readiness % = 100 - (P0_gaps × 15) - (P1_gaps × 5)
            - (CRITICAL_mistakes × 20) - (HIGH_mistakes × 10) - (MEDIUM_mistakes × 3)
```
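The same formula rendered in Python, with the result clamped to 0-100 (the clamp is an assumption; the raw formula can go negative):

```python
def readiness(p0: int, p1: int, critical: int, high: int, medium: int) -> int:
    """Readiness % per the formula above, clamped to the 0-100 range."""
    score = 100 - p0 * 15 - p1 * 5 - critical * 20 - high * 10 - medium * 3
    return max(0, min(100, score))


# e.g. one P0 gap and one MEDIUM mistake: 100 - 15 - 3 = 82
print(readiness(p0=1, p1=0, critical=0, high=0, medium=1))  # 82
```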
#### Category 2: Time Breakdown
- Total interview duration
- Phase timings: Requirements, High-Level Design, Deep Dives, Q&A
- Time-related feedback from interviewer
#### Category 3: Communication Signals
- Talk ratio (candidate vs interviewer)
- Long pauses, filler words (um, uh, like, you know, basically)
- Clarifying questions asked
- Course corrections after feedback
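A hedged Python sketch of two of these signals, assuming the transcript has already been split into candidate and interviewer lines:

```python
import re

FILLERS = ["um", "uh", "like", "you know", "basically"]


def filler_counts(candidate_lines: list[str]) -> dict[str, int]:
    """Count whole-word filler occurrences in the candidate's lines."""
    text = " ".join(candidate_lines).lower()
    return {f: len(re.findall(rf"\b{re.escape(f)}\b", text)) for f in FILLERS}


def talk_ratio(candidate_lines: list[str], interviewer_lines: list[str]) -> float:
    """Candidate's share of total words spoken."""
    c = sum(len(line.split()) for line in candidate_lines)
    i = sum(len(line.split()) for line in interviewer_lines)
    return c / (c + i) if (c + i) else 0.0
```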
#### Category 4: Mistakes Identified
For EACH mistake:
- Title and description
- Severity: CRITICAL, HIGH, MEDIUM, LOW
- Category: Fundamentals, API Design, Patterns, Domain Knowledge, Communication
- Direct evidence with line number
#### Category 5: Things That Went Well
- Explicit praise
- Demonstrated strengths
- Approaches that worked
#### Category 6: Knowledge Gaps
For EACH gap:
- Area/topic
- Category: Fundamentals, API Design, Patterns, Domain
- Priority: P0 (must fix), P1 (important), P2 (nice to have)
#### Category 7: Behavioral Assessment (Staff+ Signals)
- Leadership presence
- Trade-off discussions
- Depth areas
- Handling pushback
#### Category 8: Factual Claims
For EACH technical claim:
- The claim
- Classification: Correct, Wrong, Needs Verification
- Correction if wrong
#### Category 9: Action Items
- Explicit recommendations from interviewer
- Resources recommended
#### Category 10: Interviewer Quality
- Feedback actionability (1-5 scale)
- Specific examples given (count)
- Teaching moments
### Step 8: Output Formatting

IMPORTANT: Show positives BEFORE mistakes (motivation-friendly ordering).

Structure the report as:
1. Metadata (file, type, confidence)
2. Scorecard
3. Time Breakdown
4. Communication Signals
5. Things That Went Well (before mistakes!)
6. Mistakes Identified
7. Knowledge Gaps
8. Behavioral Assessment
9. Factual Accuracy Check
10. Action Items
11. Interviewer Quality
12. Confidence Summary
Include tables with evidence citations and confidence levels for each item.
### Step 9: JSON Summary
After the markdown report, output a structured JSON summary with all categories for programmatic consumption.
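A minimal Python sketch of assembling that summary; the field names are illustrative assumptions, not a schema the skill prescribes:

```python
import json


def build_summary(results: dict) -> str:
    """Serialize the 10-category analysis into a JSON string."""
    summary = {
        "scorecard": results.get("scorecard", {}),
        "time_breakdown": results.get("time_breakdown", {}),
        "communication": results.get("communication", {}),
        "mistakes": results.get("mistakes", []),
        "positives": results.get("positives", []),
        "knowledge_gaps": results.get("knowledge_gaps", []),
        "behavioral": results.get("behavioral", {}),
        "factual_claims": results.get("factual_claims", []),
        "action_items": results.get("action_items", []),
        "interviewer_quality": results.get("interviewer_quality", {}),
    }
    return json.dumps(summary, indent=2)
```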
### Assets

- `assets/sample_transcript.md` - Example transcript for testing
- `assets/sample_output.md` - Example analysis output
### References

- `references/analyzer-prompt.md` - Portable prompt for any LLM
- `references/confidence-scoring.md` - Confidence methodology
# README.md
## Transcription Analyzer
Multi-agent mock interview transcript analysis with confidence-scored, evidence-backed insights across 10 categories.
Built on the Agent Skills open standard - works with Claude Code, Cursor, Gemini CLI, OpenAI Codex, VS Code, and 20+ other AI coding tools.
### Supported Platforms
This skill follows the Agent Skills specification and works with:
| Platform | Status |
|---|---|
| Claude Code | ✅ |
| Claude.ai | ✅ |
| Cursor | ✅ |
| VS Code (Copilot) | ✅ |
| Gemini CLI | ✅ |
| OpenAI Codex | ✅ |
| Roo Code | ✅ |
| Goose | ✅ |
| Amp | ✅ |
| See all 26+ tools | ✅ |
### Architecture

```mermaid
flowchart TB
    subgraph Input
        T[Transcript File]
    end
    subgraph Director["Director Agent"]
        V["Validate & Detect Type"]
        S[Synthesize Results]
    end
    subgraph Analysts["Parallel Analyst Agents"]
        A1[Strengths Agent]
        A2[Mistakes Agent]
        A3[Behavioral Agent]
        A4[Factual Agent]
    end
    subgraph Output
        R[Unified Report]
        J[JSON Summary]
    end
    T --> V
    V --> A1 & A2 & A3 & A4
    A1 --> S
    A2 --> S
    A3 --> S
    A4 --> S
    S --> R & J
```
### Why Multi-Agent?

Single-agent analysis suffers from perspective bias: once an LLM forms an initial impression, it tends to confirm it. Our multi-agent approach:
| Agent | Perspective | Prevents |
|---|---|---|
| Strengths Agent | Optimistic - finds positives | Missing wins, underselling candidate |
| Mistakes Agent | Critical - finds errors | Glossing over problems |
| Behavioral Agent | Leadership lens - Staff+ signals | Missing seniority indicators |
| Factual Agent | Accuracy checker - verifies claims | Accepting wrong statements |
The Director synthesizes these perspectives, cross-validates conflicts, and produces a balanced report.
### Installation

#### Claude Code / Claude.ai

```bash
# Clone and install
git clone https://github.com/vishnujayvel/transcription-analyzer.git
cp -r transcription-analyzer ~/.claude/skills/
```

Or via the plugin marketplace (when published):

```text
/plugin install transcription-analyzer
```
#### Cursor / VS Code

Copy the skill folder into your workspace:

```bash
git clone https://github.com/vishnujayvel/transcription-analyzer.git
cp -r transcription-analyzer .cursor/skills/
# or
cp -r transcription-analyzer .vscode/skills/
```
#### Gemini CLI

```bash
git clone https://github.com/vishnujayvel/transcription-analyzer.git
# Gemini CLI auto-discovers skills in the current directory
```
#### Any Agent Skills-Compatible Tool

The skill follows the Agent Skills specification. Check your tool's documentation for skill installation.

#### Manual (Any LLM)

Copy `references/analyzer-prompt.md`, paste your transcript at the end, and send it to any LLM.
### Usage

```text
# In any compatible tool
analyze my transcript

# Or with a file path
transcription-analyzer /path/to/transcript.md

# Or natural language
review my mock interview
```
### How It Works

#### Analysis Flow

```mermaid
sequenceDiagram
    participant U as User
    participant D as Director
    participant S as Strengths Agent
    participant M as Mistakes Agent
    participant B as Behavioral Agent
    participant F as Factual Agent
    U->>D: Provide transcript
    D->>D: Validate & detect interview type
    par Parallel Analysis
        D->>S: Find positives
        D->>M: Find mistakes
        D->>B: Assess Staff+ signals
        D->>F: Verify technical claims
    end
    S-->>D: Positives JSON
    M-->>D: Mistakes JSON
    B-->>D: Behavioral JSON
    F-->>D: Factual JSON
    D->>D: Cross-validate & synthesize
    D->>U: Unified 10-category report
```
#### The 10-Category Framework

```mermaid
mindmap
  root((Transcript<br/>Analysis))
    Performance
      Scorecard
      Time Breakdown
    Communication
      Talk Ratio
      Filler Words
      Clarifying Qs
    Technical
      Mistakes
      Knowledge Gaps
      Factual Claims
    Soft Skills
      Positives
      Behavioral/Staff+
    Outcomes
      Action Items
      Interviewer Quality
```
| # | Category | What It Measures |
|---|---|---|
| 1 | Scorecard | Overall (1-10), level assessment, readiness % |
| 2 | Time Breakdown | Phase durations, pacing |
| 3 | Communication | Talk ratio, fillers, clarifying questions |
| 4 | Mistakes | Errors by severity (CRITICAL → LOW) |
| 5 | Positives | What went well, explicit praise |
| 6 | Knowledge Gaps | Missing knowledge (P0/P1/P2 priority) |
| 7 | Behavioral | Staff+ signals: leadership, trade-offs |
| 8 | Factual Claims | Technical accuracy verification |
| 9 | Action Items | Recommendations, next steps |
| 10 | Interviewer Quality | Feedback actionability |
### Anti-Hallucination Protocol

```mermaid
flowchart LR
    subgraph Levels["Confidence Levels"]
        H["HIGH 90%+"]
        M["MEDIUM 60-89%"]
        L["LOW 30-59%"]
        N["NOT_FOUND 0%"]
    end
    subgraph Evidence["Evidence Types"]
        E["EXPLICIT<br/>Direct quote"]
        I["INFERRED<br/>From patterns"]
    end
    H --- E
    M --- I
    L --- I
    N ---|No evidence| X[State explicitly]
```
Rules:
1. Never fabricate - If not in transcript, say "Not found"
2. Cite everything - Line numbers or direct quotes
3. Mark inference - [INFERRED] vs [EXPLICIT]
4. Aggregate properly - Overall = weighted average
### Directory Structure

Following the Agent Skills specification:

```text
transcription-analyzer/
├── SKILL.md                    # Required - skill definition with YAML frontmatter
├── LICENSE                     # MIT
├── README.md                   # This file
├── references/                 # Additional docs (loaded on demand)
│   ├── analyzer-prompt.md      # Portable prompt for any LLM
│   └── confidence-scoring.md   # Confidence methodology
└── assets/                     # Static resources
    ├── sample_transcript.md    # Example input
    └── sample_output.md        # Example output
```
### Sample Output

From analyzing `assets/sample_transcript.md` (URL shortener system design mock):

Scorecard excerpt:
| Metric | Score | Confidence | Evidence |
|--------|-------|------------|----------|
| Overall | 7/10 | HIGH 95% | "solid E6 level performance" (line 194) |
| Level | E6 | HIGH 92% | [EXPLICIT] Direct statement from interviewer |
| Readiness | 78% | MEDIUM 70% | 1 HIGH mistake, 2 P1 gaps |
Top positives found:
- Back-of-envelope calculations [HIGH 98%] - "your calculations were excellent"
- Self-correction ability [HIGH 95%] - "shows good self-awareness"
- Access pattern thinking [HIGH 90%] - "I like how you're thinking about access patterns"
Key mistake identified:
- Conflated consistent hashing with DB partitioning [HIGH 92%]
- "consistent hashing...typically for caches, not database sharding" (line 190)
Multi-agent cross-validation:
- Strengths Agent found 7 positives with evidence
- Mistakes Agent found 1 HIGH, 1 MEDIUM, 1 LOW severity issue
- Factual Agent verified 2 correct claims, flagged 1 wrong
- Synthesis: Self-correction on PostgreSQL noted as positive recovery pattern
### Contributing
Areas for contribution:
- [ ] Additional interview type detection (ML/AI interviews)
- [ ] Coding interview specific prompts
- [ ] Behavioral interview deep-dive
- [ ] Non-English transcript support
- [ ] Web UI for non-CLI users
### Agent Skills Specification

This skill implements the Agent Skills open standard:
- SKILL.md with the required YAML frontmatter (`name`, `description`)
- Progressive disclosure - metadata loaded first, full instructions on activation
- Portable - works across 26+ AI coding tools
- Self-contained - no external dependencies
Learn more: agentskills.io/specification
### License
MIT License - see LICENSE
Built with the philosophy that LLM insights should be verifiable, not just plausible, and that multiple perspectives reduce bias.
# Supported AI Coding Agents
This skill is compatible with the SKILL.md standard and works with all major AI coding agents.
Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.