Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or...
Testing framework
Expert digital data analytics consultant for designing and implementing data-driven growth strategies for mobile and digital applications. Use this skill when users need help with app analytics...
Simon Prince's comprehensive deep learning framework for understanding neural networks, architectures, and training.
Expert customer onboarding guidance for accelerating time-to-value and ensuring successful implementations. Use when designing onboarding programs, creating kickoff frameworks, building...
Structured reasoning for architectural decisions using First Principles Framework (Quint Code). Orchestrates ADI cycle (Abduction→Deduction→Induction→Audit→Decision) with evidence tracking and...
Build accessible, responsive, and performant frontend components with design system best practices, modern CSS, and framework-agnostic patterns.
Strategic Customer Success leadership guidance for CS org design, customer segmentation and tiering (tech touch, low touch, high touch), success metrics and KPIs (NRR, GRR, NPS, CSAT, CES),...
Trains large language models (2B-462B parameters) using NVIDIA Megatron-Core with advanced parallelism strategies. Use when training models with more than 1B parameters or when you need maximum GPU efficiency (47% MFU on...
Comprehensive quality auditing and evaluation of tools, frameworks, and systems against industry best practices, with detailed scoring across 12 critical dimensions.
Use when writing tests, creating test strategies, or building automation frameworks. Invoke for unit tests, integration tests, E2E, coverage analysis, performance testing, security testing.
BMAD master coordinator who orchestrates the entire AI-powered agile development workflow across multiple specialized agents and phases.
Implement proven sales methodologies (MEDDIC, BANT, Sandler, Challenger, SPIN) across your team. Generate framework-specific questions, score deals, train reps, and enforce consistent...
Comprehensive toolkit for product managers including RICE prioritization, customer interview analysis, PRD templates, discovery frameworks, and go-to-market strategies. Use for feature...