Use when adding new error messages to React, or seeing "unknown error code" warnings.
npx skills add Mindrally/skills --skill "observability-guidelines"
Install specific skill from multi-skill repository
# Description
Observability guidelines for distributed systems using OpenTelemetry, tracing, metrics, and structured logging
# SKILL.md
name: observability-guidelines
description: Observability guidelines for distributed systems using OpenTelemetry, tracing, metrics, and structured logging
Observability Guidelines
Apply these observability principles to ensure comprehensive visibility into distributed systems and microservices.
Core Observability Principles
- Guide the development of idiomatic, maintainable, and high-performance code with built-in observability
- Enforce modular design and separation of concerns through Clean Architecture
- Promote test-driven development and robust observability from the start
OpenTelemetry Integration
- Use OpenTelemetry for distributed tracing, metrics, and structured logging
- Start and propagate tracing spans across all service boundaries
- Use otel.Tracer for creating spans and otel.Meter for collecting metrics
- Export data to OpenTelemetry Collector, Jaeger, or Prometheus
- Configure appropriate sampling rates for production environments
Distributed Tracing
- Trace all incoming requests and propagate context through internal calls
- Use middleware to instrument HTTP and gRPC endpoints automatically
- Include trace context in all downstream service calls
- Create child spans for significant operations within a service
- Add relevant attributes to spans for debugging and analysis
Metrics Collection
Monitor these key metrics across all services:
- Request latency: Track p50, p90, p95, and p99 percentiles
- Throughput: Measure requests per second by endpoint
- Error rate: Track 4xx and 5xx responses separately
- Resource usage: Monitor CPU, memory, disk, and network utilization
- Custom business metrics: Track domain-specific KPIs
Structured Logging
- Include unique request IDs and trace context in all logs for correlation
- Use structured logging formats (JSON) for machine parseability
- Include relevant context: timestamp, service name, trace ID, span ID
- Log at appropriate levels: DEBUG, INFO, WARN, ERROR
- Avoid logging sensitive information (PII, credentials)
Architecture Patterns
- Apply Clean Architecture with handlers, services, repositories, and domain models
- Use domain-driven design principles for clear boundaries
- Prioritize interface-driven development with explicit dependency injection
- Prefer composition over inheritance; favor small, purpose-specific interfaces
Correlation and Context
- Propagate context through the entire request lifecycle
- Use correlation IDs for request tracking across services
- Include service version and deployment information in telemetry
- Tag traces with relevant business context for filtering
- Enable trace-to-log and log-to-trace correlation
Alerting and Dashboards
- Create dashboards for service health and business metrics
- Set up alerts based on SLOs and error budgets
- Use anomaly detection for proactive issue identification
- Document runbooks for common alert scenarios
- Review and tune alerts regularly to reduce noise
Instrumentation Best Practices
- Instrument at service boundaries (entry/exit points)
- Add custom spans for database operations and external calls
- Include relevant attributes (user ID, request type, etc.)
- Avoid over-instrumentation that creates noise
- Use semantic conventions for consistent attribute naming
Production Considerations
- Configure appropriate sampling rates to balance visibility and cost
- Use head-based sampling for consistent trace capture
- Implement tail-based sampling for capturing errors
- Set retention policies based on debugging needs
- Monitor observability infrastructure health
# Supported AI Coding Agents
This skill is compatible with the SKILL.md standard and works with all major AI coding agents:
Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.