Build production-ready monitoring, logging, and tracing systems.
Observability guidelines for distributed systems using OpenTelemetry, tracing, metrics, and structured logging
OpenTelemetry observability - use for distributed tracing, metrics, instrumentation, Sentry integration, and monitoring
Implement request logging, tracing, and observability. Use for debugging, monitoring, and production observability.
Observability patterns for Python applications. Triggers on: logging, metrics, tracing, opentelemetry, prometheus, observability, monitoring, structlog, correlation id.
Senior Site Reliability Engineer & Debug Architect. Expert in AI-assisted observability, distributed tracing, and autonomous incident remediation in 2026.
Systematic debugging methodology with 4-phase process, root cause tracing, and elite observability standards. No fixes without investigation.
Operate and debug Aquaria Cloudflare Worker + Workflows deployment. Deploy safely run health checks inspect traces restart stuck instances apply fixes.
Use when setting up monitoring systems, logging, metrics, tracing, or alerting. Invoke for dashboards, Prometheus/Grafana, load testing, profiling, capacity planning.
Use when setting up monitoring systems, logging, metrics, tracing, or alerting. Invoke for dashboards, Prometheus/Grafana, load testing, profiling, capacity planning.
Create effective debugging prompts—include error messages, stack traces, expected vs actual behavior, logs, and attempted solutions
Datadog CLI for searching logs, querying metrics, tracing requests, and managing dashboards. Use this when debugging production issues or working with Datadog observability.
Use when building cloud-native apps. Keywords: kubernetes, k8s, docker, container, grpc, tonic, microservice, service mesh, observability, tracing, metrics, health check, cloud, deployment, 云原生, 微服务, 容器
Observability and SRE expert. Use when setting up monitoring, logging, tracing, defining SLOs, or managing incidents. Covers Prometheus, Grafana, OpenTelemetry, and incident response best practices.
Implement comprehensive observability for service meshes including distributed tracing, metrics, and visualization. Use when setting up mesh monitoring, debugging latency issues, or implementing...
Implement comprehensive observability for service meshes including distributed tracing, metrics, and visualization. Use when setting up mesh monitoring, debugging latency issues, or implementing...
Implement comprehensive observability for service meshes including distributed tracing, metrics, and visualization. Use when setting up mesh monitoring, debugging latency issues, or implementing...
Analyze implementation details, trace data flow, and explain technical workings with precise file:line references. Use when you need to understand HOW code works.
Diagnose and fix bugs using runtime execution traces. Use when debugging errors, analyzing failures, or finding root causes in Python, Node.js, or Java applications.
Use when investigating errors, analyzing stack traces, or finding root causes of unexpected behavior. Invoke for error investigation, troubleshooting, log analysis, root cause analysis.