Install this specific skill from the multi-skill repository:

```shell
npx skills add langfuse/skills --skill "langfuse-observability"
```
# Description
Instrument LLM applications with Langfuse tracing. Use when setting up Langfuse, adding observability to LLM calls, or auditing existing instrumentation.
# SKILL.md

---
name: langfuse-observability
description: Instrument LLM applications with Langfuse tracing. Use when setting up Langfuse, adding observability to LLM calls, or auditing existing instrumentation.
---
## Langfuse Observability
Instrument LLM applications with Langfuse tracing, following best practices and tailored to your use case.
### When to Use
- Setting up Langfuse in a new project
- Auditing existing Langfuse instrumentation
- Adding observability to LLM calls
### Workflow

#### 1. Assess Current State
Check the project:
- Is Langfuse SDK installed?
- What LLM frameworks are used? (OpenAI SDK, LangChain, LlamaIndex, Vercel AI SDK, etc.)
- Is there existing instrumentation?
**No integration yet:** Set up Langfuse using a framework integration if available. Integrations capture more context automatically and require less code than manual instrumentation.

**Integration exists:** Audit against the baseline requirements below.
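For an OpenAI-based project, setup can be a one-line import change. A minimal sketch using the Python SDK's OpenAI drop-in integration; the credential values and model name are placeholders:

```python
import os

# Langfuse reads its credentials from the environment (placeholder values here).
os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-lf-..."
os.environ["LANGFUSE_SECRET_KEY"] = "sk-lf-..."
os.environ["LANGFUSE_HOST"] = "https://cloud.langfuse.com"

# Drop-in replacement: import the client from langfuse.openai instead of openai.
from langfuse.openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)
# Model name, token usage, and the generation observation type are captured automatically.
```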
#### 2. Verify Baseline Requirements
Every trace should have these fundamentals:
| Requirement | Check | Why |
|---|---|---|
| Model name | Is the LLM model captured? | Enables model comparison and filtering |
| Token usage | Are input/output tokens tracked? | Enables automatic cost calculation |
| Good trace names | Are names descriptive? (`chat-response`, not `trace-1`) | Makes traces findable and filterable |
| Span hierarchy | Are multi-step operations nested properly? | Shows which step is slow or failing |
| Correct observation types | Are generations marked as generations? | Enables model-specific analytics |
| Sensitive data masked | Is PII/confidential data excluded or masked? | Prevents data leakage |
| Trace input/output | Does the trace capture the full data being processed as input, and the result as output? | Enables debugging and understanding what was processed |
Framework integrations (OpenAI, LangChain, etc.) handle model name, tokens, and observation types automatically. Prefer integrations over manual instrumentation.
Docs: https://langfuse.com/docs/tracing
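The naming, hierarchy, and observation-type rows above can be covered with the Python SDK's `@observe` decorator. A sketch assuming SDK v3; function bodies are elided and names are illustrative:

```python
from langfuse import observe

@observe()  # becomes a nested span under the calling trace
def retrieve_docs(query: str) -> list[str]:
    ...

@observe(as_type="generation")  # marked as a generation for model-specific analytics
def generate_answer(query: str, docs: list[str]) -> str:
    ...

@observe(name="chat-response")  # descriptive trace name, not "trace-1"
def handle_request(query: str) -> str:
    docs = retrieve_docs(query)
    return generate_answer(query, docs)
```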
#### 3. Explore Traces First
Once baseline instrumentation is working, encourage the user to explore their traces in the Langfuse UI before adding more context:
"Your traces are now appearing in Langfuse. Take a look at a few of them—see what data is being captured, what's useful, and what's missing. This will help us decide what additional context to add."
This helps the user:
- Understand what they're already getting
- Form opinions about what's missing
- Ask better questions about what they need
#### 4. Discover Additional Context Needs
Determine what additional instrumentation would be valuable. Infer from code when possible; ask only when unclear.
Infer from code:
| If you see in code... | Infer | Suggest |
|---|---|---|
| Conversation history, chat endpoints, message arrays | Multi-turn app | session_id |
| User authentication, `user_id` variables | User-aware app | `user_id` on traces |
| Multiple distinct endpoints/features | Multi-feature app | feature tag |
| Customer/tenant identifiers | Multi-tenant app | customer_id or tier tag |
| Feedback collection, ratings | Has user feedback | Capture as scores |
Only ask when not obvious from code:
- "How do you know when a response is good vs bad?" → Determines scoring approach
- "What would you want to filter by in a dashboard?" → Surfaces non-obvious tags
- "Are there different user segments you'd want to compare?" → Customer tiers, plans, etc.
Additions and their value:
| Addition | Why | Docs |
|---|---|---|
| `session_id` | Groups conversations together | https://langfuse.com/docs/tracing-features/sessions |
| `user_id` | Enables user filtering and cost attribution | https://langfuse.com/docs/tracing-features/users |
| User feedback score | Enables quality filtering and trends | https://langfuse.com/docs/scores/overview |
| `feature` tag | Per-feature analytics | https://langfuse.com/docs/tracing-features/tags |
| `customer_tier` tag | Cost/quality breakdown by segment | https://langfuse.com/docs/tracing-features/tags |
These are NOT baseline requirements—only add what's relevant based on inference or user input.
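When inference or the questions above surface a need, the context can be attached to the active trace. A sketch assuming Python SDK v3 (`update_current_trace`, `create_score`); the names and values are illustrative:

```python
from langfuse import observe, get_client

langfuse = get_client()

@observe(name="chat-response")
def handle_message(message: str, user_id: str, conversation_id: str) -> str:
    # Only attach what is relevant for this app.
    langfuse.update_current_trace(
        session_id=conversation_id,  # groups the conversation in the Sessions view
        user_id=user_id,             # per-user filtering and cost attribution
        tags=["chat"],               # e.g. a feature tag
    )
    ...

# If the app collects ratings, record them as scores on the trace, e.g.:
# langfuse.create_score(name="user-feedback", value=1.0, trace_id=some_trace_id)
```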
#### 5. Guide to UI
After adding context, point users to relevant UI features:
- Traces view: See individual requests
- Sessions view: See grouped conversations (if session_id added)
- Dashboard: Build filtered views using tags
- Scores: Filter by quality metrics
### Framework Integrations
Prefer these over manual instrumentation:
| Framework | Integration | Docs |
|---|---|---|
| OpenAI SDK | Drop-in replacement | https://langfuse.com/docs/integrations/openai |
| LangChain | Callback handler | https://langfuse.com/docs/integrations/langchain |
| LlamaIndex | Callback handler | https://langfuse.com/docs/integrations/llama-index |
| Vercel AI SDK | OpenTelemetry exporter | https://langfuse.com/docs/integrations/vercel-ai-sdk |
| LiteLLM | Callback or proxy | https://langfuse.com/docs/integrations/litellm |
Full list: https://langfuse.com/docs/integrations
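For LangChain, the integration is a callback handler passed per invocation. A sketch assuming SDK v3's import path (`langfuse.langchain`) and the `langchain-openai` package; earlier SDK versions use a different import path:

```python
from langfuse.langchain import CallbackHandler
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")
handler = CallbackHandler()

# Every chain/LLM step in this invocation becomes a span in the trace.
llm.invoke("Hello", config={"callbacks": [handler]})
```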
### Always Explain Why
When suggesting additions, explain the user benefit:
> "I recommend adding `session_id` to your traces.
> Why: This groups messages from the same conversation together.
> You'll be able to see full conversation flows in the Sessions view,
> making it much easier to debug multi-turn interactions.
> Learn more: https://langfuse.com/docs/tracing-features/sessions"
### Common Mistakes
| Mistake | Problem | Fix |
|---|---|---|
| No `flush()` in scripts | Traces never sent | Call `langfuse.flush()` before exit |
| Flat traces | Can't see which step failed | Use nested spans for distinct steps |
| Generic trace names | Hard to filter | Use descriptive names: chat-response, doc-summary |
| Logging sensitive data | Data leakage risk | Mask PII before tracing |
| Manual instrumentation when integration exists | More code, less context | Use framework integration |
| Langfuse import before env vars loaded | Langfuse initializes with missing/wrong credentials | Import Langfuse AFTER loading environment variables (e.g., after load_dotenv()) |
| Wrong import order with OpenAI | Langfuse can't patch the OpenAI client | Import Langfuse and call its setup BEFORE importing OpenAI client |
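The "mask PII before tracing" fix amounts to scrubbing data before it reaches the SDK. A minimal, self-contained sketch; the regexes are illustrative, not a complete PII strategy:

```python
import re

# Illustrative patterns only; production code needs a vetted masking strategy.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def mask_pii(text: str) -> str:
    """Mask emails and phone-like numbers before passing text to tracing."""
    return PHONE.sub("[PHONE]", EMAIL.sub("[EMAIL]", text))

print(mask_pii("Contact jane@example.com or +1 (555) 123-4567"))
# → Contact [EMAIL] or [PHONE]
```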
# Supported AI Coding Agents
This skill is compatible with the SKILL.md standard and works with all major AI coding agents.