Refactor high-complexity React components in Dify frontend. Use when `pnpm analyze-component...
npx skills add dbt-labs/dbt-agent-skills --skill "building-dbt-semantic-layer"
Install specific skill from multi-skill repository
# Description
Use when creating or modifying dbt Semantic Layer components including semantic models, metrics, and dimensions leveraging MetricFlow.
# SKILL.md
name: building-dbt-semantic-layer
description: Use when creating or modifying dbt Semantic Layer components including semantic models, metrics, and dimensions leveraging MetricFlow.
user-invocable: false
metadata:
author: dbt-labs
Building the dbt Semantic Layer
This skill guides the creation and modification of dbt Semantic Layer components: semantic models, entities, dimensions, and metrics.
- Semantic models - Metadata configurations that define how dbt models map to business concepts
- Entities - Keys that identify the grain of your data and enable joins between semantic models
- Dimensions - Attributes used to filter or group metrics (categorical or time-based)
- Metrics - Business calculations defined on top of semantic models (e.g., revenue, order count)
Additional Resources
- Time Spine Setup - Required for time-based metrics and aggregations
- Best Practices - Design patterns and recommendations for semantic models and metrics
[!NOTE]
This skill contains guidance for the new dbt semantic layer YAML spec, valid for dbt 1.12.0 and above. If the user is using a different version of dbt, you can use the migration guide to help them migrate to the new spec and add new components to their semantic layer. Ask the user if they want to migrate to the new spec before proceeding.
Entry Points
Users may ask questions related to building metrics with the semantic layer in a few different ways. Here are the common entry points to look out for:
Business Question First
When the user describes a metric or analysis need (e.g., "I need to track customer lifetime value by segment"):
- Search project models or existing semantic models by name, description, and column names for relevant candidates
- Present top matches with brief context (model name, description, key columns)
- User confirms which model(s) / semantic models to build on / extend / update
- Work backwards from users need to define entities, dimensions, and metrics
Model First
When the user specifies a model to expose (e.g., "Add semantic layer to customers model"):
- Read the model SQL and existing YAML config
- Identify the grain (primary key / entity)
- Suggest dimensions based on column types and names
- Ask what metrics the user wants to define
Both paths converge on the same implementation workflow.
Open Ended
User asks to build the semantic layer for a project or models that are not specified. ("Build the semantic layer for my project")
- Identify high importance models in the project
- Suggest some metrics and dimensions for those models
- Ask the user if they want to create more metrics and dimensions or if there are any other models they want to build the semantic layer on
Implementation Workflow
Step 1: Enable Semantic Model
Decide which dbt model to build the semantic model on. Add semantic_model: block to the model's YAML with enabled: true. Set agg_time_dimension to the primary time column. If the model does not have a time column, warn user that the model cannot contain metrics that are time-based. Ask the user if they want to create a derived time dimension.
Example YAML:
models:
- name: orders
semantic_model:
enabled: true # enable the semantic model
agg_time_dimension: ordered_at # set the primary time column (this is a column in the dbt model)
Step 2: Define Entities
Identify the primary key column (check for _id suffix, uniqueness tests, or explicit config). Add entity: \n\t type: primary block to that column's entry. If the model has foreign keys, define those as entity: type: foreign.
models:
- name: orders
semantic_model:
enabled: true # enable the semantic model
agg_time_dimension: ordered_at # set the primary time column (this is a column in the dbt model)
columns:
- name: order_id # this is the primary key column of the model
entity:
type: primary
name: order
- name: customer_id # this is a foreign key column of the model
entity:
type: foreign
name: customer
Step 3: Define Dimensions
Scan columns for dimension candidates. These would be useful columns to group by when querying a metrics:
- Time columns → dimension: type: time with appropriate granularity (set at the column level)
- Categorical columns (strings, booleans) → dimension: type: categorical
Present suggested dimensions to user for confirmation.
Example YAML:
models:
- name: orders
semantic_model:
enabled: true # enable the semantic model
agg_time_dimension: ordered_at # set the primary time column (this is a column in the dbt model)
columns:
- name: order_id
entity:
type: primary
name: order
- name: customer_id
entity:
type: foreign
name: customer
- name: ordered_at
granularity: day # set the granularity of the time column
dimension:
type: time
- name: order_status
dimension:
type: categorical
Step 4: Define Metrics
Create some simple metrics for the model. For each metric, collect: name, description, label, aggregation type, and expression. Support metric types: simple, derived, cumulative, conversion, ratio.
models:
- name: orders
semantic_model:
enabled: true # enable the semantic model
agg_time_dimension: ordered_at # set the primary time column (this is a column in the dbt model)
columns:
- name: order_id
entity:
type: primary
name: order
- name: customer_id
entity:
type: foreign
name: customer
- name: ordered_at
granularity: day # set the granularity of the time column
dimension:
type: time
- name: order_status
dimension:
type: categorical
metrics:
- name: order_count
type: simple
agg: count
expr: 1
- name: total_revenue
type: simple
agg: sum
expr: amount
- name: average_order_value
type: simple
agg: avg
expr: amount
Validation
After writing YAML, validate in two stages:
- Parse Validation: Run
dbt parseto confirm YAML syntax and references - Semantic Layer Validation:
dbt sl validate(dbt Cloud CLI / dbt Fusion)mf validate-configs(MetricFlow CLI)
Do not consider work complete until both validations pass.
Editing Existing Components
When modifying existing semantic layer config:
- Check if the model's YAML already has
semantic_model:block - Read existing entities, dimensions, and metrics before making changes
- Preserve all existing YAML content not being modified
- After edits, run full validation to ensure nothing broke
YAML Format Reference
Derived Dimensions and Entities
if the user wants to create a derived dimension or entity that is not a column within the dbt model, then we can use the derived_semantics block.
derived_semantics:
dimensions:
- name: order_size_bucket
type: categorical
expr: case when amount > 100 then 'large' else 'small' end
label: "Order Size"
entities:
- name: order_customer_key
type: foreign
expr: "order_id || '-' || customer_id"
Advanced Metric Examples
All simple metrics are defined at the model level under the metrics key. Advanced metrics that refer to simple metrics within the same model are defined within a model's YAML entry the models.metrics key. Advanced metrics that refer to simple metrics across different models are defined at the top level under the metrics key.
Derived Metrics
- name: revenue_per_order
type: derived
description: Average revenue per order
label: Revenue per Order
expr: total_revenue / total_orders
input_metrics:
- name: total_revenue
- name: total_orders
# With offset window
- name: revenue_growth
type: derived
expr: total_revenue - revenue_last_week
input_metrics:
- name: total_revenue
- name: total_revenue
alias: revenue_last_week
offset_window: 1 week
filter: "{{ Dimension('order__status') }} = 'completed'"
Cumulative Metrics
- name: cumulative_revenue
type: cumulative
description: Running total of revenue
label: Cumulative Revenue
input_metric: total_revenue
grain_to_date: week
period_agg: first
# With window
- name: trailing_7d_revenue
type: cumulative
input_metric: total_revenue
window: 7 days
Ratio Metrics
- name: conversion_rate
type: ratio
description: Orders divided by visits
label: Conversion Rate
numerator: total_orders
denominator: total_visits
# With filters
- name: premium_conversion_rate
type: ratio
numerator:
name: total_orders
filter: "{{ Dimension('order__customer_segment') }} = 'premium'"
alias: premium_orders
denominator: total_visits
Conversion Metrics
- name: signup_to_purchase
type: conversion
description: Rate of signups converting to purchase
label: Signup to Purchase
entity: customer
calculation: conversion_rate
base_metric: signups
conversion_metric: purchases
window: 7 days
constant_properties:
- base_property: signup_channel
conversion_property: purchase_channel
Top-level Metrics (Cross-model)
# For metrics depending on multiple semantic models
metrics:
- name: cross_model_ratio
type: ratio
numerator:
name: metric_from_model_a
filter: "{{ Dimension('entity__dim') }} > 10"
denominator:
name: metric_from_model_b
config:
group: example_group
tags:
- example_tag
meta:
owner: "@someone"
Filtering Metrics
Filters can be added to simple metrics or metric inputs to advanced metrics. The format of a filters is a Jinja template that can reference entities, dimensions, and metrics, a boolean operator, and a value. Ensure the value matches the type of the column being filtered.
Examples
filter: |
{{ Entity('entity_name') }} = 'value'
filter: |
{{ Dimension('primary_entity__dimension_name') }} > 100
filter: |
{{ TimeDimension('time_dimension', 'granularity') }} > '2026-01-01'
filter: |
{{ Metric('metric_name', group_by=['entity_name']) }} > 100
Key Formatting Rules
semantic_model:block at model level withenabled: trueagg_time_dimension:at model level (not nested undersemantic_model)entity:anddimension:on columns (can use shorthand or full form)granularity:required at column level for time dimensionsmetrics:array at model level for single-model metrics- Top-level
metrics:key for cross-model metrics (derived, ratio, cumulative, conversion only)
Best Practices
- Start with entities - Identify the grain before defining dimensions or metrics
- Use shorthand where possible -
entity: primaryinstead of full nested form for simple cases - Name metrics for business users - Use clear
labelvalues for non-technical users - Keep metrics close to their data - Simple metrics on their semantic model; top-level only for cross-model
- Set appropriate granularity - Match the actual data grain (usually
day)
Common Pitfalls
| Pitfall | Fix |
|---|---|
Missing agg_time_dimension |
Every semantic model needs a default time dimension |
granularity inside dimension: block |
Must be at column level |
Both entity and dimension on same column |
A column can only be one or the other |
Simple metrics in top-level metrics: |
Top-level is only for cross-model metrics |
Using window and grain_to_date together |
Cumulative metrics can only have one |
Missing input_metrics on derived metrics |
Must list metrics used in expr |
# Supported AI Coding Agents
This skill is compatible with the SKILL.md standard and works with all major AI coding agents:
Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.