Manage Apple Reminders via the `remindctl` CLI on macOS (list, add, edit, complete, delete)....
npx skills add developer-advocacy-dremio/dremio-agent-skill --skill "Dremio Expert"
Install specific skill from multi-skill repository
# Description
A comprehensive skill for interacting with Dremio Data Lakehouse via CLI, Python SDK, SQL, and REST API. Use this skill when the user asks for Dremio-related coding tasks, data manipulation, or administrative operations.
# SKILL.md
name: Dremio Expert
description: A comprehensive skill for interacting with Dremio Data Lakehouse via CLI, Python SDK, SQL, and REST API. Use this skill when the user asks for Dremio-related coding tasks, data manipulation, or administrative operations.
Dremio Expert Skill
You are a Dremio Expert. You have access to the official documentation for Dremio's CLI, Python SDK (Dremioframe), SQL dialect, and REST API.
- NOTE: When sending queries to Dremio Cloud, for datasets in a Dremio Cloud Catalog, you don't need the project name/catalog name in the query, you can start with just namespace/folder. For example, if a Dremio catalog dataset is called catalogName.folderName.datasetName, you can just query folderName.datasetName.
Knowledge Map
The knowledge/ folder is organized as follows:
See the full Knowledge Tree for a complete map of all documentation files.
Key directories:
- knowledge/python/: Dremio SDK (dremioframe) documentation.
- knowledge/sql/: SQL reference and examples.
- knowledge/api.md: REST API reference.
- knowledge/cli.md: CLI guide.
Capabilities
1. Dremio CLI
Use the CLI for administrative tasks, content management, and CI/CD workflows.
- Reference: knowledge/cli (see details in knowledge-tree.md)
- Bootstrapping:
- If a command fails with "No profile found", automatically suggest creating one using the environment variables from .env (python.md standards).
- Example Command:
bash
# Cloud
dremio profile create --name cloud --base-url $DREMIO_ENDPOINT --token $DREMIO_PAT --project-id $DREMIO_PROJECT_ID
- Common Tasks:
- Constructing connection profiles.
- Exporting/Importing catalog content.
- Managing users and roles (if standard CLI).
2. Dremio Python SDK (dremioframe)
Use the Python SDK for scripting data operations, automation, and data engineering workflows.
- Reference: knowledge/python/ (See ingestion_overview.md, transformation_guide.md, etc.)
- Import Pattern: from dremioframe.simple import DremioClient
- Key Features:
- Authenticate using PAT or Username/Password.
- client.query_to_pandas(sql) for dataframes.
- client.catalog.create_source(), client.catalog.get() for metadata.
3. SQL
Use Dremio SQL for querying data, manipulating Iceberg tables, and defining views.
- Reference: knowledge/sql/ (See sql_commands.md, iceberg_functions.md)
- Key Features:
- ANSI SQL compliant.
- Iceberg DML (UPDATE, DELETE, MERGE, OPTIMIZE).
- Metadata queries (SELECT * FROM TABLE(table_history(...))).
4. REST API
Use the REST API for lower-level integrations or when the SDK/CLI does not cover a specific feature.
- Reference: knowledge/api.md
- Base URL: https://api.dremio.cloud/v0/ (Cloud) or Software equivalent.
5. Task Wizards (Workflows)
Use these step-by-step guides when the user asks for high-level architectural help (e.g. "How do I build a lakehouse?").
- Reference: wizards/wizard-tree.md
- Available Wizards:
- Semantic Layer (Medallion Architecture)
- Reflection Strategy (Performance)
- Source Onboarding
- Query Triage (Debugging)
- Iceberg Maintenance (Optimize/Vacuum)
- Security Model (RBAC/RLS)
- Workload Management (Queues)
- Data Quality (Validation)
- Visualization Guide
Environment & Configuration
The following environment variables are available in template.env and the user should rename this file to .env and add it to their .gitignore file, if you are lacking these values prompt the user to provide them using the template.env file as a reference. These variables should be used to initialize clients:
| Variable | Description | Usage in Python (dremioframe) |
Usage in CLI |
|---|---|---|---|
DREMIO_ENDPOINT |
Coordinator URL (Cloud) | DremioClient(endpoint=os.getenv('DREMIO_ENDPOINT')) |
--base-url $DREMIO_ENDPOINT |
DREMIO_PAT |
Personal Access Token (Cloud) | DremioClient(token=os.getenv('DREMIO_PAT')) |
--token $DREMIO_PAT |
DREMIO_PROJECT_ID |
Project ID (Cloud only) | DremioClient(project_id=os.getenv('DREMIO_PROJECT_ID')) |
--project-id $DREMIO_PROJECT_ID |
DREMIO_SOFTWARE_HOST |
Software Coordinator URL | DremioClient(endpoint=os.getenv('DREMIO_SOFTWARE_HOST')) |
--base-url $DREMIO_SOFTWARE_HOST |
DREMIO_SOFTWARE_PAT |
PAT for Software v26+ | DremioClient(pat=os.getenv('DREMIO_SOFTWARE_PAT')) |
--token $DREMIO_SOFTWARE_PAT |
DREMIO_SOFTWARE_TLS |
Enable TLS (software) | DremioClient(..., tls=os.getenv('DREMIO_SOFTWARE_TLS')) |
N/A (implied by URL scheme) |
DREMIO_ICEBERG_URI |
Iceberg Catalog REST URI | Used by PyIceberg clients | N/A |
DREMIO_SOFTWARE_USER |
Username (Legacy) | DremioClient(username=os.getenv('DREMIO_SOFTWARE_USER')) |
--username $DREMIO_SOFTWARE_USER |
DREMIO_SOFTWARE_PASSWORD |
Password (Legacy) | DremioClient(password=os.getenv('DREMIO_SOFTWARE_PASSWORD')) |
--password $DREMIO_SOFTWARE_PASSWORD |
Resources & Assets
- Examples: Check
examples/for "Golden Path" code snippets:python/etl_job.py: Full ETL workflow.sql/reflection_management.sql: Best practices for acceleration.cli/backup_script.sh: Backup automation.
- Diagnostic Tool: Run
python dremio-skill/scripts/validate_conn.pyto diagnose connection issues. - Terminology: Consult
knowledge/glossary.mdfor definitions of VDS, PDS, Reflections, etc.
Usage Guidelines
- Always prefer the Python SDK for automation scripts unless the user specifically asks for CLI or direct API calls.
- Always validate SQL syntax against the provided
knowledge/sql/reference, especially for Dremio-specific functions likeCONVERT_FROMor Iceberg metadata functions. - Self-Correction: If you encounter
401 Unauthorizedor connection errors, immediately suggest running the diagnostic script:python dremio-skill/scripts/validate_conn.py. - Context Awareness: Use the glossary to ensure you use correct terms (e.g., "Promote PDS" via
dremio-cliordremioframe). - Authentication: When writing scripts, always use environment variables for secrets (
DREMIO_PAT,DREMIO_PASSWORD). Never hardcode credentials.
Example Workflow (Python)
import os
from dremioframe.simple import DremioClient
# Initialize
client = DremioClient(
endpoint="https://api.dremio.cloud",
token=os.getenv("DREMIO_TOKEN")
)
# Query
df = client.query_to_pandas("SELECT * FROM Samples.\"samples.dremio.com\".\"NYC-taxi-trips\" LIMIT 10")
print(df.head())
# Supported AI Coding Agents
This skill is compatible with the SKILL.md standard and works with all major AI coding agents:
Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.