vsvale

unity-catalog

# Install this skill:
npx skills add vsvale/skills_databricks_assistent_agent --skill "unity-catalog"


# Description

Manage and work with Unity Catalog in Databricks. Unity Catalog is the unified governance solution for data and AI assets. Use this skill when creating catalogs, schemas, tables, managing permissions, working with external locations, storage credentials, data sharing, or implementing Unity Catalog best practices.

# SKILL.md


name: unity-catalog
description: Manage and work with Unity Catalog in Databricks. Unity Catalog is the unified governance solution for data and AI assets. Use this skill when creating catalogs, schemas, tables, managing permissions, working with external locations, storage credentials, data sharing, or implementing Unity Catalog best practices.


Unity Catalog Management

Unity Catalog provides centralized governance for data and AI assets across Databricks workspaces. This skill helps you manage catalogs, schemas, tables, permissions, and other Unity Catalog resources.

Common Operations

Creating Catalogs

Catalogs are the top-level container in Unity Catalog's three-level namespace: catalog.schema.table.

-- Create a catalog
CREATE CATALOG IF NOT EXISTS my_catalog
COMMENT 'Production data catalog';

-- Create a catalog with a managed storage location
-- (the path must be covered by an existing external location)
CREATE CATALOG IF NOT EXISTS my_catalog
MANAGED LOCATION 's3://my-bucket/catalog/'
COMMENT 'Catalog with managed storage';

Creating Schemas

Schemas organize tables within a catalog.

-- Create a schema in a catalog
CREATE SCHEMA IF NOT EXISTS my_catalog.my_schema
COMMENT 'Schema for analytics tables';

-- Create a schema with a managed storage location
-- (the path must be covered by an existing external location)
CREATE SCHEMA IF NOT EXISTS my_catalog.my_schema
MANAGED LOCATION 's3://my-bucket/schema/'
COMMENT 'Schema with managed storage';
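
Once a catalog and schema exist, a session can set them as defaults so that unqualified table names resolve against them. The following is a minimal sketch, assuming the `my_catalog` and `my_schema` objects from the examples above already exist and that you hold USE CATALOG and USE SCHEMA on them:

```sql
-- Set the session defaults used for name resolution
USE CATALOG my_catalog;
USE SCHEMA my_schema;

-- Confirm which catalog and schema unqualified names resolve against
SELECT current_catalog(), current_schema();

-- Inspect the metadata recorded for each container
DESCRIBE CATALOG EXTENDED my_catalog;
DESCRIBE SCHEMA EXTENDED my_catalog.my_schema;
```

Fully qualified names (catalog.schema.table) keep working regardless of the session defaults, which makes them the safer choice in shared notebooks and jobs.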

Creating Tables

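Tables can be managed or external. For a managed table, Unity Catalog manages both the metadata and the underlying data files, so dropping the table also removes its data. For an external table, Unity Catalog governs access to the table, but the data files stay at a storage path you specify and are left in place when the table is dropped.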

-- Create a managed table
CREATE TABLE IF NOT EXISTS my_catalog.my_schema.my_table (
  id INT,
  name STRING,
  created_at TIMESTAMP
)
COMMENT 'Managed table example';

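-- Create an external table over existing Delta data
-- (with no column list, the schema is taken from the Delta table at LOCATION)
CREATE TABLE IF NOT EXISTS my_catalog.my_schema.external_table
LOCATION 's3://my-bucket/data/'
COMMENT 'External table example';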

-- Create a table from existing data (here, a legacy Hive metastore table)
CREATE TABLE IF NOT EXISTS my_catalog.my_schema.sales_data
AS SELECT * FROM hive_metastore.legacy_database.sales;
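
To check whether each table in a schema ended up managed or external, the catalog's information schema can be queried. This is a sketch that assumes the tables above were created in `my_catalog.my_schema`; it does not filter on `table_type`, because the exact set of values reported may vary with the runtime version:

```sql
-- List every table in the schema together with its recorded type
SELECT table_name, table_type
FROM my_catalog.information_schema.tables
WHERE table_schema = 'my_schema'
ORDER BY table_name;
```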

Managing Permissions

Unity Catalog uses a fine-grained permission model. Common privileges include:

  • Catalog level: USE CATALOG, CREATE SCHEMA, ALL PRIVILEGES
  • Schema level: USE SCHEMA, CREATE TABLE, CREATE FUNCTION, ALL PRIVILEGES
  • Table level: SELECT, MODIFY, ALL PRIVILEGES

-- Grant catalog privileges
-- (`user@example.com` is a placeholder; substitute a real user, group, or service principal)
GRANT USE CATALOG ON CATALOG my_catalog TO `user@example.com`;

-- Grant schema privileges
GRANT USE SCHEMA, CREATE TABLE ON SCHEMA my_catalog.my_schema TO `user@example.com`;

-- Grant table privileges
GRANT SELECT ON TABLE my_catalog.my_schema.my_table TO `user@example.com`;

-- Grant all privileges
GRANT ALL PRIVILEGES ON SCHEMA my_catalog.my_schema TO `user@example.com`;
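
Reading a table requires privileges on its parent containers as well as on the table itself. The sketch below shows one possible read-only grant chain for a group, followed by a REVOKE; `analysts` is a hypothetical group name, not something defined by this skill:

```sql
-- `analysts` is a hypothetical group name used for illustration
GRANT USE CATALOG ON CATALOG my_catalog TO `analysts`;
GRANT USE SCHEMA ON SCHEMA my_catalog.my_schema TO `analysts`;
GRANT SELECT ON TABLE my_catalog.my_schema.my_table TO `analysts`;

-- Remove the table-level privilege again when it is no longer needed
REVOKE SELECT ON TABLE my_catalog.my_schema.my_table FROM `analysts`;
```

Granting to a group rather than to individual users keeps the grant list short and lets membership changes happen outside SQL.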

External Locations and Storage Credentials

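External locations allow Unity Catalog to access data in external storage systems. Each external location pairs a cloud storage path with a storage credential that authorizes access to that path.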

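-- Create a storage credential (requires admin privileges)
-- Note: storage credentials are commonly created through Catalog Explorer,
-- the Databricks CLI, or Terraform; confirm that this SQL form is accepted
-- in your workspace before relying on it.
CREATE STORAGE CREDENTIAL IF NOT EXISTS my_credential
WITH AWS IAM ROLE 'arn:aws:iam::123456789:role/databricks-role';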

-- Create an external location
CREATE EXTERNAL LOCATION IF NOT EXISTS my_location
URL 's3://my-bucket/data/'
WITH (STORAGE CREDENTIAL my_credential);

-- Grant access to the external location
-- (`user@example.com` is a placeholder; substitute a real user, group, or service principal)
GRANT READ FILES ON EXTERNAL LOCATION my_location TO `user@example.com`;
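
After an external location exists, it helps to verify what it points at and who can use it. A brief sketch, assuming the `my_location` object from the example above; `data_engineers` is a hypothetical group name:

```sql
-- Show the URL and storage credential behind the external location
DESCRIBE EXTERNAL LOCATION my_location;

-- Allow a group to create external tables under this location
-- (`data_engineers` is a hypothetical group name)
GRANT CREATE EXTERNAL TABLE ON EXTERNAL LOCATION my_location TO `data_engineers`;

-- Review who holds privileges on the location
SHOW GRANTS ON EXTERNAL LOCATION my_location;
```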

Data Sharing

Unity Catalog supports Delta Sharing for secure data sharing.

-- Create a share
CREATE SHARE IF NOT EXISTS my_share
COMMENT 'Share for external partners';

-- Add table to share
ALTER SHARE my_share ADD TABLE my_catalog.my_schema.my_table;

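-- Grant access to the share (shares are granted to recipients, not users;
-- `my_recipient` is a placeholder for a recipient created with CREATE RECIPIENT)
GRANT SELECT ON SHARE my_share TO RECIPIENT my_recipient;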
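
A share is granted to a recipient object rather than to a workspace user, so the recipient has to exist before access is granted. The sketch below assumes the `my_share` share from the example above; `my_recipient` is a placeholder name:

```sql
-- Create the recipient that the share will be granted to
-- (`my_recipient` is a placeholder name)
CREATE RECIPIENT IF NOT EXISTS my_recipient
COMMENT 'External partner recipient';

-- List the objects currently included in the share
SHOW ALL IN SHARE my_share;

-- Review which recipients have access to the share
SHOW GRANTS ON SHARE my_share;
```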

Best Practices

  1. Naming Conventions
     • Use lowercase with underscores: my_catalog, analytics_schema
     • Be descriptive and consistent
     • Avoid special characters and spaces

  2. Organization
     • Group related schemas within catalogs
     • Use catalogs to separate environments (dev, staging, prod)
     • Organize by domain or team ownership

  3. Permissions
     • Follow principle of least privilege
     • Grant at the appropriate level (catalog, schema, or table)
     • Use groups for easier permission management
     • Document permission requirements

  4. Storage
     • Use managed storage for new tables when possible
     • Use external locations for existing data in cloud storage
     • Consider data retention and lifecycle policies

  5. Migration
     • Migrate from Hive metastore to Unity Catalog gradually
     • Test permissions and access patterns
     • Update applications and queries to use three-level namespace
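
Several of the practices above can be sketched in SQL. The catalog names, the `data_engineers` group, and the `sales_migrated` table are illustrative assumptions, and the clone step assumes that `hive_metastore.legacy_database.sales` exists and is a Delta table:

```sql
-- One catalog per environment (names are illustrative)
CREATE CATALOG IF NOT EXISTS dev COMMENT 'Development data';
CREATE CATALOG IF NOT EXISTS staging COMMENT 'Pre-production data';
CREATE CATALOG IF NOT EXISTS prod COMMENT 'Production data';

-- Hand ownership of a schema to a group rather than an individual
-- (`data_engineers` is a hypothetical group name)
ALTER SCHEMA my_catalog.my_schema OWNER TO `data_engineers`;

-- Copy a Delta table out of the Hive metastore as one step of a gradual migration
-- (assumes the source table exists and is stored as Delta)
CREATE TABLE IF NOT EXISTS my_catalog.my_schema.sales_migrated
DEEP CLONE hive_metastore.legacy_database.sales;
```

Cloning one table at a time leaves the Hive metastore source untouched, so permissions and queries can be tested against the new three-level name before consumers are switched over.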

Common Queries

-- List all catalogs
SHOW CATALOGS;

-- List schemas in a catalog
SHOW SCHEMAS IN my_catalog;

-- List tables in a schema
SHOW TABLES IN my_catalog.my_schema;

-- Describe a table
DESCRIBE EXTENDED my_catalog.my_schema.my_table;

-- Show grants on a catalog
SHOW GRANTS ON CATALOG my_catalog;

-- Show grants on a schema
SHOW GRANTS ON SCHEMA my_catalog.my_schema;

-- Show grants on a table
SHOW GRANTS ON TABLE my_catalog.my_schema.my_table;

-- List external locations
SHOW EXTERNAL LOCATIONS;

-- List storage credentials
SHOW STORAGE CREDENTIALS;
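
The SHOW commands return results for one object at a time. For filtering or joining, the same grant metadata can be read from the catalog's information schema. A minimal sketch, assuming the `my_table` example table from earlier sections:

```sql
-- List who holds which privilege on a table, including inherited grants
SELECT grantee, privilege_type, inherited_from
FROM my_catalog.information_schema.table_privileges
WHERE table_schema = 'my_schema'
  AND table_name = 'my_table'
ORDER BY grantee, privilege_type;
```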

Troubleshooting

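  • Permission denied errors: Check grants at every level. Reading a table requires USE CATALOG on the catalog, USE SCHEMA on the schema, and SELECT on the table
  • Table not found: Verify the three-level namespace format (catalog.schema.table), or confirm the current catalog and schema if the name is unqualified
  • External location access: Ensure the storage credential is configured correctly and that the principal holds the needed privilege (for example, READ FILES) on the external location
  • Migration issues: Check compatibility between Hive metastore and Unity Catalog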
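
The first two checks can be worked through as a short diagnostic sequence. This is a sketch rather than a fixed procedure; `user@example.com` is a placeholder for the affected principal:

```sql
-- Who is running the query, and where do unqualified names resolve?
SELECT current_user(), current_catalog(), current_schema();

-- Check each level of the hierarchy for the affected principal
-- (`user@example.com` is a placeholder principal)
SHOW GRANTS `user@example.com` ON CATALOG my_catalog;
SHOW GRANTS `user@example.com` ON SCHEMA my_catalog.my_schema;
SHOW GRANTS `user@example.com` ON TABLE my_catalog.my_schema.my_table;

-- Confirm the table exists under the expected three-level name
SHOW TABLES IN my_catalog.my_schema LIKE 'my_table';
```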

For more details, consult references/REFERENCES.md.
