acedergren

OCI Oracle DBA and DevOps

1
1
# Install this skill:
npx skills add acedergren/oci-agent-skills --skill "OCI Oracle DBA and DevOps"

Install specific skill from multi-skill repository

# Description

Use when managing Oracle Autonomous Database on OCI, troubleshooting performance issues, optimizing costs, or implementing HA/DR. Covers ADB-specific gotchas, cost traps, SQL_ID debugging workflows, auto-scaling behavior, and version differences (19c/21c/23ai/26ai). Keywords: slow query, SQL_ID, ECPU scaling, stopped ADB cost, ADMIN user restrictions, auto-scaling limits.

# SKILL.md


name: OCI Oracle DBA and DevOps
description: Use when managing Oracle Autonomous Database on OCI, troubleshooting performance issues, optimizing costs, or implementing HA/DR. Covers ADB-specific gotchas, cost traps, SQL_ID debugging workflows, auto-scaling behavior, and version differences (19c/21c/23ai/26ai). Keywords: slow query, SQL_ID, ECPU scaling, stopped ADB cost, ADMIN user restrictions, auto-scaling limits.
version: 2.0.0


OCI Oracle DBA - Expert Knowledge

πŸ—οΈ Use OCI Landing Zone Terraform Modules

Don't reinvent the wheel. Use oracle-terraform-modules/landing-zone for database infrastructure.

Landing Zone solves:
- ❌ Bad Practice #1: Generic compartments (Landing Zone creates dedicated Database/Security compartments for ADB organization)
- ❌ Bad Practice #9: Public database endpoints (Landing Zone Security Zones enforce private endpoints only)
- ❌ Bad Practice #10: No monitoring (Landing Zone auto-configures ADB performance alarms, slow query notifications)

This skill provides: ADB-specific operations, performance tuning, and cost optimization for databases deployed WITHIN a Landing Zone.


⚠️ OCI CLI/API Knowledge Gap

You don't know OCI CLI commands or OCI API structure.

Your training data has limited and outdated knowledge of:
- OCI CLI syntax and parameters (updates monthly)
- OCI API endpoints and request/response formats
- Autonomous Database CLI operations (oci db autonomous-database)
- OCI service-specific commands and flags
- Latest OCI features and API changes

When OCI operations are needed:
1. Use exact CLI commands from this skill's references
2. Do NOT guess OCI CLI syntax or parameters
3. Do NOT assume API endpoint structures
4. Load oci-cli-adb.md for ADB management operations

What you DO know:
- Oracle Database internals (SQL, PL/SQL, performance tuning)
- General cloud concepts
- Database administration principles

This skill bridges the gap by providing current OCI CLI/API commands for Autonomous Database operations.


You are an Oracle Autonomous Database expert on OCI. This skill provides knowledge Claude lacks: ADB-specific behaviors, cost traps, SQL_ID debugging workflows, auto-scaling gotchas, and production anti-patterns.

NEVER Do This

❌ NEVER use ADMIN user in application code

-- WRONG - application uses ADMIN credentials
app_config = {'user': 'ADMIN', 'password': admin_pwd}

-- RIGHT - create app-specific user with least privilege
CREATE USER app_user IDENTIFIED BY :password;
GRANT CREATE SESSION, SELECT ON schema.* TO app_user;

Why critical: ADMIN has full database control, audit trail shows all actions as ADMIN (no accountability), ADMIN can't be locked/disabled without breaking automation.

❌ NEVER scale without checking wait events first

-- WRONG decision path: "CPU is high β†’ scale ECPUs"

-- RIGHT decision path:
1. Check v$system_event for top wait events
2. High 'CPU time' wait β†’ Bad SQL, need optimization (DON'T scale)
3. High 'db file sequential read' β†’ Missing indexes (DON'T scale)
4. High 'User I/O' sustained β†’ Scale storage IOPS OR auto-scaling
5. Only scale ECPUs if: CPU wait sustained + SQL already optimized

Cost impact: Scaling 2β†’4 ECPU = $526/month increase. If root cause is bad SQL, wasted $526/month.

❌ NEVER assume stopped ADB = zero cost

Stopped Autonomous Database charges:
βœ“ Compute: $0 (stopped)
βœ— Storage: $0.025/GB/month continues
βœ— Backups: Retention charges continue

Example: 1TB ADB stopped for 30 days
Storage: 1000 GB Γ— $0.025 = $25/month (CHARGED!)

Better for long-term idle (>60 days):
1. Export data (Data Pump)
2. Delete ADB
3. Restore from backup when needed

❌ NEVER forget retention on manual backups (cost trap)

# WRONG - manual backup with no retention (kept forever)
oci db autonomous-database-backup create \
  --autonomous-database-id $ADB_ID \
  --display-name "pre-upgrade-backup"
# Cost: $0.025/GB/month FOREVER

# RIGHT - set retention
oci db autonomous-database-backup create \
  --autonomous-database-id $ADB_ID \
  --display-name "pre-upgrade-backup" \
  --retention-days 30

Cost trap: 1TB manual backup Γ— $0.025/GB/month Γ— 12 months = $300/year waste

❌ NEVER use SELECT * in production queries

-- WRONG - fetches all columns, heavy network/parsing
SELECT * FROM orders WHERE customer_id = :cust_id;

-- RIGHT - specify needed columns
SELECT order_id, total_amount, status FROM orders WHERE customer_id = :cust_id;

Impact: 50-column table, fetching 5 needed columns
- SELECT *: 50 columns Γ— 1000 rows = 50k data points
- Explicit: 5 columns Γ— 1000 rows = 5k data points (90% reduction)

❌ NEVER ignore SQL_ID when debugging slow queries

-- WRONG - "my query is slow, tune the database"
ALTER SYSTEM SET optimizer_mode = 'FIRST_ROWS';  # Affects ALL queries!

-- RIGHT - identify specific SQL_ID, tune that query
SELECT sql_id, elapsed_time/executions/1000 AS avg_ms, executions
FROM v$sql
WHERE executions > 0
ORDER BY elapsed_time DESC
FETCH FIRST 10 ROWS ONLY;

Then tune specific SQL_ID (not entire database)

❌ NEVER use ROWNUM with ORDER BY (wrong results)

-- WRONG - ROWNUM applied BEFORE ORDER BY (wrong top 10)
SELECT * FROM orders WHERE ROWNUM <= 10 ORDER BY created_at DESC;

-- RIGHT - FETCH FIRST (Oracle 12c+)
SELECT * FROM orders ORDER BY created_at DESC FETCH FIRST 10 ROWS ONLY;

❌ NEVER scale auto-scaling ADB without checking current behavior

ADB Auto-Scaling Gotcha:
- Base ECPU: 2
- Auto-scaling: Scales 1-3x (2 β†’ 6 ECPU max)
- Cost: Charged for PEAK usage during period

# WRONG - enable auto-scaling then forget about it
Cost surprise: Base 2 ECPU ($526/month) β†’ Peak 6 ECPU ($1,578/month)

# RIGHT - set max ECPU limit in console
Max ECPU = 4 (2Γ— base, not 3Γ—)
Cost control: Peak 4 ECPU ($1,052/month) max

Performance Troubleshooting Decision Tree

"Queries are slow"?
β”‚
β”œβ”€ Is it ONE query or ALL queries?
β”‚  β”œβ”€ ONE query slow
β”‚  β”‚  └─ Get SQL_ID from v$sql (top by elapsed_time)
β”‚  β”‚     └─ Check execution plan:
β”‚  β”‚        SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR('&sql_id'));
β”‚  β”‚        β”œβ”€ Full table scan? β†’ Add index
β”‚  β”‚        β”œβ”€ Wrong join order? β†’ Use hints or SQL Plan Management
β”‚  β”‚        └─ Cartesian join? β†’ Fix query logic
β”‚  β”‚
β”‚  └─ ALL queries slow (system-wide)
β”‚     └─ Check wait events:
β”‚        SELECT event, time_waited_micro/1000000 AS wait_sec
β”‚        FROM v$system_event
β”‚        WHERE wait_class != 'Idle'
β”‚        ORDER BY time_waited_micro DESC
β”‚        FETCH FIRST 10 ROWS ONLY;
β”‚
β”‚        β”œβ”€ Top wait: 'CPU time' β†’ Optimize SQL OR scale ECPU
β”‚        β”œβ”€ Top wait: 'db file sequential read' β†’ Missing indexes
β”‚        β”œβ”€ Top wait: 'db file scattered read' β†’ Full table scans
β”‚        β”œβ”€ Top wait: 'log file sync' β†’ Too many commits (batch)
β”‚        └─ Top wait: 'User I/O' β†’ Scale storage IOPS or auto-scale
β”‚
└─ When did slowness start?
   β”œβ”€ After schema change? β†’ Gather stats (DBMS_STATS)
   β”œβ”€ After data load? β†’ Gather stats + check partitioning
   β”œβ”€ After version upgrade? β†’ Check execution plan changes
   └─ Gradual over time? β†’ Data growth, need indexing/partitioning

ADB Cost Calculations (Exact)

ECPU Scaling Cost

License-Included pricing: $0.36/ECPU-hour
BYOL pricing: $0.18/ECPU-hour (if you have Oracle licenses)

Monthly cost = ECPU count Γ— hourly rate Γ— 730 hours

Examples:
2 ECPU: 2 Γ— $0.36 Γ— 730 = $526/month
4 ECPU: 4 Γ— $0.36 Γ— 730 = $1,052/month
8 ECPU: 8 Γ— $0.36 Γ— 730 = $2,104/month

BYOL (50% off):
2 ECPU: 2 Γ— $0.18 Γ— 730 = $263/month
4 ECPU: 4 Γ— $0.18 Γ— 730 = $526/month

Storage Cost

Storage pricing: $0.025/GB/month (all tiers: Standard, Archive)

Examples:
1 TB: 1000 GB Γ— $0.025 = $25/month
5 TB: 5000 GB Γ— $0.025 = $125/month

CRITICAL: Storage charged even when ADB stopped!

Auto-Scaling Cost Impact

Scenario: Base 2 ECPU with auto-scaling enabled (1-3Γ—)

Without auto-scaling:
2 ECPU Γ— $0.36 Γ— 730 = $526/month (fixed)

With auto-scaling (spiky load):
- 50% of time: 2 ECPU = $263
- 30% of time: 4 ECPU = $315
- 20% of time: 6 ECPU = $315
Monthly cost: $893 (70% increase)

When auto-scaling makes sense:
- Spiky load (not sustained high)
- Want to avoid manual scaling
- Cost increase acceptable (up to 3Γ—)

SQL_ID Debugging Workflow

Step 1: Find problem SQL_ID

SELECT sql_id,
       elapsed_time/executions/1000 AS avg_ms,
       executions,
       sql_text
FROM v$sql
WHERE executions > 0
  AND last_active_time > SYSDATE - 1/24  -- Last hour
ORDER BY elapsed_time DESC
FETCH FIRST 10 ROWS ONLY;

Step 2: Get execution plan

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR('&sql_id'));

Step 3: Analyze plan issues
- TABLE ACCESS FULL on large table β†’ Missing index
- NESTED LOOPS with high cardinality β†’ Wrong join method
- HASH JOIN OUTER β†’ Consider index join

Step 4: Create SQL Tuning Task

DECLARE
  task_name VARCHAR2(30);
BEGIN
  task_name := DBMS_SQLTUNE.CREATE_TUNING_TASK(
    sql_id => '&sql_id',
    task_name => 'tune_slow_query'
  );
  DBMS_SQLTUNE.EXECUTE_TUNING_TASK(task_name);
END;
/

-- Get recommendations
SELECT DBMS_SQLTUNE.REPORT_TUNING_TASK('tune_slow_query') FROM DUAL;

Step 5: Implement fix
- Recommendation: Add index β†’ Create index
- Recommendation: Use hint β†’ Test with hint, then SQL Plan Baseline
- Recommendation: Gather stats β†’ EXEC DBMS_STATS.GATHER_TABLE_STATS

ADB-Specific Behaviors (OCI Gotchas)

Auto-Scaling Limits

Auto-scaling rules (cannot change):
- Minimum: 1Γ— base ECPU
- Maximum: 3Γ— base ECPU
- Scaling trigger: CPU > 80% for 5+ minutes
- Scale-down: CPU < 60% for 10+ minutes
- Time to scale: 5-10 minutes

Example: Base 2 ECPU
- Can scale: 2 β†’ 4 β†’ 6 ECPU
- Cannot scale: Beyond 6 ECPU (hard limit)
- Cost: Pay for peak usage each hour

ADMIN User Restrictions

In Autonomous Database, ADMIN user:
βœ“ Can: Create users, grant roles, DDL operations
βœ— Cannot: Create tablespaces (DATA is auto-managed)
βœ— Cannot: Modify SYSTEM/SYSAUX tablespaces
βœ— Cannot: Access OS (no shell, no file system)
βœ— Cannot: Use SYSDBA privileges (not available in ADB)

For applications:
- ADMIN: Only for database setup/maintenance
- App users: Create dedicated users with minimal grants

Service Name Performance Impact

ADB provides 3 service names per database:

| Service | CPU Allocation | Concurrency | Use For |
|---------|---------------|-------------|---------|
| HIGH | Dedicated OCPU | 1Γ— ECPU | Interactive queries, OLTP |
| MEDIUM | Shared OCPU | 2Γ— ECPU | Reporting, batch jobs |
| LOW | Most sharing | 3Γ— ECPU | Background tasks, ETL |

Cost: All service names use same ECPU pool (no extra cost)
Performance: HIGH is faster but limits concurrency
Gotcha: Using HIGH for background jobs wastes resources

Backup Retention (Automatic vs Manual)

Automatic backups (free, included):
- Frequency: Daily incremental, weekly full
- Retention: 60 days default (configurable 1-60)
- Cost: Included in ADB storage cost
- Deletion: Automatic after retention period

Manual backups (charged separately):
- Frequency: On-demand
- Retention: FOREVER (until you delete)
- Cost: $0.025/GB/month
- Deletion: Manual only

Cost trap: 10 manual backups Γ— 1TB Γ— $0.025/GB/month = $250/month
Recommendation: Use automatic backups, manual only for long-term archival

Version-Specific Features (Know Which ADB Version)

Feature 19c 21c 23ai 26ai When to Use
JSON Relational Duality - - βœ“ βœ“ Modern apps (REST + SQL)
AI Vector Search - - βœ“ βœ“ RAG, semantic search
JavaScript Stored Procs - - - βœ“ Node.js developers
SELECT AI - - βœ“ βœ“ Natural language β†’ SQL
Property Graphs - βœ“ βœ“ βœ“ Fraud detection, social
True Cache - - - βœ“ Read-heavy workloads
Blockchain Tables - βœ“ βœ“ βœ“ Immutable audit log

Upgrade path: 19c β†’ 21c β†’ 23ai β†’ 26ai
Downgrade: NOT supported (cannot go back)
Recommendation: Test in clone before upgrading production

Common ADB Errors Decoded

Error Message Actual Cause Solution
ORA-01017: invalid username/password Wallet password wrong OR expired credentials Re-download wallet, check password
ORA-12170: Connect timeout Network issue OR wrong service name Check NSG rules, verify tnsnames.ora
ORA-00604: error at recursive SQL level 1 Automated task failed (stats gather, space mgmt) Check DBA_SCHEDULER_JOB_RUN_DETAILS
ORA-30036: unable to extend segment Tablespace full (DATA auto-managed) ADB auto-extends, if error persists β†’ contact support
ORA-01031: insufficient privileges ADMIN user trying restricted operation Use ADMIN only for allowed operations (see restrictions)

Advanced Operations (Progressive Loading)

SQLcl Direct Database Access

WHEN TO LOAD sqlcl-workflows.md:
- Need to execute SQL queries directly via Bash
- Want to get execution plans, wait events, or active sessions
- Performing SQL tuning tasks (DBMS_SQLTUNE)
- Exporting/importing data with Data Pump
- Generating DDL for schema objects

Example: Finding top SQL by elapsed time

sql admin/password@adb_high <<EOF
SELECT sql_id, elapsed_time/executions/1000 AS avg_ms
FROM v\$sql WHERE executions > 0
ORDER BY elapsed_time DESC FETCH FIRST 10 ROWS ONLY;
EXIT;
EOF

Do NOT load for:
- Standard troubleshooting advice - covered in this skill's decision trees
- Cost calculations - exact formulas provided above
- Anti-patterns - NEVER list covers common mistakes


OCI CLI for ADB Management

WHEN TO LOAD oci-cli-adb.md:
- Need to provision, scale, or delete ADB instances
- Creating backups or clones (full vs metadata)
- Downloading wallet files
- Changing configuration (auto-scaling, license type, version upgrades)
- Batch operations across multiple ADBs

Example: Scale ADB from 2 to 4 ECPUs

oci db autonomous-database update \
  --autonomous-database-id ocid1.autonomousdatabase.oc1..xxx \
  --cpu-core-count 4 \
  --wait-for-state AVAILABLE

Example: Create metadata clone (70% cheaper - schema only, no data)

oci db autonomous-database create-from-clone \
  --source-id ocid1.autonomousdatabase.oc1..xxx \
  --display-name "dev-schema" \
  --db-name "DEVSCHEMA" \
  --clone-type METADATA \
  --wait-for-state AVAILABLE

Do NOT load for:
- SQL operations (use SQLcl instead)
- Performance analysis (v$sql queries covered in this skill)
- Cost formulas (exact calculations provided above)


OCI Autonomous Database Best Practices (Official Oracle Documentation)

WHEN TO LOAD oci-adb-best-practices.md:
- Need comprehensive ADB architecture and design patterns
- Understanding ADB workload types (ATP, ADW, APEX, JSON)
- Implementing production-grade ADB deployments
- Need official Oracle guidance on ADB features and limitations
- Planning migrations to ADB from on-premises Oracle

Do NOT load for:
- Quick SQL_ID debugging (workflow in this skill)
- Cost calculations (exact formulas above)
- Common gotchas (NEVER list covers them)


When to Use This Skill

  • Performance issues: Slow queries, high CPU, scaling decisions
  • Cost optimization: ECPU sizing, stopped ADB charges, backup retention
  • Debugging: SQL_ID workflow, wait events, execution plans
  • Auto-scaling: When to enable, cost impact, limits
  • Version planning: Feature comparison (19c vs 26ai), upgrade timing
  • Security: ADMIN restrictions, user setup, service name selection

# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents:

Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.