npx skills add acedergren/oci-agent-skills --skill "OCI Compute Management"
Install specific skill from multi-skill repository
# Description
Use when launching OCI compute instances, troubleshooting "out of capacity" or boot failures, optimizing compute costs, or handling instance lifecycle. Covers shape selection, capacity planning, service limits, and production incident resolution. Keywords: VM.Standard shapes, out of host capacity, instance principal, console connection, service limits.
# SKILL.md
name: OCI Compute Management
description: Use when launching OCI compute instances, troubleshooting "out of capacity" or boot failures, optimizing compute costs, or handling instance lifecycle. Covers shape selection, capacity planning, service limits, and production incident resolution. Keywords: VM.Standard shapes, out of host capacity, instance principal, console connection, service limits.
version: 2.0.0
OCI Compute Management - Expert Knowledge
Use OCI Landing Zone Terraform Modules
Don't reinvent the wheel. Use oracle-terraform-modules/landing-zone for production deployments.
Landing Zone solves:
- Bad Practice #5: Internet-wide open ports (0.0.0.0/0 on 22/3389)
- Bad Practice #9: Public compute instances (Security Zones enforce private IPs)
- Bad Practice #10: No monitoring (auto-configures alarms and notifications)
This skill provides: Anti-patterns and troubleshooting for compute resources deployed WITHIN a Landing Zone architecture.
⚠️ OCI CLI/API Knowledge Gap
You don't know OCI CLI commands or OCI API structure.
Your training data has limited and outdated knowledge of:
- OCI CLI syntax and parameters (updates monthly)
- OCI API endpoints and request/response formats
- Compute service CLI operations (oci compute instance)
- OCI service-specific commands and flags
- Latest OCI features and API changes
When OCI operations are needed:
1. Use exact CLI commands from this skill's references
2. Do NOT guess OCI CLI syntax or parameters
3. Do NOT assume API endpoint structures
4. Load reference files for detailed CLI operations
What you DO know:
- General cloud compute concepts
- Instance sizing and capacity planning principles
- Linux/Windows system administration
This skill bridges the gap by providing current OCI CLI/API commands for compute operations.
You are an OCI compute expert. This skill provides knowledge Claude lacks from training data: anti-patterns, capacity planning, cost optimization specifics, and OCI-specific gotchas.
NEVER Do This
NEVER launch instances without checking service limits first
oci limits resource-availability get \
--service-name compute \
--limit-name "standard-e4-core-count" \
--compartment-id <ocid> \
--availability-domain <ad>
87% of "out of capacity" errors are actually quota limits, not infrastructure capacity. Check limits BEFORE launching to get accurate error messages.
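As a rough sketch of acting on that check, the snippet below reads the JSON printed by the CLI and decides whether quota alone would block a launch. The `data.available` field and the sample payload are assumptions about the output layout, and `limit_blocks_launch` is a hypothetical helper, not part of the OCI CLI or SDK.

```python
import json


def limit_blocks_launch(cli_output: str, ocpus_needed: int) -> bool:
    """Return True when the remaining limit cannot cover the requested OCPUs."""
    payload = json.loads(cli_output)
    # Assumed layout: {"data": {"available": <int>, "used": <int>}}
    available = payload.get("data", {}).get("available") or 0
    return available < ocpus_needed


# Hypothetical sample output; real output may carry more fields.
sample = '{"data": {"available": 0, "used": 8}}'
print(limit_blocks_launch(sample, ocpus_needed=2))
```

If this returns True, the fix is a limit increase request rather than a retry in another AD.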
NEVER use console serial connection as primary access
- Creates security audit findings (bypasses SSH key controls)
- Use only for boot troubleshooting when SSH fails
- Delete connection immediately after troubleshooting
NEVER mix regional and AD-specific resources in templates
- Breaks portability when moving between regions
- Use AD-agnostic designs: spread via fault domains, not hardcoded ADs
NEVER use default security lists in production
- Default allows SSH (port 22) from 0.0.0.0/0 and all outbound traffic
- Fails security audits, creates compliance violations
- Always create custom security lists or NSGs
NEVER leave orphaned boot volumes behind in dev/test
# When terminating test instances, add:
oci compute instance terminate --instance-id <id> --preserve-boot-volume false
Orphaned boot volumes keep billing after the instance is gone: $50+/month per deleted instance.
NEVER enable public IP on production instances
- Use bastion service or private endpoints for access
- Cost impact: $500-5000+ per security incident from exposed instances
- Landing Zone Security Zones automatically block this pattern
Capacity Error Decision Tree
"Out of host capacity for shape X"?
│
├─ Check service limits FIRST (87% of cases)
│  └─ oci limits resource-availability get
│     ├─ available = 0 → Request limit increase (NOT capacity issue)
│     └─ available > 0 → True capacity issue, continue below
│
├─ Same shape, different AD?
│  └─ Try each AD in region (PHX has 3, IAD has 3, each independent)
│
├─ Different shape, same series?
│  ├─ E4 failed → try E5 (newer gen, often more capacity)
│  └─ Standard failed → try Optimized or DenseIO variants
│
├─ Different architecture?
│  └─ AMD → ARM (A1.Flex often has capacity when Intel/AMD full)
│
└─ All ADs exhausted?
   └─ Create capacity reservation (guarantees future launches)
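The same decision order can be sketched as a small function. This is only an illustration of the tree's ordering: `next_capacity_step` and its arguments are hypothetical, and the caller is assumed to list fallback shapes with same-series options first and other architectures last.

```python
from typing import List


def next_capacity_step(available: int, untried_ads: List[str],
                       untried_shapes: List[str]) -> str:
    """Walk the decision tree top to bottom and return the next action."""
    if available <= 0:
        # Quota problem, not a capacity problem.
        return "request limit increase"
    if untried_ads:
        return f"retry in {untried_ads[0]}"
    if untried_shapes:
        return f"retry with {untried_shapes[0]}"
    return "create capacity reservation"


print(next_capacity_step(4, [], ["VM.Standard.E5.Flex", "VM.Standard.A1.Flex"]))
```

Checking `available` first mirrors the tree: a zero limit makes every later retry pointless.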
Shape Selection: Cost vs Performance
Budget-Critical (save 50%):
- VM.Standard.A1.Flex (ARM) if app supports: $0.01/OCPU/hr vs $0.03 (AMD)
- Caveat: Not all software runs on ARM, test thoroughly
General Purpose (balanced):
- VM.Standard.E4.Flex: 2:16 CPU:RAM ratio, $0.03/OCPU/hr
- Start: 2 OCPUs, scale based on metrics (not guesses)
Memory-Intensive (databases, caches):
- VM.Standard.E4.Flex with custom ratio: up to 1:64 CPU:RAM
- Cost: $0.03/OCPU + $0.0015/GB RAM
Cost Trap: Fixed shapes (e.g., VM.Standard2.1) often MORE expensive than Flex with same resources. Always compare Flex pricing first.
Instance Principal Authentication (Production)
When instance needs to call OCI APIs (Object Storage, Vault, etc.):
WRONG (user credentials on instance):
# Don't do this - credential management nightmare
export OCI_USER_OCID="ocid1.user..."
RIGHT (instance principal):
# 1. Create dynamic group
oci iam dynamic-group create \
--name "app-instances" \
--matching-rule "instance.compartment.id = '<compartment-ocid>'"
# 2. Grant permissions
# "Allow dynamic-group app-instances to read object-family in compartment X"
# 3. Code uses instance principal (no credentials needed):
import oci
signer = oci.auth.signers.InstancePrincipalsSecurityTokenSigner()
client = oci.object_storage.ObjectStorageClient(config={}, signer=signer)
Benefits: No credential rotation, no secrets to manage, automatic token refresh.
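To keep the matching rule and policy text consistent across environments, the strings from steps 1 and 2 can be generated rather than typed by hand. The helpers below are hypothetical, the OCIDs are placeholders, and the `Any {...}` form for several compartments is an assumption about matching-rule syntax that should be checked against the IAM documentation.

```python
from typing import List


def matching_rule(compartment_ocids: List[str]) -> str:
    """Build a dynamic-group matching rule for one or more compartments."""
    clauses = [f"instance.compartment.id = '{ocid}'" for ocid in compartment_ocids]
    if len(clauses) == 1:
        return clauses[0]
    # Assumed syntax for "match any of these compartments".
    return "Any {" + ", ".join(clauses) + "}"


def read_objects_policy(group: str, compartment: str) -> str:
    """Build the policy statement from step 2 for a given group and compartment."""
    return (f"Allow dynamic-group {group} to read object-family "
            f"in compartment {compartment}")


print(matching_rule(["ocid1.compartment.oc1..example"]))
print(read_objects_policy("app-instances", "X"))
```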
OCI-Specific Gotchas
Availability Domain Names Are Tenant-Specific
- Your AD: "fMgC:US-ASHBURN-AD-1"
- Another tenant: "ErKW:US-ASHBURN-AD-1"
- MUST query your tenant: oci iam availability-domain list
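Because the prefix differs per tenant, templates can carry only the AD number and resolve the full name at run time. A minimal sketch, assuming the names have already been fetched with the list command above; `resolve_ad` is a hypothetical helper.

```python
from typing import List


def resolve_ad(ad_names: List[str], ad_number: int) -> str:
    """Return the tenant-specific AD name that ends in -AD-<ad_number>."""
    suffix = f"-AD-{ad_number}"
    for name in ad_names:
        if name.upper().endswith(suffix):
            return name
    raise LookupError(f"no availability domain ends with {suffix}")


# Example names in the format shown above.
names = ["fMgC:US-ASHBURN-AD-1", "fMgC:US-ASHBURN-AD-2", "fMgC:US-ASHBURN-AD-3"]
print(resolve_ad(names, 2))
```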
Boot Volume Backups Don't Include Instance Config
- Backup captures disk only, NOT shape/networking/metadata
- For DR: Use custom images (captures everything) or Terraform for infrastructure
Instance Metadata Service Has 2 Versions
- v1: http://169.254.169.254/opc/v1/ (legacy)
- v2: http://169.254.169.254/opc/v2/ (current, requires the header Authorization: Bearer Oracle)
- Always use v2 for security (mitigates SSRF attacks)
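A minimal sketch of a v2 call from Python's standard library. It assumes v2 endpoints expect an `Authorization: Bearer Oracle` header, and `imds_request` is a hypothetical helper. The request is only built here, since the endpoint is reachable only from inside an instance.

```python
import urllib.request

IMDS_V2_BASE = "http://169.254.169.254/opc/v2/"


def imds_request(path: str = "instance/") -> urllib.request.Request:
    """Build a metadata request carrying the header v2 is assumed to require."""
    return urllib.request.Request(
        IMDS_V2_BASE + path,
        headers={"Authorization": "Bearer Oracle"},
    )


req = imds_request()
print(req.full_url)
# On an instance, send it with: urllib.request.urlopen(req, timeout=2).read()
```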
Quick Cost Reference
| Shape Family | $/OCPU/hr | $/GB RAM/hr | Best For |
|---|---|---|---|
| A1.Flex (ARM) | $0.01 | $0.0015 | Cost-critical, ARM-compatible |
| E4.Flex (AMD) | $0.03 | $0.0015 | General purpose |
| E5.Flex (AMD) | $0.035 | $0.0015 | Latest gen, premium perf |
| Optimized3.Flex | $0.025 | $0.0015 | Network-intensive |
Free Tier: 2x AMD VM (1/8 OCPU, 1GB) + 4 ARM cores (24GB total) - always free
Calculation: (OCPUs × $0.03 + GB × $0.0015) × 730 hours/month
Example: 2 OCPU, 16GB = (2×$0.03 + 16×$0.0015) × 730 = $61.32/month
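The calculation can be wrapped in a small helper for comparing shapes. This sketch uses the rates from the table above as defaults; treat them as illustrative figures, not current list prices. `monthly_cost` is a hypothetical name.

```python
HOURS_PER_MONTH = 730


def monthly_cost(ocpus: float, memory_gb: float,
                 ocpu_rate: float = 0.03, gb_rate: float = 0.0015) -> float:
    """Estimated monthly cost in USD for a Flex shape, rounded to cents."""
    hourly = ocpus * ocpu_rate + memory_gb * gb_rate
    return round(hourly * HOURS_PER_MONTH, 2)


print(monthly_cost(2, 16))                  # E4.Flex example above: 61.32
print(monthly_cost(2, 16, ocpu_rate=0.01))  # same size at the A1.Flex rate
```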
Progressive Loading References
OCI Compute Shapes Reference (Official Oracle Documentation)
WHEN TO LOAD oci-compute-shapes-reference.md:
- Need detailed specifications for specific shapes (memory limits, OCPU counts, network bandwidth)
- Comparing flexible shapes (VM.Standard3.Flex vs E4.Flex vs E5.Flex vs E6.Flex vs A1/A2/A4.Flex)
- Understanding extended memory VM instances
- Researching bare metal shapes (BM.Standard3, BM.Standard.E4/E5/E6, BM.Standard.A1/A4)
- Checking GPU shapes, Dense I/O shapes, or HPC-optimized shapes
- Need official Oracle specifications for shape families
Do NOT load for:
- Quick cost comparisons (use Quick Cost Reference table in this skill)
- "Out of capacity" troubleshooting (decision tree in this skill covers it)
- Shape selection guidance (anti-patterns and recommendations in this skill)
When to Use This Skill
- Launching instances: shape selection, capacity planning
- "Out of capacity" errors: decision tree, limit checking
- Cost optimization: shape comparison, right-sizing
- Security: instance principal setup, console connection proper use
- Troubleshooting: boot failures, connectivity issues
- Production: anti-patterns, operational gotchas