Install a specific skill from the multi-skill repository:

```shell
npx skills add cosmix/loom --skill "ci-cd"
```
# Description
Designs and implements CI/CD pipelines for automated testing, building, deployment, and security scanning across multiple platforms. Covers pipeline optimization, test integration, artifact management, and release automation.
# SKILL.md

```yaml
name: ci-cd
description: Designs and implements CI/CD pipelines for automated testing, building, deployment, and security scanning across multiple platforms. Covers pipeline optimization, test integration, artifact management, and release automation.
trigger-keywords: CI/CD, pipeline, workflow, GitHub Actions, GitLab CI, Jenkins, CircleCI, Travis CI, Azure Pipelines, build, deploy, deployment, release, artifact, stage, job, runner, action, automation, continuous integration, continuous deployment, continuous delivery, container registry, docker build, image push, canary deployment, blue-green deployment, rolling update, rollback, integration tests, smoke tests, security scanning, SAST, DAST, dependency scanning, secrets management, cache optimization, parallelization, monorepo CI, matrix build, self-hosted runner, ML pipeline, model training pipeline, model deployment
allowed-tools: Read, Grep, Glob, Edit, Write, Bash
```
# CI/CD

## Overview
This skill covers the complete lifecycle of CI/CD pipeline design, implementation, and optimization across platforms including GitHub Actions, GitLab CI, Jenkins, CircleCI, and cloud-native solutions. It encompasses automated testing integration, security scanning, artifact management, deployment strategies, and specialized pipelines for ML workloads.
## When to Use
- Implementing or migrating CI/CD pipelines
- Optimizing build and test execution times
- Integrating security scanning (SAST, DAST, dependency checks)
- Setting up deployment automation with rollback strategies
- Configuring test suites in CI environments
- Managing artifacts and container registries
- Implementing ML model training and deployment pipelines
- Troubleshooting pipeline failures and flakiness
## Instructions
1. Analyze Requirements
- Identify build and test requirements
- Determine deployment targets and environments
- Assess security scanning needs (SAST, DAST, secrets, dependencies)
- Plan environment promotion strategy (dev → staging → production)
- Define quality gates and approval workflows
- Identify test suite composition (unit, integration, E2E)
- Determine artifact storage and retention policies
2. Design Pipeline Architecture
- Structure stages logically with clear dependencies
- Optimize for speed through parallelization and caching
- Design fail-fast strategy (lint → unit tests → integration tests → build)
- Plan secret management and secure credential handling
- Define deployment strategies (rolling, blue-green, canary)
- Architect for rollback and recovery procedures
- Design matrix builds for multi-platform support
- Plan monorepo CI strategies if applicable
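The fail-fast stage graph described above can be sketched in GitHub Actions with `needs:` dependencies, so cheap checks gate expensive ones. Job and script names here are illustrative, not prescriptive:

```yaml
# Illustrative stage graph: lint → unit tests → integration tests → build.
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm run lint # fails fast before any tests run
  unit-tests:
    needs: lint # runs only if lint passes
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm run test:unit
  integration-tests:
    needs: unit-tests
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm run test:integration
  build:
    needs: [unit-tests, integration-tests] # builds only after all tests pass
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm run build
```

Independent jobs with no `needs:` edge between them run concurrently by default, which is where the parallelization wins come from.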
3. Implement Testing Integration
- Configure unit test execution with coverage reporting
- Set up integration tests with service dependencies (databases, APIs)
- Implement E2E/smoke tests for critical user journeys
- Configure test parallelization and sharding
- Integrate test result reporting (JUnit, TAP, JSON)
- Set up flaky test detection and quarantine
- Configure performance/load testing stages
- Implement visual regression testing if applicable
4. Implement Security Scanning
- Integrate SAST (static analysis) tools (SonarQube, CodeQL, Semgrep)
- Configure DAST (dynamic analysis) for deployed environments
- Set up dependency/vulnerability scanning (Dependabot, Snyk, Trivy)
- Implement container image scanning
- Configure secrets detection (GitGuardian, TruffleHog)
- Set up license compliance checking
- Define security gate thresholds and failure policies
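A security gate can be enforced by letting the scanner's exit code fail the job. This sketch assumes Trivy's `exit-code` and `ignore-unfixed` inputs; the severity threshold is an illustrative policy choice:

```yaml
- name: Enforce security gate
  uses: aquasecurity/trivy-action@master
  with:
    scan-type: "fs"
    scan-ref: "."
    severity: "CRITICAL,HIGH" # gate threshold: what counts as a blocking finding
    exit-code: "1"            # non-zero exit fails the pipeline when findings match
    ignore-unfixed: true      # optionally skip findings that have no available fix yet
```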
5. Implement Build and Artifact Management
- Configure dependency caching strategies
- Implement build output caching and layer caching (Docker)
- Set up artifact versioning and tagging
- Configure container registry integration
- Implement multi-stage builds for optimization
- Set up artifact signing and attestation
- Configure artifact retention and cleanup policies
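Artifact signing can be sketched with Sigstore's cosign. This assumes keyless signing via GitHub's OIDC token (the job needs `id-token: write` permission) and an image digest output from an earlier build step; the step id `build` is illustrative:

```yaml
- name: Install cosign
  uses: sigstore/cosign-installer@v3
- name: Sign container image (keyless, via GitHub OIDC)
  run: |
    # Sign by immutable digest, not by mutable tag.
    cosign sign --yes \
      ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}@${{ steps.build.outputs.digest }}
```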
6. Implement Deployment Automation
- Configure environment-specific deployments
- Implement deployment strategies (rolling, blue-green, canary)
- Set up health checks and readiness probes
- Configure smoke tests post-deployment
- Implement automated rollback on failure
- Set up deployment notifications (Slack, email, PagerDuty)
- Configure manual approval gates for production
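Automated rollback on a failed rollout can be sketched as a follow-up step gated on `if: failure()`; the deployment name and namespace are illustrative:

```yaml
- name: Deploy
  run: |
    kubectl set image deployment/app app=$IMAGE_TAG -n production
    kubectl rollout status deployment/app -n production --timeout=300s
- name: Roll back on failure
  if: failure() # runs only when a previous step in this job failed
  run: |
    kubectl rollout undo deployment/app -n production
    kubectl rollout status deployment/app -n production --timeout=300s
```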
7. Optimize Pipeline Performance
- Analyze pipeline execution times and bottlenecks
- Implement job parallelization for independent tasks
- Configure aggressive caching (dependencies, build outputs, Docker layers)
- Optimize test execution (parallel runners, test sharding)
- Use matrix builds efficiently
- Consider self-hosted runners for performance-critical workloads
- Implement conditional job execution (path filters, change detection)
8. Ensure Reliability and Observability
- Add retry logic for transient failures
- Implement comprehensive error handling
- Configure alerts for pipeline failures
- Set up metrics and dashboards for pipeline health
- Document runbooks and troubleshooting procedures
- Implement audit logging for deployments
- Configure SLO tracking for pipeline performance
## Best Practices

### Core Principles
- Fail Fast: Run cheap, fast checks first (lint, type check, unit tests)
- Parallelize Aggressively: Run independent jobs concurrently
- Cache Everything: Dependencies, build outputs, Docker layers
- Secure by Default: Secrets in vaults, least privilege, audit logs
- Environment Parity: Keep dev/staging/prod as similar as possible
- Immutable Artifacts: Build once, promote everywhere
- Automated Rollback: Every deployment must be reversible
- Idempotent Operations: Pipelines should be safely re-runnable
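"Build once, promote everywhere" can be sketched as re-tagging the already-built, SHA-addressed image rather than rebuilding it per environment. Registry, organization, and tag names below are illustrative:

```yaml
promote-to-production:
  runs-on: ubuntu-latest
  environment: production # manual approval gate lives here
  steps:
    - name: Promote staging image to production tag
      run: |
        # Re-tag the immutable image that already passed staging; never rebuild for prod.
        docker pull ghcr.io/myorg/myapp:${{ github.sha }}
        docker tag ghcr.io/myorg/myapp:${{ github.sha }} ghcr.io/myorg/myapp:production
        docker push ghcr.io/myorg/myapp:production
```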
### Testing in CI/CD
- Test Pyramid: More unit tests, fewer integration tests, minimal E2E
- Isolation: Tests should not depend on execution order
- Determinism: Eliminate flaky tests or quarantine them
- Fast Feedback: Unit tests < 5min, full suite < 15min target
- Coverage Gates: Enforce minimum coverage thresholds
- Service Mocking: Use test doubles for external dependencies
### Security
- Shift Left: Run security scans early in the pipeline
- Dependency Scanning: Check for CVEs in all dependencies
- Secrets Management: Never hardcode secrets, use secure vaults
- Least Privilege: Minimal permissions for pipeline runners
- Supply Chain Security: Verify and sign artifacts
- Audit Trail: Log all deployments and access
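In GitHub Actions, least privilege means declaring a minimal `permissions` block at the workflow or job level instead of relying on the default token scope. A sketch for a build-and-push job:

```yaml
permissions:
  contents: read  # checkout only; no write access to the repo
  packages: write # push images to GHCR
  id-token: write # OIDC federation to cloud providers instead of long-lived keys
```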
### Performance
- Incremental Builds: Only rebuild changed components
- Layer Caching: Optimize Dockerfile layer order
- Dependency Locking: Pin versions for reproducibility
- Resource Limits: Prevent resource exhaustion
- Path Filtering: Skip jobs when irrelevant files change
## Examples

### Example 1: GitHub Actions Workflow

```yaml
name: CI/CD Pipeline

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

env:
  NODE_VERSION: "20"
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}
          cache: "npm"
      - name: Install dependencies
        run: npm ci
      - name: Run linter
        run: npm run lint

  test:
    runs-on: ubuntu-latest
    needs: lint
    services:
      postgres:
        image: postgres:16
        env:
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: test
        ports:
          - 5432:5432
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}
          cache: "npm"
      - name: Install dependencies
        run: npm ci
      - name: Run tests
        run: npm test -- --coverage
        env:
          DATABASE_URL: postgresql://postgres:postgres@localhost:5432/test
      - name: Upload coverage
        uses: codecov/codecov-action@v3
        with:
          files: ./coverage/lcov.info

  build:
    runs-on: ubuntu-latest
    needs: test
    permissions:
      contents: read
      packages: write
    steps:
      - uses: actions/checkout@v4
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Log in to Container Registry
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          tags: |
            type=ref,event=branch
            type=ref,event=pr
            type=sha,prefix=
            type=raw,value=latest,enable=${{ github.ref == 'refs/heads/main' }}
      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

  deploy-staging:
    runs-on: ubuntu-latest
    needs: build
    if: github.ref == 'refs/heads/develop'
    environment: staging
    steps:
      - uses: actions/checkout@v4
      - name: Deploy to staging
        uses: azure/k8s-deploy@v4
        with:
          namespace: staging
          manifests: |
            k8s/deployment.yaml
            k8s/service.yaml
          images: |
            ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}

  deploy-production:
    runs-on: ubuntu-latest
    needs: build
    if: github.ref == 'refs/heads/main'
    environment: production
    steps:
      - uses: actions/checkout@v4
      - name: Deploy to production
        uses: azure/k8s-deploy@v4
        with:
          namespace: production
          manifests: |
            k8s/deployment.yaml
            k8s/service.yaml
          images: |
            ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
          strategy: canary
          percentage: 20
```
### Example 2: GitLab CI Pipeline

```yaml
stages:
  - validate
  - test
  - build
  - deploy

variables:
  DOCKER_TLS_CERTDIR: "/certs"
  IMAGE_TAG: $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA

.node-base:
  image: node:20-alpine
  cache:
    key: ${CI_COMMIT_REF_SLUG}
    paths:
      - node_modules/

lint:
  stage: validate
  extends: .node-base
  script:
    - npm ci
    - npm run lint
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
    - if: $CI_COMMIT_BRANCH == "main"

test:
  stage: test
  extends: .node-base
  services:
    - postgres:16
  variables:
    POSTGRES_DB: test
    POSTGRES_USER: runner
    POSTGRES_PASSWORD: runner
    DATABASE_URL: postgresql://runner:runner@postgres:5432/test
  script:
    - npm ci
    - npm test -- --coverage
  coverage: '/Lines\s*:\s*(\d+\.?\d*)%/'
  artifacts:
    reports:
      coverage_report:
        coverage_format: cobertura
        path: coverage/cobertura-coverage.xml
      junit: junit.xml

build:
  stage: build
  image: docker:24
  services:
    - docker:24-dind
  script:
    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
    - docker build -t $IMAGE_TAG .
    - docker push $IMAGE_TAG
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
    - if: $CI_COMMIT_BRANCH == "develop"

deploy-staging:
  stage: deploy
  image: bitnami/kubectl:latest
  script:
    - kubectl set image deployment/app app=$IMAGE_TAG -n staging
    - kubectl rollout status deployment/app -n staging --timeout=300s
  environment:
    name: staging
    url: https://staging.example.com
  rules:
    - if: $CI_COMMIT_BRANCH == "develop"

deploy-production:
  stage: deploy
  image: bitnami/kubectl:latest
  script:
    - kubectl set image deployment/app app=$IMAGE_TAG -n production
    - kubectl rollout status deployment/app -n production --timeout=300s
  environment:
    name: production
    url: https://example.com
  when: manual
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
```
### Example 3: Reusable Workflow (GitHub Actions)

```yaml
# .github/workflows/reusable-deploy.yml
name: Reusable Deploy Workflow

on:
  workflow_call:
    inputs:
      environment:
        required: true
        type: string
      image-tag:
        required: true
        type: string
    secrets:
      KUBE_CONFIG:
        required: true

jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: ${{ inputs.environment }}
    steps:
      - uses: actions/checkout@v4
      - name: Set up kubectl
        uses: azure/setup-kubectl@v3
      - name: Configure kubeconfig
        run: |
          mkdir -p ~/.kube
          echo "${{ secrets.KUBE_CONFIG }}" | base64 -d > ~/.kube/config
      - name: Deploy
        run: |
          kubectl set image deployment/app \
            app=${{ inputs.image-tag }} \
            -n ${{ inputs.environment }}
          kubectl rollout status deployment/app \
            -n ${{ inputs.environment }} \
            --timeout=300s
      - name: Verify deployment
        run: |
          kubectl get pods -n ${{ inputs.environment }} -l app=app
          kubectl logs -n ${{ inputs.environment }} -l app=app --tail=50
```
### Example 4: Security Scanning Pipeline

```yaml
name: Security Scanning

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]
  schedule:
    - cron: "0 0 * * 0" # Weekly scan

jobs:
  sast:
    name: Static Analysis (SAST)
    runs-on: ubuntu-latest
    permissions:
      security-events: write
      contents: read
    steps:
      - uses: actions/checkout@v4
      - name: Initialize CodeQL
        uses: github/codeql-action/init@v3
        with:
          languages: javascript, python
      - name: Autobuild
        uses: github/codeql-action/autobuild@v3
      - name: Perform CodeQL Analysis
        uses: github/codeql-action/analyze@v3
      - name: SonarCloud Scan
        uses: SonarSource/sonarcloud-github-action@master
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
        with:
          args: >
            -Dsonar.organization=myorg
            -Dsonar.projectKey=myproject
            -Dsonar.qualitygate.wait=true

  dependency-scan:
    name: Dependency Vulnerability Scan
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run Trivy vulnerability scanner
        uses: aquasecurity/trivy-action@master
        with:
          scan-type: "fs"
          scan-ref: "."
          format: "sarif"
          output: "trivy-results.sarif"
          severity: "CRITICAL,HIGH"
      - name: Upload Trivy results to GitHub Security
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: "trivy-results.sarif"
      - name: Snyk Security Scan
        uses: snyk/actions/node@master
        env:
          SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
        with:
          args: --severity-threshold=high

  secrets-scan:
    name: Secrets Detection
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0 # Full history for secret detection
      - name: TruffleHog Scan
        uses: trufflesecurity/trufflehog@main
        with:
          path: ./
          base: ${{ github.event.repository.default_branch }}
          head: HEAD
      - name: GitGuardian Scan
        uses: GitGuardian/ggshield-action@v1
        env:
          GITHUB_PUSH_BEFORE_SHA: ${{ github.event.before }}
          GITHUB_PUSH_BASE_SHA: ${{ github.event.base }}
          GITHUB_DEFAULT_BRANCH: ${{ github.event.repository.default_branch }}
          GITGUARDIAN_API_KEY: ${{ secrets.GITGUARDIAN_API_KEY }}

  container-scan:
    name: Container Image Scan
    runs-on: ubuntu-latest
    needs: [sast, dependency-scan]
    steps:
      - uses: actions/checkout@v4
      - name: Build image
        run: docker build -t myapp:${{ github.sha }} .
      - name: Scan image with Trivy
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: "myapp:${{ github.sha }}"
          format: "sarif"
          output: "trivy-image-results.sarif"
      - name: Scan image with Grype
        uses: anchore/scan-action@v3
        with:
          image: "myapp:${{ github.sha }}"
          fail-build: true
          severity-cutoff: high
```
### Example 5: Test Integration with Parallelization

```yaml
name: Test Suite

on: [push, pull_request]

jobs:
  unit-tests:
    name: Unit Tests
    runs-on: ${{ matrix.os }} # each matrix OS gets its own runner
    strategy:
      matrix:
        node-version: [18, 20, 22]
        os: [ubuntu-latest, macos-latest, windows-latest]
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node.js ${{ matrix.node-version }}
        uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node-version }}
          cache: "npm"
      - name: Install dependencies
        run: npm ci
      - name: Run unit tests
        run: npm run test:unit -- --coverage --maxWorkers=4
      - name: Upload coverage
        uses: codecov/codecov-action@v3
        with:
          files: ./coverage/coverage-final.json
          flags: unit-${{ matrix.os }}-node${{ matrix.node-version }}

  integration-tests:
    name: Integration Tests
    runs-on: ubuntu-latest
    strategy:
      matrix:
        shard: [1, 2, 3, 4]
    services:
      postgres:
        image: postgres:16
        env:
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: test
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
          - 5432:5432
      redis:
        image: redis:7
        options: >-
          --health-cmd "redis-cli ping"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
          - 6379:6379
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: "npm"
      - name: Install dependencies
        run: npm ci
      - name: Run integration tests (shard ${{ matrix.shard }}/4)
        run: npm run test:integration -- --shard=${{ matrix.shard }}/4
        env:
          DATABASE_URL: postgresql://postgres:postgres@localhost:5432/test
          REDIS_URL: redis://localhost:6379
      - name: Upload test results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: integration-test-results-${{ matrix.shard }}
          path: test-results/

  e2e-tests:
    name: E2E Tests
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: "npm"
      - name: Install dependencies
        run: npm ci
      - name: Install Playwright
        run: npx playwright install --with-deps
      - name: Build application
        run: npm run build
      - name: Run E2E tests
        run: npm run test:e2e
      - name: Upload Playwright report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: playwright-report
          path: playwright-report/

  test-report:
    name: Generate Test Report
    runs-on: ubuntu-latest
    needs: [unit-tests, integration-tests, e2e-tests]
    if: always()
    steps:
      - uses: actions/checkout@v4
      - name: Download all test results
        uses: actions/download-artifact@v4
        with:
          path: test-results/
      - name: Generate combined report
        run: |
          npm install -g junit-viewer
          junit-viewer --results=test-results/ --save=test-report.html
      - name: Upload combined report
        uses: actions/upload-artifact@v4
        with:
          name: combined-test-report
          path: test-report.html
```
### Example 6: ML Pipeline (Model Training & Deployment)

```yaml
name: ML Pipeline

on:
  push:
    branches: [main]
    paths:
      - "models/**"
      - "training/**"
      - "data/**"
  workflow_dispatch:
    inputs:
      model-version:
        description: "Model version to train"
        required: true
        type: string

jobs:
  data-validation:
    name: Validate Training Data
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"
          cache: "pip"
      - name: Install dependencies
        run: |
          pip install pandas great-expectations dvc
      - name: Pull data with DVC
        run: |
          dvc remote modify origin --local auth basic
          dvc remote modify origin --local user ${{ secrets.DVC_USER }}
          dvc remote modify origin --local password ${{ secrets.DVC_PASSWORD }}
          dvc pull
      - name: Validate data schema
        run: python scripts/validate_data.py
      - name: Run Great Expectations
        run: great_expectations checkpoint run training_data_checkpoint

  train-model:
    name: Train ML Model
    runs-on: ubuntu-latest
    needs: data-validation
    steps:
      - uses: actions/checkout@v4
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"
          cache: "pip"
      - name: Install dependencies
        run: |
          pip install -r requirements.txt
          pip install mlflow wandb
      - name: Configure MLflow
        run: |
          echo "MLFLOW_TRACKING_URI=${{ secrets.MLFLOW_TRACKING_URI }}" >> $GITHUB_ENV
          echo "MLFLOW_TRACKING_USERNAME=${{ secrets.MLFLOW_USERNAME }}" >> $GITHUB_ENV
          echo "MLFLOW_TRACKING_PASSWORD=${{ secrets.MLFLOW_PASSWORD }}" >> $GITHUB_ENV
      - name: Train model
        run: |
          python training/train.py \
            --experiment-name "prod-training" \
            --model-version ${{ inputs.model-version || github.sha }} \
            --config training/config.yaml
        env:
          WANDB_API_KEY: ${{ secrets.WANDB_API_KEY }}
      - name: Upload model artifact
        uses: actions/upload-artifact@v4
        with:
          name: trained-model
          path: models/output/

  evaluate-model:
    name: Evaluate Model Performance
    runs-on: ubuntu-latest
    needs: train-model
    steps:
      - uses: actions/checkout@v4
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"
          cache: "pip"
      - name: Install dependencies
        run: pip install -r requirements.txt
      - name: Download model
        uses: actions/download-artifact@v4
        with:
          name: trained-model
          path: models/output/
      - name: Run model evaluation
        run: python evaluation/evaluate.py --model-path models/output/
      - name: Check performance thresholds
        run: |
          python evaluation/check_metrics.py \
            --min-accuracy 0.85 \
            --min-f1 0.80
      - name: Generate model card
        run: python scripts/generate_model_card.py

  deploy-model:
    name: Deploy Model to Production
    runs-on: ubuntu-latest
    needs: evaluate-model
    if: github.ref == 'refs/heads/main'
    environment: production
    steps:
      - uses: actions/checkout@v4
      - name: Download model
        uses: actions/download-artifact@v4
        with:
          name: trained-model
          path: models/output/
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1
      - name: Upload model to S3
        run: |
          aws s3 cp models/output/model.pkl \
            s3://my-ml-models/prod/${{ github.sha }}/model.pkl
      - name: Deploy to SageMaker
        run: |
          python deployment/deploy_sagemaker.py \
            --model-uri s3://my-ml-models/prod/${{ github.sha }}/model.pkl \
            --endpoint-name prod-ml-endpoint \
            --instance-type ml.m5.large
      - name: Run smoke tests
        run: python deployment/smoke_test.py --endpoint prod-ml-endpoint
      - name: Update model registry
        run: |
          python scripts/register_model.py \
            --version ${{ github.sha }} \
            --stage production \
            --metadata models/output/metadata.json
```
### Example 7: Monorepo CI with Path Filtering

```yaml
name: Monorepo CI

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

jobs:
  detect-changes:
    name: Detect Changed Services
    runs-on: ubuntu-latest
    outputs:
      api: ${{ steps.filter.outputs.api }}
      web: ${{ steps.filter.outputs.web }}
      worker: ${{ steps.filter.outputs.worker }}
      shared: ${{ steps.filter.outputs.shared }}
    steps:
      - uses: actions/checkout@v4
      - uses: dorny/paths-filter@v3
        id: filter
        with:
          filters: |
            api:
              - 'services/api/**'
              - 'packages/shared/**'
            web:
              - 'services/web/**'
              - 'packages/shared/**'
            worker:
              - 'services/worker/**'
              - 'packages/shared/**'
            shared:
              - 'packages/shared/**'

  test-api:
    name: Test API Service
    needs: detect-changes
    if: needs.detect-changes.outputs.api == 'true'
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: services/api
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: "npm"
          cache-dependency-path: services/api/package-lock.json
      - name: Install dependencies
        run: npm ci
      - name: Run tests
        run: npm test

  test-web:
    name: Test Web Service
    needs: detect-changes
    if: needs.detect-changes.outputs.web == 'true'
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: services/web
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: "npm"
          cache-dependency-path: services/web/package-lock.json
      - name: Install dependencies
        run: npm ci
      - name: Run tests
        run: npm test
      - name: Build
        run: npm run build

  test-worker:
    name: Test Worker Service
    needs: detect-changes
    if: needs.detect-changes.outputs.worker == 'true'
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: services/worker
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: "npm"
          cache-dependency-path: services/worker/package-lock.json
      - name: Install dependencies
        run: npm ci
      - name: Run tests
        run: npm test

  build-and-deploy:
    name: Build and Deploy Changed Services
    needs: [detect-changes, test-api, test-web, test-worker]
    if: |
      always() &&
      (needs.test-api.result == 'success' || needs.test-api.result == 'skipped') &&
      (needs.test-web.result == 'success' || needs.test-web.result == 'skipped') &&
      (needs.test-worker.result == 'success' || needs.test-worker.result == 'skipped')
    runs-on: ubuntu-latest
    strategy:
      matrix:
        service:
          - name: api
            changed: ${{ needs.detect-changes.outputs.api == 'true' }}
          - name: web
            changed: ${{ needs.detect-changes.outputs.web == 'true' }}
          - name: worker
            changed: ${{ needs.detect-changes.outputs.worker == 'true' }}
    steps:
      - uses: actions/checkout@v4
        if: matrix.service.changed == 'true'
      - name: Build and push ${{ matrix.service.name }}
        if: matrix.service.changed == 'true'
        run: |
          docker build -t myapp-${{ matrix.service.name }}:${{ github.sha }} \
            services/${{ matrix.service.name }}
          docker push myapp-${{ matrix.service.name }}:${{ github.sha }}
```
### Example 8: Performance Optimization Pipeline

```yaml
name: Optimized CI Pipeline

on: [push, pull_request]

jobs:
  # Fast feedback jobs run first
  quick-checks:
    name: Quick Checks (< 2min)
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: "npm"
      - name: Cache node_modules
        uses: actions/cache@v4
        with:
          path: node_modules
          key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
          restore-keys: |
            ${{ runner.os }}-node-
      - name: Install dependencies
        run: npm ci --prefer-offline --no-audit
      - name: Parallel lint and type check
        run: |
          npm run lint &
          npm run type-check &
          wait

  unit-tests-fast:
    name: Unit Tests (Changed Files Only)
    runs-on: ubuntu-latest
    if: github.event_name == 'pull_request'
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0 # Need full history for changed files
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: "npm"
      - name: Install dependencies
        run: npm ci --prefer-offline
      - name: Get changed files
        id: changed-files
        run: |
          echo "files=$(git diff --name-only origin/main...HEAD | \
            grep -E '\.(ts|tsx|js|jsx)$' | \
            xargs -I {} echo '--findRelatedTests {}' | \
            tr '\n' ' ')" >> $GITHUB_OUTPUT
      - name: Run tests for changed files only
        if: steps.changed-files.outputs.files != ''
        run: npm test -- ${{ steps.changed-files.outputs.files }}

  build-with-cache:
    name: Build with Aggressive Caching
    runs-on: ubuntu-latest
    needs: quick-checks
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: "npm"
      - name: Cache build output
        uses: actions/cache@v4
        with:
          path: |
            .next/cache
            dist/
            build/
          key: ${{ runner.os }}-build-${{ hashFiles('**/*.ts', '**/*.tsx', '**/*.js') }}
          restore-keys: |
            ${{ runner.os }}-build-
      - name: Install dependencies
        run: npm ci --prefer-offline
      - name: Build
        run: npm run build
      - name: Upload build artifacts
        uses: actions/upload-artifact@v4
        with:
          name: build-output
          path: dist/
          retention-days: 7

  docker-build-optimized:
    name: Docker Build with Layer Caching
    runs-on: ubuntu-latest
    needs: quick-checks
    steps:
      - uses: actions/checkout@v4
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Build with cache
        uses: docker/build-push-action@v5
        with:
          context: .
          push: false
          tags: myapp:${{ github.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
          build-args: |
            BUILDKIT_INLINE_CACHE=1
```
## Pipeline Optimization Patterns

### Caching Strategy
- Dependency Caching: Cache `node_modules`, `vendor/`, `.m2/`, etc.
- Build Output Caching: Cache compiled artifacts between runs
- Docker Layer Caching: Use BuildKit cache mounts and GitHub Actions cache
- Incremental Builds: Only rebuild changed modules (Nx, Turborepo)
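Incremental monorepo builds can be sketched with Turborepo's affected-package filter, which selects only packages changed relative to a git ref. The task name `test` is illustrative:

```yaml
- name: Test only packages affected since main
  run: npx turbo run test --filter="...[origin/main]"
```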
### Parallelization Strategies
- Job-Level Parallelization: Run independent jobs concurrently
- Test Sharding: Split test suite across multiple runners
- Matrix Builds: Test multiple versions/platforms simultaneously
- Monorepo Path Filtering: Only test changed services
### Conditional Execution
- Path Filters: Skip jobs when irrelevant files change
- Changed Files Detection: Test only affected code
- Branch-Specific Jobs: Different pipelines for different branches
- Manual Triggers: Allow on-demand pipeline execution
## ML-Specific Patterns

### Model Training Pipeline
- Data Validation: Validate schema and quality before training
- Experiment Tracking: Log metrics to MLflow/W&B
- Model Versioning: Tag models with git SHA or semantic version
- Performance Gates: Enforce minimum accuracy/F1 thresholds
### Model Deployment
- A/B Testing: Deploy new model alongside existing
- Shadow Mode: Run new model without affecting production
- Canary Rollout: Gradually increase traffic to new model
- Automated Rollback: Revert on performance degradation
## Troubleshooting Guide

### Common Issues
- Flaky Tests: Implement retry logic, increase timeouts, fix race conditions
- Slow Pipelines: Profile execution times, add caching, parallelize
- Secrets Exposure: Use secret scanning, audit logs, rotate credentials
- Resource Exhaustion: Set resource limits, use cleanup actions
- Network Timeouts: Add retries, use artifact caching, increase timeouts
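Retry logic for transient failures can be factored into a small shell helper usable in any pipeline step; attempt counts and delays below are illustrative:

```shell
# retry <max_attempts> <delay_seconds> <command...>
# Re-runs the command until it succeeds or max_attempts is reached.
retry() {
  local max=$1 delay=$2
  shift 2
  local n=1
  until "$@"; do
    if [ "$n" -ge "$max" ]; then
      echo "retry: giving up after $n attempts" >&2
      return 1
    fi
    echo "retry: attempt $n failed; sleeping ${delay}s" >&2
    n=$((n + 1))
    sleep "$delay"
  done
}

# Example: retry a flaky command up to 3 times with a 1s pause between attempts
retry 3 1 true && echo "succeeded"
```

Pair retries with a timeout on the overall job so a persistently failing dependency still fails fast rather than hanging the pipeline.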
### Debugging Commands

```bash
# GitHub Actions local testing
act -j test --secret-file .env.secrets

# GitLab CI local testing
gitlab-runner exec docker test

# Jenkins pipeline validation
java -jar jenkins-cli.jar declarative-linter < Jenkinsfile

# Docker build debugging
DOCKER_BUILDKIT=1 docker build --progress=plain .

# Validate Kubernetes manifests with a dry run
kubectl apply --dry-run=client -f k8s/

# Validate workflow syntax
actionlint .github/workflows/*.yml
```
# Supported AI Coding Agents

This skill follows the SKILL.md standard and works with all major AI coding agents.
Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.