cm

Name: cm
Author: Dicklesworthstone


# Install this skill:

npx skills add Dicklesworthstone/agent_flywheel_clawdbot_skills_and_integrations --skill "cm"

Install specific skill from multi-skill repository

# Description

CASS Memory System - procedural memory for AI coding agents. Three-layer cognitive architecture with confidence decay, anti-pattern learning, cross-agent knowledge transfer, trauma guard safety system. Bun/TypeScript CLI.

# SKILL.md


CM - CASS Memory System

Procedural memory for AI coding agents. Transforms scattered sessions into persistent, cross-agent memory. Uses a three-layer cognitive architecture that mirrors human expertise development.

Why This Exists

AI coding agents accumulate valuable knowledge, but that knowledge is:
- Trapped in sessions - Context lost when session ends
- Agent-specific - Claude doesn't know what Cursor learned
- Unstructured - Raw logs aren't actionable guidance
- Subject to collapse - Naive summarization loses critical details

You've solved auth bugs three times this month across different agents. Each time you started from scratch.

CM solves this with cross-agent learning: a pattern discovered in Cursor is immediately available to Claude Code.

Three-Layer Cognitive Architecture

┌─────────────────────────────────────────────────────────────────────┐
│                    EPISODIC MEMORY (cass)                           │
│   Raw session logs from all agents — the "ground truth"             │
│   Claude Code │ Codex │ Cursor │ Aider │ PI │ Gemini │ ChatGPT │ ...│
└───────────────────────────┬─────────────────────────────────────────┘
                            │ cass search
                            ▼
┌─────────────────────────────────────────────────────────────────────┐
│                    WORKING MEMORY (Diary)                           │
│   Structured session summaries: accomplishments, decisions, etc.    │
└───────────────────────────┬─────────────────────────────────────────┘
                            │ reflect + curate (automated)
                            ▼
┌─────────────────────────────────────────────────────────────────────┐
│                    PROCEDURAL MEMORY (Playbook)                     │
│   Distilled rules with confidence tracking and decay                │
└─────────────────────────────────────────────────────────────────────┘

Every agent's sessions feed the shared memory. A pattern discovered in Cursor automatically helps Claude Code on the next session.

The One Command You Need

cm context "<your task>" --json

Run this before starting any non-trivial task. Returns:
- relevantBullets - Rules from playbook scored by task relevance
- antiPatterns - Things that have caused problems
- historySnippets - Past sessions (yours and other agents')
- suggestedCassQueries - Deeper investigation searches

Filtering History by Source

historySnippets[].origin.kind is "local" or "remote". Remote hits include origin.host:

{
  "historySnippets": [
    {
      "source_path": "~/.claude/sessions/session-001.jsonl",
      "origin": { "kind": "local" }
    },
    {
      "source_path": "/home/user/.codex/sessions/session.jsonl",
      "origin": { "kind": "remote", "host": "workstation" }
    }
  ]
}
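
As a hedged illustration, the snippet below separates local from remote history hits once the JSON has been parsed. The helper name `split_by_origin` is hypothetical (it is not part of cm), and treating a missing `origin.kind` as local is an assumption of this sketch.

```python
import json


def split_by_origin(response: dict) -> tuple[list[dict], list[dict]]:
    """Split historySnippets into (local, remote) lists by origin.kind."""
    local, remote = [], []
    for snippet in response.get("historySnippets", []):
        kind = snippet.get("origin", {}).get("kind")
        # Assumption: anything not explicitly marked "remote" counts as local.
        (remote if kind == "remote" else local).append(snippet)
    return local, remote


# Sample payload copied from the example above.
sample = json.loads("""
{
  "historySnippets": [
    {"source_path": "~/.claude/sessions/session-001.jsonl",
     "origin": {"kind": "local"}},
    {"source_path": "/home/user/.codex/sessions/session.jsonl",
     "origin": {"kind": "remote", "host": "workstation"}}
  ]
}
""")

local, remote = split_by_origin(sample)
remote_hosts = [s["origin"]["host"] for s in remote]
```

Here `remote_hosts` collects the `origin.host` value of every remote hit, which is handy when deciding how much to trust a snippet from another machine.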

Confidence Decay System

Rules aren't immortal. Confidence decays without revalidation:

Mechanism	Effect
90-day half-life	Confidence halves every 90 days without feedback
4x harmful multiplier	One mistake counts 4× as much as one success
Maturity progression	`candidate` → `established` → `proven`

Score Decay Visualization

Initial score: 10.0 (10 helpful marks today)

After 90 days (half-life):   5.0
After 180 days:              2.5
After 270 days:              1.25
After 365 days:              0.60

Effective Score Formula

effectiveScore = decayedHelpful - (4 × decayedHarmful)

// Where decay factor = 0.5 ^ (daysSinceFeedback / 90)
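
A minimal sketch of the formula above, assuming each feedback mark decays independently from its own age and that `decayedHelpful` and `decayedHarmful` are sums of those decayed marks. The function names are illustrative and are not cm's internals.

```python
def decay_factor(days_since_feedback: float, half_life_days: float = 90.0) -> float:
    """0.5 ^ (daysSinceFeedback / halfLife), as in the formula above."""
    return 0.5 ** (days_since_feedback / half_life_days)


def effective_score(
    helpful_ages: list[float],
    harmful_ages: list[float],
    half_life_days: float = 90.0,
    harmful_multiplier: float = 4.0,
) -> float:
    """decayedHelpful - (multiplier x decayedHarmful).

    Each list holds the age in days of one feedback mark (assumption:
    marks are decayed individually and then summed).
    """
    decayed_helpful = sum(decay_factor(d, half_life_days) for d in helpful_ages)
    decayed_harmful = sum(decay_factor(d, half_life_days) for d in harmful_ages)
    return decayed_helpful - harmful_multiplier * decayed_harmful


# Ten helpful marks, all given on the same day, observed at increasing ages.
for days in (0, 90, 180, 270, 365):
    print(days, round(effective_score([days] * 10, []), 2))
```

With the default multiplier, a single fresh harmful mark cancels four fresh helpful marks: `effective_score([0, 0, 0, 0], [0])` evaluates to zero.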

Maturity State Machine

  ┌──────────┐       ┌─────────────┐    ┌────────┐
  │ candidate│──────▶│ established │───▶│ proven │
  └──────────┘       └─────────────┘    └────────┘
       │                   │                  │
       │                   │ (harmful >25%)   │
       │                   ▼                  │
       │             ┌─────────────┐          │
       └────────────▶│ deprecated  │◀─────────┘
                     └─────────────┘

Transition Rules:

Transition	Criteria
`candidate` → `established`	3+ helpful, harmful ratio <25%
`established` → `proven`	10+ helpful, harmful ratio <10%
`any` → `deprecated`	Harmful ratio >25% OR explicit deprecation
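
The transition table can be read as a small state function. The sketch below is one possible reading, under two assumptions that the table does not spell out: the harmful ratio is `harmful / (helpful + harmful)`, and a rule advances one stage at a time. The function names are hypothetical.

```python
def harmful_ratio(helpful: int, harmful: int) -> float:
    """Share of all feedback marks that are harmful (assumed definition)."""
    total = helpful + harmful
    return harmful / total if total else 0.0


def next_maturity(
    current: str,
    helpful: int,
    harmful: int,
    explicitly_deprecated: bool = False,
) -> str:
    """Apply the transition rules from the table once."""
    ratio = harmful_ratio(helpful, harmful)
    # any -> deprecated: harmful ratio >25% OR explicit deprecation
    if explicitly_deprecated or ratio > 0.25:
        return "deprecated"
    # candidate -> established: 3+ helpful, harmful ratio <25%
    if current == "candidate" and helpful >= 3 and ratio < 0.25:
        return "established"
    # established -> proven: 10+ helpful, harmful ratio <10%
    if current == "established" and helpful >= 10 and ratio < 0.10:
        return "proven"
    # No rule fired: stay put (this sketch never demotes or un-deprecates).
    return current
```

Note the boundary case: a ratio of exactly 25% neither promotes a candidate (the table requires <25%) nor deprecates it (the table requires >25%).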

Anti-Pattern Learning

Bad rules don't just get deleted. They become warnings:

"Cache auth tokens for performance"
    ↓ (3 harmful marks)
"PITFALL: Don't cache auth tokens without expiry validation"

When a rule is marked harmful multiple times (>50% harmful ratio with 3+ marks), it's automatically inverted into an anti-pattern.
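
The inversion trigger can be sketched as a simple predicate. This assumes "3+ marks" refers to harmful marks (matching the "3 harmful marks" example above) and that the ratio is taken over all feedback; both are interpretations, not confirmed behavior.

```python
def should_invert(helpful: int, harmful: int) -> bool:
    """True when a rule qualifies for inversion into an anti-pattern.

    Assumed reading: at least 3 harmful marks AND harmful marks make up
    more than 50% of all feedback on the rule.
    """
    total = helpful + harmful
    if total == 0:
        return False
    return harmful >= 3 and (harmful / total) > 0.5
```

Under this reading a rule with 3 helpful and 3 harmful marks is not inverted, because the ratio is exactly 50% rather than above it.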

ACE Pipeline (How Rules Are Created)

Generator → Reflector → Validator → Curator

Stage	Role	LLM?
Generator	Pre-task context hydration (`cm context`)	No
Reflector	Extract patterns from sessions (`cm reflect`)	Yes
Validator	Evidence gate against cass history	Yes
Curator	Deterministic delta merge	No

Critical: The Curator deliberately uses no LLM, which prevents context collapse from iterative drift. LLMs propose patterns; deterministic logic manages them.

Scientific Validation

Before a rule joins your playbook, it's validated against cass history:

Proposed rule: "Always check token expiry before auth debugging"
    ↓
Evidence gate: Search cass for sessions where this applied
    ↓
Result: 5 sessions found, 4 successful outcomes → ACCEPT

Rules without historical evidence are flagged as candidates until proven.

Commands Reference

Context Retrieval (Primary Workflow)

# THE MAIN COMMAND - run before non-trivial tasks
cm context "implement user authentication" --json

# Limit results for token budget
cm context "fix bug" --json --limit 5 --no-history

# With workspace filter
cm context "refactor" --json --workspace /path/to/project

# Self-documenting explanation
cm quickstart --json

# System health
cm doctor --json
cm doctor --fix  # Auto-fix issues

# Find similar rules
cm similar "error handling best practices"

Playbook Management

cm playbook list                              # All rules
cm playbook get b-8f3a2c                      # Rule details
cm playbook add "Always run tests first"      # Add rule
cm playbook add --file rules.json             # Batch add from file
cm playbook add --file rules.json --session /path/session.jsonl  # Track source
cm playbook remove b-xyz --reason "Outdated"  # Remove
cm playbook export > backup.yaml              # Export
cm playbook import shared.yaml                # Import
cm playbook bootstrap react                   # Apply starter to existing playbook

cm top 10                                     # Top effective rules
cm stale --days 60                            # Rules without recent feedback
cm why b-8f3a2c                               # Rule provenance
cm stats --json                               # Playbook health metrics

Learning & Feedback

# Manual feedback
cm mark b-8f3a2c --helpful
cm mark b-xyz789 --harmful --reason "Caused regression"
cm undo b-xyz789                              # Revert feedback

# Session outcomes (positional: status, rules)
cm outcome success b-8f3a2c,b-def456
cm outcome failure b-x7k9p1 --summary "Auth approach failed"
cm outcome-apply                              # Apply to playbook

# Reflection (usually automated)
cm reflect --days 7 --json
cm reflect --session /path/to/session.jsonl   # Single session
cm reflect --workspace /path/to/project       # Project-specific

# Validation
cm validate "Always check null before dereferencing"

# Audit sessions against rules
cm audit --days 30

# Deprecate permanently
cm forget b-xyz789 --reason "Superseded by better pattern"

Onboarding (Agent-Native)

Zero-cost playbook building using your existing agent:

cm onboard status                             # Check progress
cm onboard gaps                               # Category gaps
cm onboard sample --fill-gaps                 # Prioritized sessions
cm onboard sample --agent claude --days 14    # Filter by agent/time
cm onboard sample --workspace /path/project   # Filter by workspace
cm onboard sample --include-processed         # Re-analyze sessions
cm onboard read /path/session.jsonl --template  # Rich context
cm onboard mark-done /path/session.jsonl      # Mark processed
cm onboard reset                              # Start fresh

Trauma Guard (Safety System)

cm trauma list                                # Active patterns
cm trauma add "DROP TABLE" --description "Mass deletion" --severity critical
cm trauma heal t-abc --reason "Intentional migration"
cm trauma remove t-abc
cm trauma scan --days 30                      # Scan for traumas
cm trauma import shared-traumas.yaml

cm guard --install                            # Claude Code hook
cm guard --git                                # Git pre-commit hook
cm guard --install --git                      # Both
cm guard --status                             # Check installation

System Commands

cm init                                       # Initialize
cm init --starter typescript                  # With template
cm init --force                               # Reinitialize (creates backup)
cm starters                                   # List templates
cm serve --port 3001                          # MCP server
cm usage                                      # LLM cost stats
cm privacy status                             # Privacy settings
cm privacy enable                             # Enable cross-agent enrichment
cm privacy disable                            # Disable enrichment
cm project --format agents.md                 # Export for AGENTS.md

Starter Playbooks

Starting with an empty playbook is daunting. Starters provide curated best practices:

cm starters                    # List available
cm init --starter typescript   # Initialize with starter
cm playbook bootstrap react    # Apply to existing playbook

Built-in Starters

Starter	Focus	Rules
general	Universal best practices	5
typescript	TypeScript/Node.js patterns	4
react	React/Next.js development	4
python	Python/FastAPI/Django	4
node	Node.js/Express services	4
rust	Rust service patterns	4

Custom Starters

Create YAML files in ~/.cass-memory/starters/:

# ~/.cass-memory/starters/django.yaml
name: django
description: Django web framework best practices
bullets:
  - content: "Always use Django's ORM for database operations"
    category: database
    maturity: established
    tags: [django, orm]

Inline Feedback (During Work)

Leave feedback in code comments. Parsed during reflection:

// [cass: helpful b-8f3a2c] - this rule saved me from a rabbit hole

// [cass: harmful b-x7k9p1] - this advice was wrong for our use case
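
For illustration, comments in this shape could be extracted with a regular expression like the one below. The exact grammar that cm's reflection step accepts is not documented here, so the pattern is inferred only from the two examples above, and `parse_inline_feedback` is a hypothetical helper.

```python
import re

# Inferred from the examples: "[cass: <helpful|harmful> <rule-id>] - <note>"
FEEDBACK_RE = re.compile(
    r"\[cass:\s*(helpful|harmful)\s+(b-[A-Za-z0-9]+)\]\s*(?:-\s*(.*))?"
)


def parse_inline_feedback(source: str) -> list[tuple[str, str, str]]:
    """Return (verdict, rule_id, note) for every feedback comment found."""
    results = []
    for line in source.splitlines():
        match = FEEDBACK_RE.search(line)
        if match:
            verdict, rule_id, note = match.groups()
            results.append((verdict, rule_id, (note or "").strip()))
    return results


example = """
// [cass: helpful b-8f3a2c] - this rule saved me from a rabbit hole
const x = 1;
// [cass: harmful b-x7k9p1] - this advice was wrong for our use case
"""
feedback = parse_inline_feedback(example)
```

Lines without a `[cass: ...]` marker, such as the `const x = 1;` line, are ignored.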

Agent Protocol

1. START:    cm context "<task>" --json
2. WORK:     Reference rule IDs when following them (e.g., "Following b-8f3a2c...")
3. FEEDBACK: Leave inline comments when rules help/hurt
4. END:      Just finish. Learning happens automatically.

You do NOT need to:
- Run cm reflect (automation handles this)
- Run cm mark manually (use inline comments)
- Manually add rules to the playbook

Gap Analysis Categories

Category	Keywords
`debugging`	error, fix, bug, trace, stack
`testing`	test, mock, assert, expect, jest
`architecture`	design, pattern, module, abstraction
`workflow`	task, CI/CD, deployment
`documentation`	comment, README, API doc
`integration`	API, HTTP, JSON, endpoint
`collaboration`	review, PR, team
`git`	branch, merge, commit
`security`	auth, token, encrypt, permission
`performance`	optimize, cache, profile

Category Status Thresholds:

Status	Rule Count	Priority
`critical`	0 rules	High
`underrepresented`	1-2 rules	Medium
`adequate`	3-10 rules	Low
`well-covered`	11+ rules	None
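
The thresholds above map directly onto a small lookup. The sketch below encodes the table as written; the `gaps` helper, which lists the categories most in need of rules, is a hypothetical convenience and not a cm command.

```python
def category_status(rule_count: int) -> tuple[str, str]:
    """Return (status, priority) for a category, per the thresholds table."""
    if rule_count <= 0:
        return ("critical", "High")
    if rule_count <= 2:
        return ("underrepresented", "Medium")
    if rule_count <= 10:
        return ("adequate", "Low")
    return ("well-covered", "None")


def gaps(rule_counts: dict[str, int]) -> list[str]:
    """Categories that still need rules, most urgent first."""
    urgency = {"critical": 0, "underrepresented": 1}
    flagged = []
    for name, count in rule_counts.items():
        status, _priority = category_status(count)
        if status in urgency:
            flagged.append((urgency[status], name))
    return [name for _, name in sorted(flagged)]


counts = {"testing": 4, "security": 0, "git": 2, "debugging": 12}
needs_attention = gaps(counts)
```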

Trauma Guard: Safety System

The "hot stove" principle—learn from past incidents and prevent recurrence.

How It Works

Session History              Trauma Registry              Runtime Guard
┌─────────────────┐         ┌─────────────────┐         ┌─────────────────┐
│ rm -rf /* (oops)│ ──────▶ │ Pattern: rm -rf │ ──────▶ │ BLOCKED: This   │
│ "sorry, I made  │  scan   │ Severity: FATAL │  hook   │ command matches │
│  a mistake..."  │         │ Session: abc123 │         │ a trauma pattern│
└─────────────────┘         └─────────────────┘         └─────────────────┘

Built-in Doom Patterns (20+)

Category	Examples
Filesystem	`rm -rf /`, `rm -rf ~`, recursive deletes
Database	`DROP DATABASE`, `TRUNCATE`, `DELETE FROM` without WHERE
Git	`git push --force` to main/master, `git reset --hard`
Infrastructure	`terraform destroy -auto-approve`, `kubectl delete namespace`
Cloud	`aws s3 rm --recursive`, destructive CloudFormation

Pattern Storage

Scope	Location	Purpose
Global	`~/.cass-memory/traumas.jsonl`	Personal patterns
Project	`.cass/traumas.jsonl`	Commit to repo for team

Pattern Lifecycle

- Active: Blocks matching commands
- Healed: Temporarily bypassed (with reason and timestamp)
- Deleted: Removed (can be re-added)
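
A rough sketch of how a runtime guard might check a command against the registry while respecting the lifecycle above. The `TraumaPattern` fields and the three regexes are assumptions for illustration; they are neither the schema of `traumas.jsonl` nor cm's built-in pattern list.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class TraumaPattern:
    regex: str
    severity: str
    status: str = "active"  # "active" blocks; "healed" is bypassed


# Hypothetical patterns, loosely modeled on the categories listed above.
PATTERNS = [
    TraumaPattern(r"\brm\s+-rf\s+(/|~)(\s|$)", "critical"),
    TraumaPattern(r"\bDROP\s+(TABLE|DATABASE)\b", "critical"),
    TraumaPattern(r"\bgit\s+push\s+.*--force\b", "critical", status="healed"),
]


def check_command(
    command: str, patterns: list[TraumaPattern] = PATTERNS
) -> Optional[TraumaPattern]:
    """Return the first active pattern the command matches, else None."""
    for pattern in patterns:
        if pattern.status != "active":
            continue  # healed patterns are temporarily bypassed
        if re.search(pattern.regex, command, flags=re.IGNORECASE):
            return pattern
    return None


blocked = check_command("rm -rf /")
allowed = check_command("rm -rf ./build")
```

In this sketch `git push --force origin main` passes the check only because its pattern is marked healed; switching the status back to `"active"` would block it again.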

MCP Server

Run as MCP server for agent integration:

# Local-only (recommended)
cm serve --port 3001

# With auth token (for non-loopback)
MCP_HTTP_TOKEN="<random>" cm serve --host 0.0.0.0 --port 3001

Tools Exposed

Tool	Purpose	Parameters
`cm_context`	Get rules + history	`task, limit?, history?, days?, workspace?`
`cm_feedback`	Record feedback	`bulletId, helpful?, harmful?, reason?`
`cm_outcome`	Record session outcome	`sessionId, outcome, rulesUsed?`
`memory_search`	Search playbook/cass	`query, scope?, limit?, days?`
`memory_reflect`	Trigger reflection	`days?, maxSessions?, dryRun?`

Resources Exposed

URI	Purpose
`cm://playbook`	Current playbook state
`cm://diary`	Recent diary entries
`cm://outcomes`	Session outcomes
`cm://stats`	Playbook health metrics

Client Configuration

Claude Code (~/.config/claude/mcp.json):

{
  "mcpServers": {
    "cm": {
      "command": "cm",
      "args": ["serve"]
    }
  }
}

Graceful Degradation

Condition	Behavior
No cass	Playbook-only scoring, no history snippets
No playbook	Empty playbook, commands still work
No LLM	Deterministic reflection, no semantic enhancement
Offline	Cached playbook + local diary

Output Format

All commands support --json for machine-readable output.

Design principle: stdout = JSON only; diagnostics go to stderr.

Success Response

{
  "success": true,
  "task": "fix the auth timeout bug",
  "relevantBullets": [
    {
      "id": "b-8f3a2c",
      "content": "Always check token expiry before auth debugging",
      "effectiveScore": 8.5,
      "maturity": "proven",
      "relevanceScore": 0.92,
      "reasoning": "Extracted from 5 successful sessions"
    }
  ],
  "antiPatterns": [...],
  "historySnippets": [...],
  "suggestedCassQueries": [...],
  "degraded": null
}

Error Response

{
  "success": false,
  "code": "PLAYBOOK_NOT_FOUND",
  "error": "Playbook file not found",
  "hint": "Run 'cm init' to create a new playbook",
  "retryable": false,
  "recovery": ["cm init", "cm doctor --fix"],
  "docs": "README.md#-troubleshooting"
}
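
A caller consuming these envelopes might branch on the `success` flag, as in the sketch below. The `CmError` class and `parse_cm_output` function are hypothetical wrappers, and the sample payloads are trimmed copies of the responses shown above.

```python
import json


class CmError(Exception):
    """Raised when a cm JSON response reports success = false."""

    def __init__(self, payload: dict):
        self.code = payload.get("code")
        self.hint = payload.get("hint")
        self.retryable = bool(payload.get("retryable", False))
        self.recovery = payload.get("recovery", [])
        super().__init__(payload.get("error", "unknown cm error"))


def parse_cm_output(stdout: str) -> dict:
    """Parse cm's stdout (JSON only, per the design principle above)."""
    payload = json.loads(stdout)
    if not payload.get("success", False):
        raise CmError(payload)
    return payload


OK_SAMPLE = json.dumps({"success": True, "task": "fix the auth timeout bug",
                        "relevantBullets": []})
ERROR_SAMPLE = json.dumps({
    "success": False,
    "code": "PLAYBOOK_NOT_FOUND",
    "error": "Playbook file not found",
    "hint": "Run 'cm init' to create a new playbook",
    "retryable": False,
    "recovery": ["cm init", "cm doctor --fix"],
})

try:
    parse_cm_output(ERROR_SAMPLE)
except CmError as exc:
    # Surface the suggested recovery commands to the operator.
    print(f"{exc.code}: {exc} (try: {', '.join(exc.recovery)})")
```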

Exit Codes

Code	Meaning
1	Internal error
2	User input/usage
3	Configuration
4	Filesystem
5	Network
6	cass error
7	LLM/provider error

Token Budget Management

Flag	Effect
`--limit N`	Cap number of rules
`--min-score N`	Only rules above threshold
`--no-history`	Skip historical snippets (faster)
`--json`	Structured output
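
When calling cm from a script, the flags above can be assembled into an argument list. The builder below is a hypothetical helper that uses only flags documented in this file; actually running the result (for example with `subprocess.run`) requires cm to be installed.

```python
from typing import Optional


def build_context_command(
    task: str,
    limit: Optional[int] = None,
    min_score: Optional[float] = None,
    no_history: bool = False,
    workspace: Optional[str] = None,
) -> list[str]:
    """Build an argv list for `cm context` with token-budget flags."""
    argv = ["cm", "context", task, "--json"]
    if limit is not None:
        argv += ["--limit", str(limit)]
    if min_score is not None:
        argv += ["--min-score", str(min_score)]
    if no_history:
        argv.append("--no-history")
    if workspace:
        argv += ["--workspace", workspace]
    return argv


# Equivalent to: cm context "fix bug" --json --limit 5 --no-history
argv = build_context_command("fix bug", limit=5, no_history=True)
```

Passing a list rather than a shell string avoids quoting problems when the task description contains spaces or quotes.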

Configuration

Config lives at ~/.cass-memory/config.json (global) and .cass/config.json (repo).

Precedence: CLI flags > Repo config > Global config > Defaults

Security: Repo config cannot override sensitive paths or user-level consent settings.
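
The precedence and security rule can be sketched as a layered merge. Two assumptions are made here and should be checked against cm's actual behavior: nested objects are merged key by key, and `crossAgent` stands in for the protected consent settings (the real list of protected keys is not given in this document).

```python
from copy import deepcopy

# Hypothetical: keys a repo-level config is not allowed to override.
PROTECTED_KEYS = {"crossAgent"}


def deep_merge(base: dict, override: dict) -> dict:
    """Return base with override applied; nested dicts merge key by key."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def resolve_config(defaults: dict, global_cfg: dict, repo_cfg: dict, cli_flags: dict) -> dict:
    """Defaults < global config < repo config < CLI flags."""
    # Drop protected keys from the repo layer before merging it in.
    safe_repo = {k: v for k, v in repo_cfg.items() if k not in PROTECTED_KEYS}
    result = deepcopy(defaults)
    for layer in (global_cfg, safe_repo, cli_flags):
        result = deep_merge(result, layer)
    return result


resolved = resolve_config(
    defaults={"scoring": {"decayHalfLifeDays": 90, "harmfulMultiplier": 4},
              "crossAgent": {"enabled": False}},
    global_cfg={"scoring": {"decayHalfLifeDays": 60}},
    repo_cfg={"scoring": {"harmfulMultiplier": 5}, "crossAgent": {"enabled": True}},
    cli_flags={"maxBulletsInContext": 20},
)
```

In this example the repo config changes `harmfulMultiplier` but its attempt to enable `crossAgent` is ignored.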

Key Options

{
  "provider": "anthropic",
  "model": "claude-sonnet-4-20250514",
  "budget": {
    "dailyLimit": 0.10,
    "monthlyLimit": 2.00
  },
  "scoring": {
    "decayHalfLifeDays": 90,
    "harmfulMultiplier": 4
  },
  "maxBulletsInContext": 50,
  "maxHistoryInContext": 10,
  "sessionLookbackDays": 7,
  "crossAgent": {
    "enabled": false,
    "consentGiven": false,
    "auditLog": true
  },
  "remoteCass": {
    "enabled": false,
    "hosts": [{"host": "workstation", "label": "work"}]
  },
  "semanticSearchEnabled": false,
  "embeddingModel": "Xenova/all-MiniLM-L6-v2",
  "dedupSimilarityThreshold": 0.85
}

Environment Variables

Variable	Purpose
`ANTHROPIC_API_KEY`	API key for Anthropic (Claude)
`OPENAI_API_KEY`	API key for OpenAI
`GOOGLE_GENERATIVE_AI_API_KEY`	API key for Google Gemini
`CASS_PATH`	Path to cass binary
`CASS_MEMORY_LLM`	Set to `none` for LLM-free mode
`MCP_HTTP_TOKEN`	Auth token for non-loopback MCP server

Data Locations

~/.cass-memory/                  # Global (user-level)
├── config.json                  # Configuration
├── playbook.yaml                # Personal playbook
├── diary/                       # Session summaries
├── outcomes/                    # Session outcomes
├── traumas.jsonl                # Trauma patterns
├── starters/                    # Custom starter playbooks
├── onboarding-state.json        # Onboarding progress
├── privacy-audit.jsonl          # Cross-agent audit trail
├── processed-sessions.jsonl     # Reflection progress
└── usage.jsonl                  # LLM cost tracking

.cass/                           # Project-level (in repo)
├── config.json                  # Project-specific overrides
├── playbook.yaml                # Project-specific rules
├── traumas.jsonl                # Project-specific patterns
└── blocked.yaml                 # Anti-patterns to block

Automating Reflection

Cron Job

# Daily at 2am
0 2 * * * /usr/local/bin/cm reflect --days 7 >> ~/.cass-memory/reflect.log 2>&1

Claude Code Hook

.claude/hooks.json:

{
  "post-session": ["cm reflect --days 1"]
}

Privacy & Security

Local-First Design

- All data stays on your machine
- No cloud sync, no telemetry
- Cross-agent enrichment is opt-in with explicit consent
- Audit log for enrichment events

Secret Sanitization

Before processing, content is sanitized:
- OpenAI/Anthropic/AWS/Google API keys
- GitHub tokens
- JWTs
- Passwords and secrets in config patterns

Privacy Controls

cm privacy status    # Check settings
cm privacy enable    # Enable cross-agent enrichment
cm privacy disable   # Disable enrichment

Performance Characteristics

Operation	Typical Latency
`cm context` (cached)	50-150ms
`cm context` (cold)	200-500ms
`cm context` (no cass)	30-80ms
`cm reflect` (1 session)	5-15s
`cm reflect` (5 sessions)	20-60s
`cm playbook list`	<50ms
`cm similar` (keyword)	20-50ms
`cm similar` (semantic)	100-300ms

LLM Cost Estimates

Operation	Typical Cost
Reflect (1 session)	$0.01-0.05
Reflect (7 days)	$0.05-0.20
Validate (1 rule)	$0.005-0.01

With the default budget ($0.10/day, $2.00/month) and the per-session cost above ($0.01-0.05), expect roughly 2-10 single-session reflections per day.
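
The budget arithmetic can be checked with a few lines. This is plain division over the documented defaults and cost ranges; how cm actually enforces the limits is not described here, so treat the numbers as estimates.

```python
def reflections_per_day(daily_limit: float, cost_per_session: float) -> float:
    """How many single-session reflections fit under the daily limit."""
    return daily_limit / cost_per_session


def monthly_daily_average(monthly_limit: float, days_in_month: int = 30) -> float:
    """Average daily spend that the monthly limit allows."""
    return monthly_limit / days_in_month


DAILY_LIMIT = 0.10    # budget.dailyLimit default
MONTHLY_LIMIT = 2.00  # budget.monthlyLimit default

cheap = reflections_per_day(DAILY_LIMIT, 0.01)   # low end of the cost range
costly = reflections_per_day(DAILY_LIMIT, 0.05)  # high end of the cost range
sustained = monthly_daily_average(MONTHLY_LIMIT)
```

Spending the full daily limit every day would exceed the monthly limit, since $2.00 spread over 30 days averages about $0.07 per day.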

Batch Rule Addition

After analyzing a session, add multiple rules at once:

# Create JSON file
cat > rules.json << 'EOF'
[
  {"content": "Always run tests before committing", "category": "testing"},
  {"content": "Check token expiry before auth debugging", "category": "debugging"},
  {"content": "AVOID: Mocking entire modules in tests", "category": "testing"}
]
EOF

# Add all rules
cm playbook add --file rules.json

# Track which session they came from
cm playbook add --file rules.json --session /path/to/session.jsonl

# Or pipe from stdin
echo '[{"content": "Rule", "category": "workflow"}]' | cm playbook add --file -

Template Output for Onboarding

--template provides rich context for rule extraction:

cm onboard read /path/to/session.jsonl --template --json

Returns:
- metadata: path, workspace, message count, topic hints
- context: related rules, playbook gaps, suggested focus
- extractionFormat: schema, categories, examples
- sessionContent: actual session data

Integration with CASS

CASS provides episodic memory (raw sessions).
CM extracts procedural memory (rules and playbooks).

# CASS: Search raw sessions
cass search "authentication timeout" --robot

# CM: Get distilled rules for a task
cm context "authentication timeout" --json

Troubleshooting

Error	Solution
`cass not found`	Install from cass repo
`cass search failed`	Run `cass index --full`
`API key missing`	Set `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, or `GOOGLE_GENERATIVE_AI_API_KEY`
`Playbook corrupt`	Run `cm doctor --fix`
`Budget exceeded`	Check `cm usage`, adjust limits

Diagnostic Commands

cm doctor --json           # System health
cm doctor --fix            # Auto-fix issues
cm usage                   # LLM budget status
cm stats --json            # Playbook health
cm why <bullet-id>         # Rule provenance

LLM-Free Mode

CASS_MEMORY_LLM=none cm context "task" --json

Installation

# One-liner (recommended)
curl -fsSL https://raw.githubusercontent.com/Dicklesworthstone/cass_memory_system/main/install.sh \
  | bash -s -- --easy-mode --verify

# Specific version
install.sh --version v0.2.2 --verify

# System-wide
install.sh --system --verify

# From source
git clone https://github.com/Dicklesworthstone/cass_memory_system.git
cd cass_memory_system
bun install && bun run build
sudo mv ./dist/cass-memory /usr/local/bin/cm

Integration with Flywheel

Tool	Integration
CASS	CM reads from cass episodic memory, writes procedural memory
NTM	Robot mode integrates with cm for context before agent work
Agent Mail	Rules can reference mail threads as provenance
BV	Task context enriched with relevant playbook rules

# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents:

⚡ Amp 🚀 Antigravity 🤖 Claude Code 🦀 Clawdbot 📝 Codex ▶️ Cursor 🤖 Droid 💎 Gemini CLI 🐙 GitHub Copilot 🪿 Goose 📊 Kilo Code 🔧 Kiro CLI 💻 OpenCode 🦘 Roo Code 🌲 Trae 🏄 Windsurf
