langcache

by @manvinder01 in AI & LLM

# Install this skill:

npx skills add manvinder01/openclaw-langcache --skill "langcache"

Install specific skill from multi-skill repository

# Description

This skill should be used when the user asks to "enable semantic caching", "cache LLM responses", "reduce API costs", "speed up AI responses", "configure LangCache", "search the semantic cache", "store responses in cache", or mentions Redis LangCache, semantic similarity caching, or LLM response caching. Provides integration with Redis LangCache managed service for semantic caching of prompts and responses.

# SKILL.md

name: langcache
description: This skill should be used when the user asks to "enable semantic caching", "cache LLM responses", "reduce API costs", "speed up AI responses", "configure LangCache", "search the semantic cache", "store responses in cache", or mentions Redis LangCache, semantic similarity caching, or LLM response caching. Provides integration with Redis LangCache managed service for semantic caching of prompts and responses.
version: 1.0.0
tools: Read, Bash, WebFetch

Redis LangCache Semantic Caching

This skill integrates Redis LangCache, a fully-managed semantic caching service, into OpenClaw workflows. LangCache stores LLM prompts and responses, returning cached results for semantically similar queries to reduce costs and latency.

Prerequisites

Before using LangCache, ensure the following environment variables are configured:

LANGCACHE_HOST=<your-langcache-host>
LANGCACHE_CACHE_ID=<your-cache-id>
LANGCACHE_API_KEY=<your-api-key>

Store these in ~/.openclaw/secrets.env or configure them in the OpenClaw settings.

Core Operations

Search for Cached Response

Before calling an LLM, check if a semantically similar response exists:

./scripts/langcache.sh search "What is semantic caching?"

With similarity threshold (0.0-1.0, higher = stricter match):

./scripts/langcache.sh search "What is semantic caching?" --threshold 0.95

With attribute filtering:

./scripts/langcache.sh search "What is semantic caching?" --attr "model=gpt-5"

Store New Response

After receiving an LLM response, cache it for future use:

./scripts/langcache.sh store "What is semantic caching?" "Semantic caching stores responses based on meaning similarity..."

With attributes for filtering/organization:

./scripts/langcache.sh store "prompt" "response" --attr "model=gpt-5" --attr "user_id=123"

Delete Cached Entries

By entry ID:

./scripts/langcache.sh delete --id "<entry-id>"

By attributes:

./scripts/langcache.sh delete --attr "user_id=123"

Flush Cache

Clear all entries (use with caution):

./scripts/langcache.sh flush

Integration Pattern

The recommended pattern for integrating LangCache into agent workflows:

1. Receive user prompt
2. Search LangCache for similar cached response
3. If cache hit (similarity >= threshold):
   - Return cached response immediately
   - Log cache hit for observability
4. If cache miss:
   - Call LLM API
   - Store prompt + response in LangCache
   - Return LLM response

Default Caching Policy

This policy is enforced automatically. All cache operations MUST respect these rules.

CACHEABLE (white-list)

Category	Examples	Threshold
Factual Q&A	"What is X?", "How does Y work?"	0.90
Definitions / docs / help text	API docs, command help, explanations	0.90
Command explanations	"What does `git rebase` do?"	0.92
Reusable reply templates	"polite no", "follow-up", "scheduling", "intro"	0.88
Style transforms	"make this warmer/shorter/firmer"	0.85
Generic communication scripts	negotiation templates, professional responses	0.88

NEVER CACHE (hard blocks)

These patterns are blocked at the code level - cache operations will refuse to store them.

Category	Patterns to Detect	Reason
Temporal info	today, tomorrow, this week, deadline, ETA, "in X minutes", appointments, schedules	Stale immediately
Credentials	API keys, tokens, passwords, OTP, 2FA codes, secrets	Security risk
Identifiers	phone numbers, emails, addresses, account IDs, order numbers, message IDs, chat IDs, JIDs	Privacy / PII
Personal context	names + relationships, private history, "who said what", specific conversations	Privacy / context-dependent

Detection Patterns

The following regex patterns trigger a hard block:

# Temporal
\b(today|tomorrow|tonight|yesterday)\b
\b(this|next|last)\s+(week|month|year|monday|tuesday|...)\b
\b(in\s+\d+\s+(minutes?|hours?|days?))\b
\b(deadline|eta|appointment|schedule[d]?)\b

# Credentials
\b(api[_-]?key|token|password|secret|otp|2fa)\b
\b(bearer|auth[orization]*)\s+\S+

# Identifiers
\b\d{10,}\b                          # phone numbers, long IDs
\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+   # emails
\b(order|account|message|chat)[_-]?id\b

# Personal context
\b(my\s+(wife|husband|partner|friend|boss|mom|dad|brother|sister))\b
\b(said\s+to\s+me|told\s+me|between\s+us)\b

Attribute Strategies

Use attributes to partition the cache:

model: LLM model used (useful when switching models)
category: factual, template, style, command
skill: Which skill generated the response
version: API or prompt version

Search Strategies

LangCache supports two search strategies:

semantic (default): Vector similarity matching
exact: Case-insensitive exact match

Combine both for hybrid search:

./scripts/langcache.sh search "prompt" --strategy "exact,semantic"

Observability

Monitor cache performance:
- Track hit/miss ratios
- Log similarity scores for hits
- Alert on high miss rates (may indicate threshold too high)
- Review stored entries periodically for relevance

References

API Reference - Complete REST API documentation
Best Practices - Optimization techniques

Examples

examples/basic-caching.sh - Simple cache workflow
examples/agent-integration.py - Python integration pattern

# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents:

⚡ Amp 🚀 Antigravity 🤖 Claude Code 🦀 Clawdbot 📝 Codex ▶️ Cursor 🤖 Droid 💎 Gemini CLI 🐙 GitHub Copilot 🪿 Goose 📊 Kilo Code 🔧 Kiro CLI 💻 OpenCode 🦘 Roo Code 🌲 Trae 🏄 Windsurf

Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.