rag-memory

by @Zaytas in Tools
# Install this skill:
npx skills add Zaytas/openclaw-skill-rag-memory

Or install a specific skill: npx add-skill https://github.com/Zaytas/openclaw-skill-rag-memory

# Description

RAG memory system with Qdrant vector search. Indexes workspace files, session transcripts, and AI-generated summaries for semantic retrieval. Use when OpenClaw needs semantic recall across workspace documents or session history, when setting up or maintaining a Qdrant-backed memory pipeline, or when enriching prompts with retrieved context.

# SKILL.md


name: rag-memory
description: RAG memory system with Qdrant vector search. Indexes workspace files, session transcripts, and AI-generated summaries for semantic retrieval. Use when OpenClaw needs semantic recall across workspace documents or session history, when setting up or maintaining a Qdrant-backed memory pipeline, or when enriching prompts with retrieved context.


RAG Memory

Use this skill when you need semantic retrieval from OpenClaw workspace files, session transcripts, or structured summaries.

Query workflow

  1. Ensure QDRANT_URL, EMBED_API_KEY (or GEMINI_API_KEY), and any path overrides are set.
  2. Prefer scripts/recall.mjs for operator-facing queries because it combines vector search with a markdown grep fallback.
  3. Use a focused natural-language query first.
  4. Add filters only when needed:
    • --limit N
    • --channel discord|signal|webchat
    • --source-type file|summary|transcript
    • --no-vector for grep-only fallback
    • --json for machine-readable output

Example:

node openclaw-rag-memory/scripts/recall.mjs "what changed in the deployment plan" --limit 5 --source-type summary
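
When another Node script needs to call the recall helper, it can help to build the argument list from an options object. The sketch below does that using only the flags listed above. The helper name `buildRecallArgs` and its option names are hypothetical and not part of the package.

```javascript
// Hypothetical helper (not shipped with the skill): turns an options object
// into the CLI arguments documented in the query workflow.
function buildRecallArgs(query, options = {}) {
  const args = ["openclaw-rag-memory/scripts/recall.mjs", query];
  if (options.limit !== undefined) args.push("--limit", String(options.limit));
  if (options.channel) args.push("--channel", options.channel);
  if (options.sourceType) args.push("--source-type", options.sourceType);
  if (options.noVector) args.push("--no-vector");
  if (options.json) args.push("--json");
  return args;
}

const recallArgs = buildRecallArgs("what changed in the deployment plan", {
  limit: 5,
  sourceType: "summary",
  json: true,
});

// Printed for inspection only; the query is not shell-quoted here.
console.log(["node", ...recallArgs].join(" "));
```

Passing the array to something like `execFile("node", recallArgs)` from `node:child_process` avoids shell quoting problems with multi-word queries.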

Indexing workflow

  • Use scripts/index-memory.mjs to embed markdown workspace content.
  • Use scripts/index-transcripts.mjs to embed session transcript chunks.
  • Use scripts/find-unsummarized.mjs and scripts/summarize-worker.mjs to prepare and process summary jobs.
  • Use scripts/embed-summaries.mjs to embed validated summary JSON.
  • Use scripts/enrich-prompt.mjs when a sub-agent prompt should include recalled context.

Resources

All bundled executables live in scripts/.

# README.md

OpenClaw RAG Memory

A portable RAG memory package for OpenClaw that indexes workspace files, session transcripts, and AI-generated session summaries into Qdrant for semantic retrieval.

What it is

This repository packages the memory scripts behind an OpenClaw RAG workflow. It is designed to live inside an OpenClaw workspace and provide searchable long-term context using Qdrant vector search plus Gemini embeddings.
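
To make the embedding half of that pairing concrete, here is a minimal sketch of a single embedding call against the public Gemini REST endpoint (`embedContent`). The function names are hypothetical, and the bundled scripts may batch, retry, or structure their requests differently.

```javascript
// Sketch: builds a Gemini embedContent request for one piece of text.
// `model` uses the same "models/..." form as the EMBED_MODEL setting.
function buildEmbedRequest(text, { model = "models/gemini-embedding-001", apiKey }) {
  return {
    url: `https://generativelanguage.googleapis.com/v1beta/${model}:embedContent`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json", "x-goog-api-key": apiKey },
      body: JSON.stringify({ model, content: { parts: [{ text }] } }),
    },
  };
}

// Sends the request with Node's native fetch and returns the vector
// (an array of numbers) from the response.
async function embedText(text, options) {
  const { url, init } = buildEmbedRequest(text, options);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`Embedding request failed: ${res.status}`);
  const data = await res.json();
  return data.embedding.values;
}
```

A call such as `embedText("release notes policy", { apiKey: process.env.EMBED_API_KEY })` would then resolve to the query vector passed to Qdrant.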

Architecture

The memory index is built from three layers:

  1. File chunks: Markdown files from the workspace are chunked, embedded, and stored as sourceType=file.
  2. Session summaries: AI-generated structured summaries are embedded as sourceType=summary.
  3. Transcript chunks: Session transcripts are chunked directly from OpenClaw session logs and stored as sourceType=transcript.

This gives you both broad semantic recall and precise conversational retrieval.
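
Because each layer is tagged with a sourceType, a search can be restricted to one layer with a Qdrant payload filter. The sketch below builds such a request for Qdrant's REST search endpoint. It assumes sourceType is stored as a payload field with that exact name, and `buildSearchRequest` is a hypothetical helper rather than code from the package.

```javascript
// Sketch: builds a Qdrant REST search request, optionally limited to one layer.
// Assumption: every point carries a `sourceType` payload field.
function buildSearchRequest(config, vector, { limit = 5, sourceType } = {}) {
  const body = { vector, limit, with_payload: true };
  if (sourceType) {
    body.filter = { must: [{ key: "sourceType", match: { value: sourceType } }] };
  }
  return {
    url: `${config.qdrantUrl}/collections/${config.collectionName}/points/search`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

const searchRequest = buildSearchRequest(
  { qdrantUrl: "http://localhost:6333", collectionName: "memory" },
  [0.1, 0.2, 0.3], // placeholder vector; real ones come from the embedding model
  { sourceType: "summary" }
);
console.log(searchRequest.url);
// To send it: const res = await fetch(searchRequest.url, searchRequest.init);
```

Omitting `sourceType` searches all three layers at once, which matches the broad recall case.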

Prerequisites

  • OpenClaw installed and writing session logs
  • A reachable Qdrant instance
  • A Gemini API key for embeddings
  • Optional: an Anthropic API key if you want to use summarize-sessions.mjs
  • Node.js 20+ with native fetch

Repository layout

openclaw-rag-memory/
├── README.md
├── SKILL.md
├── config.example.mjs
├── LICENSE
├── .gitignore
└── scripts/
    ├── recall.mjs
    ├── index-memory.mjs
    ├── index-transcripts.mjs
    ├── embed-summaries.mjs
    ├── summarize-worker.mjs
    ├── summarize-sessions.mjs
    ├── find-unsummarized.mjs
    ├── query-memory.mjs
    ├── enrich-prompt.mjs
    ├── validate-summaries.mjs
    ├── nightly-index.sh
    └── nightly-cron-prompt.md

Runtime state, logs, and generated inputs are written under runtime/ and are intentionally ignored by git.

Installation

  1. Copy this repository into your OpenClaw workspace, for example:
    cp -R openclaw-rag-memory ~/.openclaw/workspace/
  2. Review config.example.mjs and decide how you want to supply configuration.
  3. Export the required environment variables before running scripts:
    export QDRANT_URL=http://localhost:6333
    export COLLECTION_NAME=memory
    export EMBED_MODEL=models/gemini-embedding-001
    export EMBED_API_KEY=your_gemini_api_key
    export WORKSPACE=$HOME/.openclaw/workspace
    export OPENCLAW_HOME=$HOME/.openclaw
  4. Run the scripts from the workspace or call them by absolute path.

Configuration

The scripts read configuration from environment variables.

Core settings:

  • QDRANT_URL: Qdrant base URL. Default: http://localhost:6333
  • COLLECTION_NAME: Qdrant collection name. Default: memory
  • EMBED_MODEL: embedding model name. Default: models/gemini-embedding-001
  • EMBED_API_KEY: Gemini API key for embeddings
  • WORKSPACE: OpenClaw workspace path. Defaults to the parent of this repo
  • OPENCLAW_HOME: OpenClaw home directory. Defaults to the parent of WORKSPACE
  • AGENT_IDS: comma-separated agent ids to scan. Default: main,discord-bot
  • SUMMARY_API_KEY: optional summarization API key for summarize-sessions.mjs
  • SUMMARY_MODEL: summarization model id

The included config.example.mjs is a documented template showing the full set of tunable values used across the scripts.
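
As an illustration of how those defaults resolve, the sketch below reads the core settings from an environment object. It is not the loader used by the scripts: `loadConfig` and the camelCase property names are hypothetical, and WORKSPACE and OPENCLAW_HOME are left out because their defaults depend on where the repository is installed.

```javascript
// Sketch: resolves the documented core settings and their defaults.
function loadConfig(env = process.env) {
  const config = {
    qdrantUrl: env.QDRANT_URL || "http://localhost:6333",
    collectionName: env.COLLECTION_NAME || "memory",
    embedModel: env.EMBED_MODEL || "models/gemini-embedding-001",
    // SKILL.md lists GEMINI_API_KEY as an alternative to EMBED_API_KEY.
    embedApiKey: env.EMBED_API_KEY || env.GEMINI_API_KEY || null,
    agentIds: (env.AGENT_IDS || "main,discord-bot")
      .split(",")
      .map((id) => id.trim())
      .filter(Boolean),
  };
  if (!config.embedApiKey) {
    // Assumption: embedding cannot proceed without a key, so fail early.
    throw new Error("Set EMBED_API_KEY (or GEMINI_API_KEY) before embedding.");
  }
  return config;
}

console.log(loadConfig({ EMBED_API_KEY: "example-key", AGENT_IDS: "main, helper" }));
```

Passing the environment in as an argument keeps the function easy to exercise without touching real environment variables.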

Script reference

  • scripts/recall.mjs: unified recall helper that combines Qdrant vector search with local markdown grep fallback
  • scripts/index-memory.mjs: indexes workspace markdown files into Qdrant
  • scripts/index-transcripts.mjs: indexes OpenClaw session transcript chunks into Qdrant
  • scripts/embed-summaries.mjs: embeds structured summary JSON into Qdrant
  • scripts/summarize-worker.mjs: manages the summarize/validate/embed queue for prepared session inputs
  • scripts/summarize-sessions.mjs: legacy end-to-end summarizer that generates and embeds summaries directly
  • scripts/find-unsummarized.mjs: discovers eligible sessions and writes compact inputs for summarization
  • scripts/query-memory.mjs: CLI semantic query tool for raw Qdrant retrieval
  • scripts/enrich-prompt.mjs: augments a task prompt with relevant recalled context
  • scripts/validate-summaries.mjs: validates structured summary JSON before embedding
  • scripts/nightly-index.sh: deterministic nightly pipeline wrapper
  • scripts/nightly-cron-prompt.md: prompt text for a cron/automation agent that should only run deterministic indexing
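
The "vector search with grep fallback" idea behind recall.mjs can be sketched as follows. This is one plausible reading (use grep when vector search fails or returns nothing), not a description of the actual script. Both search functions are injected, and the in-memory `grepMarkdown` stands in for a real filesystem scan. All names are hypothetical.

```javascript
// Sketch: try vector search first, fall back to grep on error or zero hits.
async function recallWithFallback(query, { vectorSearch, grepSearch, limit = 5 }) {
  try {
    const hits = await vectorSearch(query, limit);
    if (hits.length > 0) return { mode: "vector", hits };
  } catch (err) {
    console.warn(`Vector search unavailable, falling back to grep: ${err.message}`);
  }
  return { mode: "grep", hits: grepSearch(query).slice(0, limit) };
}

// Toy case-insensitive grep over { path: markdownText } pairs.
function grepMarkdown(files, query) {
  const needle = query.toLowerCase();
  const hits = [];
  for (const [path, text] of Object.entries(files)) {
    text.split("\n").forEach((line, index) => {
      if (line.toLowerCase().includes(needle)) {
        hits.push({ path, line: index + 1, text: line });
      }
    });
  }
  return hits;
}

// Example: the vector backend is down, so the grep result is returned.
const notes = { "plan.md": "# Plan\nDeployment moves to Friday." };
recallWithFallback("deployment", {
  vectorSearch: async () => { throw new Error("connection refused"); },
  grepSearch: (q) => grepMarkdown(notes, q),
}).then((result) => console.log(result.mode, result.hits));
```

Injecting the two search functions keeps the fallback decision separate from how Qdrant or the filesystem is actually reached.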

Typical workflow

Initial indexing

node openclaw-rag-memory/scripts/index-memory.mjs --full
node openclaw-rag-memory/scripts/index-transcripts.mjs --full

Recall

node openclaw-rag-memory/scripts/recall.mjs "release notes policy" --limit 5

Summarization queue

node openclaw-rag-memory/scripts/find-unsummarized.mjs
node openclaw-rag-memory/scripts/summarize-worker.mjs --status

Nightly pipeline setup

Use the deterministic wrapper for unattended indexing:

bash openclaw-rag-memory/scripts/nightly-index.sh

The nightly wrapper:

  1. Indexes workspace markdown files
  2. Indexes transcript chunks
  3. Discovers sessions that need summarization input
  4. Writes a log under runtime/logs/

If you use an external cron or automation agent, pair it with scripts/nightly-cron-prompt.md so summarization remains a separate step.
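
The order of operations in the nightly wrapper can be sketched in Node as below. The mapping of steps to scripts is inferred from the script reference, and stopping at the first failure is an assumption. The real nightly-index.sh is a shell script and may behave differently. `runNightly` and the injected `run` function are hypothetical.

```javascript
// Sketch of the nightly order of operations; summarization is deliberately absent.
const NIGHTLY_STEPS = [
  { name: "index workspace markdown", script: "scripts/index-memory.mjs" },
  { name: "index transcript chunks", script: "scripts/index-transcripts.mjs" },
  { name: "discover unsummarized sessions", script: "scripts/find-unsummarized.mjs" },
];

// `run(script)` is injected and should return true on success; in practice it
// might wrap child_process.spawnSync. `log` could append to a file under runtime/logs/.
function runNightly(run, log = console.log) {
  for (const step of NIGHTLY_STEPS) {
    log(`[nightly] ${step.name}: ${step.script}`);
    if (!run(step.script)) {
      log(`[nightly] failed at ${step.script}, stopping`);
      return false;
    }
  }
  return true;
}

// Dry run: pretend every step succeeds and print the plan.
runNightly(() => true);
```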

License

MIT. See LICENSE.
