langchain_patterns

by @DonggangChen in AI & LLM

# Install this skill:

npx skills add DonggangChen/antigravity-agentic-skills --skill "langchain_patterns"

Install specific skill from multi-skill repository

# Description

Implement Retrieval-Augmented Generation (RAG) systems with LangChain4j. Build document ingestion pipelines, embedding stores, vector search strategies, and knowledge-enhanced AI applications. Use when creating question-answering systems over document collections or AI assistants with external knowledge bases.

# SKILL.md

name: langchain_patterns
router_kit: AIKit
description: Implement Retrieval-Augmented Generation (RAG) systems with LangChain4j. Build document ingestion pipelines, embedding stores, vector search strategies, and knowledge-enhanced AI applications. Use when creating question-answering systems over document collections or AI assistants with external knowledge bases.
allowed-tools: Read, Write, Bash
category: ai-development
tags: [agents, algorithms, artificial intelligence, automation, chatbots, cognitive services, deep learning, document-ingestion, embedding, embeddings, frameworks, generative ai, inference, java, langchain patterns, langchain4j, large language models, llm, machine learning, model fine-tuning, natural language processing, neural networks, nlp, openai, prompt engineering, rag, retrieval augmented generation, retrieval-augmented-generation, tools, vector databases, vector-search, workflow automation]
version: 1.1.0
metadata:
skillport:
category: auto-healed
tags:
- prompt
- ai
- llm
- rag
- agent
- embedding
- model
- agent orchestration
- expert
- guide
- context engineering
- expert
- guide
- langchain patterns
- expert
- guide
- langchain_patterns

LangChain4j RAG Implementation Patterns

When to Use This Skill

Use this skill when:
- Building knowledge-based AI applications requiring external document access
- Implementing question-answering systems over large document collections
- Creating AI assistants with access to company knowledge bases
- Building semantic search capabilities for document repositories
- Implementing chat systems that reference specific information sources
- Creating AI applications requiring source attribution
- Building domain-specific AI systems with curated knowledge
- Implementing hybrid search combining vector similarity with traditional search
- Creating AI applications requiring real-time document updates
- Building multi-modal RAG systems with text, images, and other content types

Overview

Implement complete Retrieval-Augmented Generation (RAG) systems with LangChain4j. RAG enhances language models by providing relevant context from external knowledge sources, improving accuracy and reducing hallucinations.

Instructions

Initialize RAG Project

Create a new Spring Boot project with required dependencies:

pom.xml:

<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-spring-boot-starter</artifactId>
    <version>1.8.0</version>
</dependency>
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-open-ai</artifactId>
    <version>1.8.0</version>
</dependency>

Setup Document Ingestion

Configure document loading and processing:

@Configuration
public class RAGConfiguration {

    @Bean
    public EmbeddingModel embeddingModel() {
        return OpenAiEmbeddingModel.builder()
            .apiKey(System.getenv("OPENAI_API_KEY"))
            .modelName("text-embedding-3-small")
            .build();
    }

    @Bean
    public EmbeddingStore<TextSegment> embeddingStore() {
        return new InMemoryEmbeddingStore<>();
    }
}

Create document ingestion service:

@Service
@RequiredArgsConstructor
public class DocumentIngestionService {

    private final EmbeddingModel embeddingModel;
    private final EmbeddingStore<TextSegment> embeddingStore;

    public void ingestDocument(String filePath, Map<String, Object> metadata) {
        Document document = FileSystemDocumentLoader.loadDocument(filePath);
        document.metadata().putAll(metadata);

        DocumentSplitter splitter = DocumentSplitters.recursive(
            500, 50, new OpenAiTokenCountEstimator("text-embedding-3-small")
        );

        List<TextSegment> segments = splitter.split(document);
        List<Embedding> embeddings = embeddingModel.embedAll(segments).content();
        embeddingStore.addAll(embeddings, segments);
    }
}

Configure Content Retrieval

Setup content retrieval with filtering:

@Configuration
public class ContentRetrieverConfiguration {

    @Bean
    public ContentRetriever contentRetriever(
            EmbeddingStore<TextSegment> embeddingStore,
            EmbeddingModel embeddingModel) {

        return EmbeddingStoreContentRetriever.builder()
            .embeddingStore(embeddingStore)
            .embeddingModel(embeddingModel)
            .maxResults(5)
            .minScore(0.7)
            .build();
    }
}

Create RAG-Enabled AI Service

Define AI service with context retrieval:

interface KnowledgeAssistant {
    @SystemMessage("""
        You are a knowledgeable assistant with access to a comprehensive knowledge base.

        When answering questions:
        1. Use the provided context from the knowledge base
        2. If information is not in the context, clearly state this
        3. Provide accurate, helpful responses
        4. When possible, reference specific sources
        5. If the context is insufficient, ask for clarification
        """)
    String answerQuestion(String question);
}

@Service
@RequiredArgsConstructor
public class KnowledgeService {

    private final KnowledgeAssistant assistant;

    public KnowledgeService(ChatModel chatModel, ContentRetriever contentRetriever) {
        this.assistant = AiServices.builder(KnowledgeAssistant.class)
            .chatModel(chatModel)
            .contentRetriever(contentRetriever)
            .build();
    }

    public String answerQuestion(String question) {
        return assistant.answerQuestion(question);
    }
}

Examples

Basic Document Processing

public class BasicRAGExample {
    public static void main(String[] args) {
        var embeddingStore = new InMemoryEmbeddingStore<TextSegment>();

        var embeddingModel = OpenAiEmbeddingModel.builder()
            .apiKey(System.getenv("OPENAI_API_KEY"))
            .modelName("text-embedding-3-small")
            .build();

        var ingestor = EmbeddingStoreIngestor.builder()
            .embeddingModel(embeddingModel)
            .embeddingStore(embeddingStore)
            .build();

        ingestor.ingest(Document.from("Spring Boot is a framework for building Java applications with minimal configuration."));

        var retriever = EmbeddingStoreContentRetriever.builder()
            .embeddingStore(embeddingStore)
            .embeddingModel(embeddingModel)
            .build();
    }
}

Multi-Domain Assistant

interface MultiDomainAssistant {
    @SystemMessage("""
        You are an expert assistant with access to multiple knowledge domains:
        - Technical documentation
        - Company policies
        - Product information
        - Customer support guides

        Tailor your response based on the type of question and available context.
        Always indicate which domain the information comes from.
        """)
    String answerQuestion(@MemoryId String userId, String question);
}

Hierarchical RAG

@Service
@RequiredArgsConstructor
public class HierarchicalRAGService {

    private final EmbeddingStore<TextSegment> chunkStore;
    private final EmbeddingStore<TextSegment> summaryStore;
    private final EmbeddingModel embeddingModel;

    public String performHierarchicalRetrieval(String query) {
        List<EmbeddingMatch<TextSegment>> summaryMatches = searchSummaries(query);
        List<TextSegment> relevantChunks = new ArrayList<>();

        for (EmbeddingMatch<TextSegment> summaryMatch : summaryMatches) {
            String documentId = summaryMatch.embedded().metadata().getString("documentId");
            List<EmbeddingMatch<TextSegment>> chunkMatches = searchChunksInDocument(query, documentId);
            chunkMatches.stream()
                .map(EmbeddingMatch::embedded)
                .forEach(relevantChunks::add);
        }

        return generateResponseWithChunks(query, relevantChunks);
    }
}

Best Practices

Document Segmentation

Use recursive splitting with 500-1000 token chunks for most applications
Maintain 20-50 token overlap between chunks for context preservation
Consider document structure (headings, paragraphs) when splitting
Use token-aware splitters for optimal embedding generation

Metadata Strategy

Include rich metadata for filtering and attribution:
User and tenant identifiers for multi-tenancy
Document type and category classification
Creation and modification timestamps
Version and author information
Confidentiality and access level tags

Query Processing

Implement query preprocessing and cleaning
Consider query expansion for better recall
Apply dynamic filtering based on user context
Use re-ranking for improved result quality

Performance Optimization

Cache embeddings for repeated queries
Use batch embedding generation for bulk operations
Implement pagination for large result sets
Consider asynchronous processing for long operations

Common Patterns

Simple RAG Pipeline

@RequiredArgsConstructor
@Service
public class SimpleRAGPipeline {

    private final EmbeddingModel embeddingModel;
    private final EmbeddingStore<TextSegment> embeddingStore;
    private final ChatModel chatModel;

    public String answerQuestion(String question) {
        Embedding queryEmbedding = embeddingModel.embed(question).content();
        EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
            .queryEmbedding(queryEmbedding)
            .maxResults(3)
            .build();

        List<TextSegment> segments = embeddingStore.search(request).matches().stream()
            .map(EmbeddingMatch::embedded)
            .collect(Collectors.toList());

        String context = segments.stream()
            .map(TextSegment::text)
            .collect(Collectors.joining("\n\n"));

        return chatModel.generate(context + "\n\nQuestion: " + question + "\nAnswer:");
    }
}

Hybrid Search (Vector + Keyword)

@Service
@RequiredArgsConstructor
public class HybridSearchService {

    private final EmbeddingStore<TextSegment> vectorStore;
    private final FullTextSearchEngine keywordEngine;
    private final EmbeddingModel embeddingModel;

    public List<Content> hybridSearch(String query, int maxResults) {
        // Vector search
        List<Content> vectorResults = performVectorSearch(query, maxResults);

        // Keyword search
        List<Content> keywordResults = performKeywordSearch(query, maxResults);

        // Combine and re-rank using RRF algorithm
        return combineResults(vectorResults, keywordResults, maxResults);
    }
}

Troubleshooting

Common Issues

Poor Retrieval Results
- Check document chunk size and overlap settings
- Verify embedding model compatibility
- Ensure metadata filters are not too restrictive
- Consider adding re-ranking step

Slow Performance
- Use cached embeddings for frequent queries
- Optimize database indexing for vector stores
- Implement pagination for large datasets
- Consider async processing for bulk operations

High Memory Usage
- Use disk-based embedding stores for large datasets
- Implement proper pagination and filtering
- Clean up unused embeddings periodically
- Monitor and optimize chunk sizes

References

API Reference - Complete API documentation and interfaces
Examples - Production-ready examples and patterns
LangChain Patterns v1.2 - Enhanced

🔄 Workflow

Source: LangChain RAG Docs & Pinecone RAG Guide

Phase 1: Ingestion Pipeline

[ ] Chunking: Split using RecursiveCharacterTextSplitter maintaining semantic integrity (Overlap %10-20).
[ ] Embedding: Choose domain-appropriate model (OpenAI text-embedding-3 or local bge-m3).
[ ] Metadata: Add date, source, author info to vector (For filtering).

Phase 2: Retrieval Strategy

[ ] Hybrid Search: Combine Keyword (BM25) + Vector (Cosine) search (Reciprocal Rank Fusion).
[ ] Re-ranking: Re-rank top 50 results with Cross-Encoder (Cohere/BGE-Reranker).
[ ] Query Transformation: Enrich user question ("this", "that") based on conversation history (HyDE).

Phase 3: Generation & Evaluation

[ ] Prompting: Provide context with "Answer only based on this information" directive.
[ ] Eval: Measure "Context Relevance" and "Answer Faithfulness" with RAGAS or TruLens.

Checkpoints

Phase	Verification
1	Does vector database sync when document is updated?
2	Can it say "I don't know" to irrelevant questions?
3	Which line in the source document is the answer based on (Citation)?

# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents:

⚡ Amp 🚀 Antigravity 🤖 Claude Code 🦀 Clawdbot 📝 Codex ▶️ Cursor 🤖 Droid 💎 Gemini CLI 🐙 GitHub Copilot 🪿 Goose 📊 Kilo Code 🔧 Kiro CLI 💻 OpenCode 🦘 Roo Code 🌲 Trae 🏄 Windsurf

Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.