search

by @cosmix in AI & LLM

# Install this skill:

npx skills add cosmix/loom --skill "search"

Install specific skill from multi-skill repository

# Description

Full-text search and search engine implementation. Use when implementing search functionality, autocomplete, faceted search, relevance tuning, or working with search indexes. Keywords: search, full-text search, Elasticsearch, OpenSearch, Meilisearch, Typesense, fuzzy search, autocomplete, faceted search, facets, inverted index, relevance, ranking, scoring, tokenizer, analyzer, search-as-you-type, aggregations, synonyms, indexing, query, filtering, highlighting, search UI, typeahead, suggestions.

# SKILL.md

name: search
description: Full-text search and search engine implementation. Use when implementing search functionality, autocomplete, faceted search, relevance tuning, or working with search indexes. Keywords: search, full-text search, Elasticsearch, OpenSearch, Meilisearch, Typesense, fuzzy search, autocomplete, faceted search, facets, inverted index, relevance, ranking, scoring, tokenizer, analyzer, search-as-you-type, aggregations, synonyms, indexing, query, filtering, highlighting, search UI, typeahead, suggestions.

Search

Overview

Search functionality is a critical component of modern applications, enabling users to find relevant content quickly. This skill covers Elasticsearch fundamentals, full-text search patterns, indexing strategies, and advanced features like faceted search and autocomplete.

Key Concepts

Elasticsearch Fundamentals

Elasticsearch is a distributed search and analytics engine built on Apache Lucene.

Core Components:

Index: A collection of documents with similar characteristics
Document: A JSON object that is indexed and searchable
Mapping: Schema definition for documents in an index
Shard: A subdivision of an index for horizontal scaling
Replica: Copy of a shard for redundancy and read scaling

Basic Index Operations:

// Create an index with settings
PUT /products
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 2,
    "analysis": {
      "analyzer": {
        "custom_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "snowball"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": { "type": "text", "analyzer": "custom_analyzer" },
      "description": { "type": "text" },
      "price": { "type": "float" },
      "category": { "type": "keyword" },
      "created_at": { "type": "date" }
    }
  }
}

Full-Text Search Patterns

Match Query - Standard full-text search:

GET /products/_search
{
  "query": {
    "match": {
      "description": {
        "query": "wireless bluetooth headphones",
        "operator": "and",
        "fuzziness": "AUTO"
      }
    }
  }
}

Multi-Match Query - Search across multiple fields:

GET /products/_search
{
  "query": {
    "multi_match": {
      "query": "wireless headphones",
      "fields": ["name^3", "description", "category^2"],
      "type": "best_fields",
      "tie_breaker": 0.3
    }
  }
}

Bool Query - Combine multiple conditions:

GET /products/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "name": "headphones" } }
      ],
      "filter": [
        { "range": { "price": { "gte": 50, "lte": 200 } } },
        { "term": { "category": "electronics" } }
      ],
      "should": [
        { "match": { "description": "noise cancelling" } }
      ],
      "must_not": [
        { "term": { "status": "discontinued" } }
      ]
    }
  }
}

Indexing Strategies

Bulk Indexing:

POST /_bulk
{ "index": { "_index": "products", "_id": "1" } }
{ "name": "Wireless Headphones", "price": 99.99 }
{ "index": { "_index": "products", "_id": "2" } }
{ "name": "Bluetooth Speaker", "price": 49.99 }

Index Aliases - Zero-downtime reindexing:

// Create alias
POST /_aliases
{
  "actions": [
    { "add": { "index": "products_v2", "alias": "products" } },
    { "remove": { "index": "products_v1", "alias": "products" } }
  ]
}

Relevance Tuning and Boosting

Field Boosting:

GET /products/_search
{
  "query": {
    "multi_match": {
      "query": "headphones",
      "fields": ["name^5", "description^2", "tags"]
    }
  }
}

Function Score - Custom scoring:

GET /products/_search
{
  "query": {
    "function_score": {
      "query": { "match": { "name": "headphones" } },
      "functions": [
        {
          "filter": { "term": { "featured": true } },
          "weight": 2
        },
        {
          "field_value_factor": {
            "field": "popularity",
            "factor": 1.2,
            "modifier": "sqrt"
          }
        },
        {
          "gauss": {
            "created_at": {
              "origin": "now",
              "scale": "30d",
              "decay": 0.5
            }
          }
        }
      ],
      "score_mode": "multiply",
      "boost_mode": "multiply"
    }
  }
}

Faceted Search and Aggregations

Terms Aggregation - Category facets:

GET /products/_search
{
  "size": 0,
  "aggs": {
    "categories": {
      "terms": { "field": "category", "size": 10 }
    },
    "price_ranges": {
      "range": {
        "field": "price",
        "ranges": [
          { "to": 50, "key": "budget" },
          { "from": 50, "to": 100, "key": "mid-range" },
          { "from": 100, "key": "premium" }
        ]
      }
    },
    "avg_price": {
      "avg": { "field": "price" }
    }
  }
}

Nested Aggregations:

GET /products/_search
{
  "aggs": {
    "categories": {
      "terms": { "field": "category" },
      "aggs": {
        "avg_price": { "avg": { "field": "price" } },
        "top_products": {
          "top_hits": { "size": 3, "_source": ["name", "price"] }
        }
      }
    }
  }
}

Search-as-You-Type and Autocomplete

Completion Suggester Setup:

PUT /products
{
  "mappings": {
    "properties": {
      "name_suggest": {
        "type": "completion",
        "contexts": [
          { "name": "category", "type": "category" }
        ]
      }
    }
  }
}

Autocomplete Query:

GET /products/_search
{
  "suggest": {
    "product_suggest": {
      "prefix": "wire",
      "completion": {
        "field": "name_suggest",
        "size": 5,
        "fuzzy": { "fuzziness": 1 },
        "contexts": {
          "category": ["electronics"]
        }
      }
    }
  }
}

Edge N-gram Analyzer - Alternative approach:

PUT /products
{
  "settings": {
    "analysis": {
      "filter": {
        "edge_ngram_filter": {
          "type": "edge_ngram",
          "min_gram": 2,
          "max_gram": 20
        }
      },
      "analyzer": {
        "autocomplete": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "edge_ngram_filter"]
        },
        "autocomplete_search": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "autocomplete",
        "search_analyzer": "autocomplete_search"
      }
    }
  }
}

Synonyms and Analyzers

Synonym Configuration:

PUT /products
{
  "settings": {
    "analysis": {
      "filter": {
        "synonym_filter": {
          "type": "synonym",
          "synonyms": [
            "laptop, notebook, portable computer",
            "phone, mobile, cellphone, smartphone",
            "tv, television, telly"
          ]
        },
        "synonym_graph_filter": {
          "type": "synonym_graph",
          "synonyms_path": "synonyms.txt"
        }
      },
      "analyzer": {
        "synonym_analyzer": {
          "tokenizer": "standard",
          "filter": ["lowercase", "synonym_filter"]
        }
      }
    }
  }
}

Custom Analyzer with Multiple Filters:

PUT /products
{
  "settings": {
    "analysis": {
      "char_filter": {
        "html_strip": { "type": "html_strip" }
      },
      "filter": {
        "english_stop": { "type": "stop", "stopwords": "_english_" },
        "english_stemmer": { "type": "stemmer", "language": "english" }
      },
      "analyzer": {
        "english_analyzer": {
          "type": "custom",
          "char_filter": ["html_strip"],
          "tokenizer": "standard",
          "filter": ["lowercase", "english_stop", "english_stemmer"]
        }
      }
    }
  }
}

Elasticsearch Patterns

Connection Management:

// Singleton client pattern
class ElasticsearchClient {
  static instance = null;

  static getInstance() {
    if (!this.instance) {
      this.instance = new Client({
        node: process.env.ES_URL,
        auth: {
          apiKey: process.env.ES_API_KEY,
        },
        maxRetries: 3,
        requestTimeout: 30000,
      });
    }
    return this.instance;
  }
}

Index Templates - Consistent mappings across time-series indices:

PUT /_index_template/logs_template
{
  "index_patterns": ["logs-*"],
  "template": {
    "settings": {
      "number_of_shards": 1,
      "number_of_replicas": 1
    },
    "mappings": {
      "properties": {
        "@timestamp": { "type": "date" },
        "message": { "type": "text" },
        "level": { "type": "keyword" }
      }
    }
  }
}

Reindexing Pattern - Schema migrations:

POST /_reindex
{
  "source": { "index": "products_v1" },
  "dest": { "index": "products_v2" },
  "script": {
    "source": "ctx._source.category = ctx._source.category.toLowerCase()"
  }
}

Search UI Patterns

Debounced Search Input:

import { useState, useEffect } from 'react';

function SearchBar({ onSearch }) {
  const [query, setQuery] = useState('');

  useEffect(() => {
    const timer = setTimeout(() => {
      if (query.length >= 2) {
        onSearch(query);
      }
    }, 300);

    return () => clearTimeout(timer);
  }, [query, onSearch]);

  return (
    <input
      type="text"
      value={query}
      onChange={(e) => setQuery(e.target.value)}
      placeholder="Search..."
    />
  );
}

Faceted Search Component:

function FacetedSearch({ aggregations, selectedFilters, onFilterChange }) {
  return (
    <div className="facets">
      <div className="facet-group">
        <h3>Category</h3>
        {aggregations.categories.buckets.map(bucket => (
          <label key={bucket.key}>
            <input
              type="checkbox"
              checked={selectedFilters.category?.includes(bucket.key)}
              onChange={() => onFilterChange('category', bucket.key)}
            />
            {bucket.key} ({bucket.doc_count})
          </label>
        ))}
      </div>

      <div className="facet-group">
        <h3>Price Range</h3>
        {aggregations.price_ranges.buckets.map(bucket => (
          <label key={bucket.key}>
            <input
              type="radio"
              name="price_range"
              checked={selectedFilters.priceRange === bucket.key}
              onChange={() => onFilterChange('priceRange', bucket.key)}
            />
            {bucket.key} ({bucket.doc_count})
          </label>
        ))}
      </div>
    </div>
  );
}

Search Results with Highlighting:

function SearchResult({ hit }) {
  const getHighlightedText = (text, highlights) => {
    if (!highlights) return text;
    return { __html: highlights.join('...') };
  };

  return (
    <div className="search-result">
      <h3
        dangerouslySetInnerHTML={
          getHighlightedText(hit.name, hit.highlight?.name)
        }
      />
      <p
        dangerouslySetInnerHTML={
          getHighlightedText(hit.description, hit.highlight?.description)
        }
      />
      <span className="score">Score: {hit._score.toFixed(2)}</span>
    </div>
  );
}

Relevance Tuning Strategies

Testing Relevance:

class RelevanceTest {
  async testQuery(query, expectedTopResults) {
    const results = await this.search(query);
    const topIds = results.hits.slice(0, 3).map(h => h._id);

    console.log(`Query: "${query}"`);
    console.log(`Expected: ${expectedTopResults.join(', ')}`);
    console.log(`Actual: ${topIds.join(', ')}`);

    const precision = topIds.filter(id =>
      expectedTopResults.includes(id)
    ).length / topIds.length;

    return { precision, topIds };
  }
}

// Test cases
const tests = [
  { query: 'wireless headphones', expected: ['prod-123', 'prod-456'] },
  { query: 'bluetooth speaker', expected: ['prod-789', 'prod-012'] },
];

Multi-Field Scoring Strategy:

GET /products/_search
{
  "query": {
    "multi_match": {
      "query": "wireless headphones",
      "fields": [
        "exact_name^10",
        "name^5",
        "brand^3",
        "description^2",
        "tags"
      ],
      "type": "cross_fields",
      "operator": "and"
    }
  }
}

Recency Boosting Pattern:

GET /articles/_search
{
  "query": {
    "function_score": {
      "query": { "match": { "content": "elasticsearch" } },
      "functions": [
        {
          "exp": {
            "published_at": {
              "origin": "now",
              "scale": "7d",
              "offset": "1d",
              "decay": 0.5
            }
          }
        }
      ]
    }
  }
}

Popularity + Relevance Combination:

GET /products/_search
{
  "query": {
    "function_score": {
      "query": { "match": { "name": "laptop" } },
      "functions": [
        {
          "field_value_factor": {
            "field": "sales_count",
            "modifier": "log1p",
            "factor": 0.1
          }
        },
        {
          "field_value_factor": {
            "field": "rating",
            "modifier": "none",
            "factor": 2
          }
        }
      ],
      "score_mode": "sum",
      "boost_mode": "multiply"
    }
  }
}

Best Practices

Indexing

Use bulk operations for large data imports
Implement index aliases for zero-downtime reindexing
Choose appropriate shard count based on data size
Use explicit mappings instead of dynamic mapping in production

Query Performance

Use filter context for exact matches (cached, faster)
Use must context only when scoring matters
Limit result size and use pagination
Avoid leading wildcards in queries

Relevance

Test relevance with representative queries
Use field boosting to prioritize important fields
Implement function_score for business logic (popularity, recency)
Consider using dis_max for OR-style queries

Autocomplete

Use completion suggester for simple prefix matching
Use edge n-grams for more flexible matching
Implement debouncing on the client side (200-300ms)
Return suggestions with highlighting

Schema Design

Use keyword type for exact matches and aggregations
Use text type for full-text search
Consider multi-fields for both use cases
Use nested objects sparingly (performance impact)

Examples

Complete Search Implementation (Node.js)

const { Client } = require("@elastic/elasticsearch");

class SearchService {
  constructor() {
    this.client = new Client({ node: "http://localhost:9200" });
  }

  async search(query, filters = {}, page = 1, pageSize = 20) {
    const must = [];
    const filter = [];

    if (query) {
      must.push({
        multi_match: {
          query,
          fields: ["name^3", "description", "tags^2"],
          type: "best_fields",
          fuzziness: "AUTO",
        },
      });
    }

    if (filters.category) {
      filter.push({ term: { category: filters.category } });
    }

    if (filters.priceMin || filters.priceMax) {
      filter.push({
        range: {
          price: {
            ...(filters.priceMin && { gte: filters.priceMin }),
            ...(filters.priceMax && { lte: filters.priceMax }),
          },
        },
      });
    }

    const response = await this.client.search({
      index: "products",
      body: {
        from: (page - 1) * pageSize,
        size: pageSize,
        query: {
          bool: {
            must: must.length ? must : [{ match_all: {} }],
            filter,
          },
        },
        aggs: {
          categories: { terms: { field: "category", size: 20 } },
          price_stats: { stats: { field: "price" } },
        },
        highlight: {
          fields: {
            name: {},
            description: { fragment_size: 150 },
          },
        },
      },
    });

    return {
      hits: response.hits.hits.map((hit) => ({
        ...hit._source,
        _score: hit._score,
        highlight: hit.highlight,
      })),
      total: response.hits.total.value,
      aggregations: response.aggregations,
    };
  }

  async autocomplete(prefix, limit = 5) {
    const response = await this.client.search({
      index: "products",
      body: {
        suggest: {
          suggestions: {
            prefix,
            completion: {
              field: "name_suggest",
              size: limit,
              fuzzy: { fuzziness: 1 },
            },
          },
        },
      },
    });

    return response.suggest.suggestions[0].options.map((opt) => ({
      text: opt.text,
      score: opt._score,
    }));
  }
}

Python Implementation

from elasticsearch import Elasticsearch, helpers
from typing import Dict, List, Optional

class SearchService:
    def __init__(self, hosts: List[str] = ['localhost:9200']):
        self.es = Elasticsearch(hosts)

    def bulk_index(self, index: str, documents: List[Dict]):
        actions = [
            {
                '_index': index,
                '_id': doc.get('id'),
                '_source': doc
            }
            for doc in documents
        ]
        helpers.bulk(self.es, actions)

    def search(
        self,
        index: str,
        query: str,
        filters: Optional[Dict] = None,
        page: int = 1,
        size: int = 20
    ) -> Dict:
        body = {
            'from': (page - 1) * size,
            'size': size,
            'query': {
                'bool': {
                    'must': [{
                        'multi_match': {
                            'query': query,
                            'fields': ['name^3', 'description'],
                            'fuzziness': 'AUTO'
                        }
                    }] if query else [{'match_all': {}}],
                    'filter': self._build_filters(filters or {})
                }
            },
            'aggs': {
                'categories': {'terms': {'field': 'category'}},
                'price_ranges': {
                    'range': {
                        'field': 'price',
                        'ranges': [
                            {'to': 50},
                            {'from': 50, 'to': 100},
                            {'from': 100}
                        ]
                    }
                }
            }
        }

        return self.es.search(index=index, body=body)

    def _build_filters(self, filters: Dict) -> List[Dict]:
        result = []
        if 'category' in filters:
            result.append({'term': {'category': filters['category']}})
        if 'price_min' in filters or 'price_max' in filters:
            price_range = {}
            if 'price_min' in filters:
                price_range['gte'] = filters['price_min']
            if 'price_max' in filters:
                price_range['lte'] = filters['price_max']
            result.append({'range': {'price': price_range}})
        return result

# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents:

⚡ Amp 🚀 Antigravity 🤖 Claude Code 🦀 Clawdbot 📝 Codex ▶️ Cursor 🤖 Droid 💎 Gemini CLI 🐙 GitHub Copilot 🪿 Goose 📊 Kilo Code 🔧 Kiro CLI 💻 OpenCode 🦘 Roo Code 🌲 Trae 🏄 Windsurf

Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.