pingcap

pytidb

# Install this skill:
npx skills add pingcap/agent-rules --skill "pytidb"

# Description

PyTiDB (pytidb) setup and usage for TiDB from Python. Covers connecting, table modeling (TableModel), CRUD, raw SQL, transactions, vector/full-text/hybrid search, auto-embedding, custom embedding functions, and reference templates/snippets (vector/hybrid/image) plus agent-oriented examples (RAG/memory/text2sql).

# SKILL.md


name: pytidb
description: PyTiDB (pytidb) setup and usage for TiDB from Python. Covers connecting, table modeling (TableModel), CRUD, raw SQL, transactions, vector/full-text/hybrid search, auto-embedding, custom embedding functions, and reference templates/snippets (vector/hybrid/image) plus agent-oriented examples (RAG/memory/text2sql).
allowed-tools: ["bash", "write", "read_file"]


PyTiDB (pytidb)

Use this skill to connect to TiDB from Python via pytidb, define tables, and build search / AI features on top.

When to Use This Skill

  • You want a Python ORM-like experience on TiDB via pytidb (built on SQLAlchemy).
  • You want vector search / full-text search / hybrid search on TiDB with high-level APIs.
  • You want runnable starter templates (scripts + small examples) you can adapt.

Need to provision a TiDB Cloud cluster first? Use tidbx (TiDB X) for cluster lifecycle guidance.

Code Generation Rules (Python)

  • Never hardcode credentials; use env vars (.env) and document required variables.
  • Prefer python -m venv .venv and pinned deps for reproducibility.
  • When editing requirements.txt, do not invent pytidb versions. Use an unpinned pytidb by default, and pin a version only when the user explicitly requests one and that version has been verified to exist.
  • Keep examples minimal and runnable; avoid framework-specific assumptions unless the user asks.
  • Use parameterized SQL for any dynamic value (SQL injection safety).
  • For interactive environments, avoid "table already defined" errors (use extend_existing / open_table / if rows()==0 patterns).
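
The parameterized-SQL rule above can be illustrated with a minimal sketch. It uses the standard-library sqlite3 module purely as a stand-in so it runs without a TiDB cluster; the `users` table and the `find_users_by_name` helper are hypothetical. The `:name` placeholder style shown here is also the one SQLAlchemy's `text()` accepts, but the exact pytidb call you would pass it to is not shown and should be checked against the pytidb documentation.

```python
import sqlite3


def find_users_by_name(conn, name):
    """Look up users by name with a bound parameter (hypothetical helper)."""
    # The value travels separately from the SQL text, so it is never parsed as SQL.
    cur = conn.execute(
        "SELECT id, name FROM users WHERE name = :name",
        {"name": name},
    )
    return cur.fetchall()


# In-memory stand-in database; a real project would connect to TiDB instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO users (name) VALUES (:name)",
    [{"name": "alice"}, {"name": "bob"}],
)

safe_rows = find_users_by_name(conn, "alice")
# A classic injection payload is treated as a literal string and matches nothing.
injection_rows = find_users_by_name(conn, "alice' OR '1'='1")
```

Had the query been built with string formatting, the second call would have returned every row; with a bound parameter it returns none.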

Available Guides

Each guide is a self-contained walkthrough with a checklist and phases:

  • guides/quickstart.md: one-file "connect → create table → insert → vector search"
  • guides/search.md: vector / full-text / hybrid: when to use which, plus gotchas
  • guides/demos.md: examples playbook (vector/hybrid/image)
  • guides/agent-apps.md: agent-ish examples (RAG / memory / text2sql)
  • guides/troubleshooting.md: connection, TLS, embedding, and index/search issues
  • guides/custom-embedding.md: implement a custom embedding function (example: BGE-M3)

I'll infer your intent (CRUD vs search vs "agent app"), then point you to the smallest guide and template set that gets you running.

Templates & Scripts

Each template is a complete file you can copy into your project. Choose the smallest one that matches your goal.

Core usage

  • templates/quickstart.py: minimal end-to-end: connect → create table → insert → vector search
  • templates/crud.py: basic table modeling + CRUD lifecycle (create/truncate/insert/query/update/delete)
  • templates/auto_embedding.py: auto embedding with pluggable providers (env-driven)
  • templates/vector_search.py: vector search example (optional metadata filter + threshold)
  • templates/hybrid_search.py: hybrid search example (FullTextField + vector field) with fused scoring
  • templates/image_search.py: image-to-image or text-to-image search (requires multimodal embedding + Pillow)
  • templates/image_search_data_loader.py: loads Oxford Pets dataset into TiDB (used by image_search.py)
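
To make "fused scoring" in the hybrid search template more concrete, here is a small sketch of reciprocal rank fusion (RRF), one common way to merge a vector ranking with a full-text ranking. This is an assumption about what fusion could look like, not a description of what templates/hybrid_search.py or pytidb does internally; the function name, the constant `k=60`, and the document ids are illustrative.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of ids into one list of (id, score) pairs.

    Each id earns 1 / (k + rank) from every list it appears in, with rank
    starting at 1. Ids that rank well in several lists rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties are broken by id so the output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists, best match first.
vector_hits = ["doc_a", "doc_b", "doc_c"]
fulltext_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([vector_hits, fulltext_hits])
fused_ids = [doc_id for doc_id, _ in fused]
```

RRF only needs ranks, not raw scores, which is why it is often chosen when the two searches report scores on incompatible scales.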

Custom embeddings

  • templates/custom_embedding_function.py: example BaseEmbeddingFunction implementation (BGE-M3 via FlagEmbedding)
  • templates/custom_embedding.py: uses the custom embedder with auto embedding + vector search

Agent-ish examples

  • templates/rag.py: minimal RAG: retrieve via vector search, then generate via local LLM (Ollama via LiteLLM)
  • templates/memory_lib.py: reusable "memory" library (extract facts → store → retrieve)
  • templates/memory.py: CLI memory chat example using memory_lib.py
  • templates/text2sql.py: interactive Text2SQL (generates SQL via OpenAI; asks before executing)
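
The "retrieve, then generate" shape shared by the RAG and memory examples can be sketched without a database or an LLM. The code below is a toy in-memory stand-in, not the contents of templates/rag.py: the hand-written three-dimensional vectors replace real embeddings, the `retrieve` and `build_prompt` helpers are hypothetical, and the final LLM call is left out. In a real setup the retrieval step would be a vector search against TiDB.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def retrieve(query_vec, chunks, limit=2, max_distance=0.5):
    """Return up to `limit` chunk texts within `max_distance`, nearest first."""
    scored = [(cosine_distance(query_vec, c["embedding"]), c["text"]) for c in chunks]
    scored = [item for item in scored if item[0] <= max_distance]
    scored.sort(key=lambda item: item[0])
    return [text for _, text in scored[:limit]]


def build_prompt(question, context_chunks):
    """Assemble the text that would be handed to an LLM."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Toy corpus with made-up embeddings.
chunks = [
    {"text": "TiDB is a distributed SQL database.", "embedding": [1.0, 0.0, 0.0]},
    {"text": "Bananas are yellow.", "embedding": [0.0, 1.0, 0.0]},
    {"text": "TiDB is MySQL compatible.", "embedding": [0.9, 0.1, 0.0]},
]
query_vec = [1.0, 0.05, 0.0]  # stand-in for the embedded question

context = retrieve(query_vec, chunks)
prompt = build_prompt("What is TiDB?", context)
```

The distance threshold keeps unrelated chunks out of the prompt even when fewer than `limit` relevant chunks exist.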

Scripts

  • scripts/validate_connection.py: quick connection + SELECT 1 smoke test (supports params or DATABASE_URL)
  • tidbx: provision/manage TiDB Cloud (TiDB X) clusters

Workflow

I will:
1. Confirm your TiDB deployment (Cloud Starter vs self-managed) and how you want to connect (params vs DATABASE_URL).
2. Help you set env vars, validate the connection, and choose the right path:
   - CRUD/table modeling
   - vector/full-text/hybrid search (and embedding provider)
   - example templates
3. Generate the minimal set of files and commands to get you running.
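
As a rough sketch of the "params vs DATABASE_URL" choice in step 1, the helper below resolves connection settings from environment variables without hardcoding credentials. The variable names (`TIDB_HOST`, `TIDB_PORT`, `TIDB_USERNAME`, `TIDB_PASSWORD`, `TIDB_DATABASE`), the `mysql+pymysql` URL scheme, and the `resolve_database_url` function are assumptions for illustration; scripts/validate_connection.py may name or handle them differently. TLS options, which TiDB Cloud connections typically need, are deliberately not handled here.

```python
import os
from urllib.parse import quote_plus

# Assumed variable names; adjust to whatever your .env actually documents.
REQUIRED_VARS = ("TIDB_HOST", "TIDB_USERNAME", "TIDB_DATABASE")


def resolve_database_url(env):
    """Prefer DATABASE_URL; otherwise assemble a URL from individual params."""
    url = env.get("DATABASE_URL")
    if url:
        return url

    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    # Percent-encode credentials so characters like '@' cannot break the URL.
    user = quote_plus(env["TIDB_USERNAME"])
    password = quote_plus(env.get("TIDB_PASSWORD", ""))
    host = env["TIDB_HOST"]
    port = env.get("TIDB_PORT", "4000")  # 4000 is TiDB's default port
    database = env["TIDB_DATABASE"]
    return f"mysql+pymysql://{user}:{password}@{host}:{port}/{database}"


if __name__ == "__main__":
    # Fails fast with a readable message if nothing is configured.
    print("Resolved a connection URL:", bool(resolve_database_url(os.environ)))
```

Taking the environment as an argument rather than reading `os.environ` inside the function keeps the helper easy to test with a plain dict.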

# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents.