pdf-extract

Name: pdf-extract
Author: terry-li-hm

by @terry-li-hm in Security

# Install this skill:

npx skills add terry-li-hm/skills --skill "pdf-extract"


# Description

Extract text from PDFs with OCR fallback. Use when the user needs PDF text or OCR from scanned documents.

# SKILL.md

name: pdf-extract
description: Extract text from PDFs with OCR fallback. Use when the user needs PDF text or OCR from scanned documents.

PDF Extract

Extract text from PDFs, including large and image-based (scanned) documents.

When to Use

- PDF is too large to read directly (>20MB)
- Image-based/scanned PDFs that need OCR
- Salary guides, reports, research papers
- Any PDF where standard tools fail

Usage

/pdf-extract <path-or-url> [--local]

Examples:

/pdf-extract /tmp/salary_guide.pdf
/pdf-extract https://example.com/report.pdf
/pdf-extract document.pdf --local   # Force local extraction, skip the API
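
Because the command accepts either a local path or a URL, the script has to tell the two apart before it reads anything. A minimal sketch of that step, assuming only http(s) URLs are supported; `is_url` and `fetch_to_tmp` are hypothetical names and are not taken from pdf_extract.py:

```python
import tempfile
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def is_url(source: str) -> bool:
    """Return True for http(s) URLs, False for anything treated as a local path."""
    parsed = urlparse(source)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def fetch_to_tmp(source: str) -> Path:
    """Return a local path for `source`, downloading it first if it is a URL."""
    if not is_url(source):
        return Path(source).expanduser()
    # Assumed behaviour: download into a fresh temporary directory.
    name = Path(urlparse(source).path).name or "download.pdf"
    target = Path(tempfile.mkdtemp()) / name
    urllib.request.urlretrieve(source, str(target))
    return target
```

Local paths are returned unchanged (apart from `~` expansion), so the rest of the pipeline only ever deals with files on disk.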

How It Works

1. Try LlamaParse first (cloud API): best quality, handles tables well
2. Try pymupdf4llm (local): fast, good for text-based PDFs
3. Fall back to local OCR: PyMuPDF + pytesseract for image-based PDFs
4. Output to file: saves to /tmp/<filename>.md
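
The steps above amount to a fallback chain: try each method in order and keep the first result that contains usable text. The sketch below shows that control flow only. The extractors are passed in as plain callables, so no real LlamaParse, pymupdf4llm, or pytesseract calls appear here, and the `min_chars` threshold is an assumption about how "this method did not work" might be detected, not something the skill documents.

```python
from typing import Callable, Sequence, Tuple

# An extractor takes a PDF path and returns the extracted text.
Extractor = Callable[[str], str]


def extract_with_fallback(
    pdf_path: str,
    extractors: Sequence[Tuple[str, Extractor]],
    min_chars: int = 1,
) -> Tuple[str, str]:
    """Try each (name, extractor) in order; return (name, text) of the first usable result."""
    errors = []
    for name, extractor in extractors:
        try:
            text = extractor(pdf_path)
        except Exception as exc:  # any failure moves on to the next method
            errors.append(f"{name}: {exc}")
            continue
        # Assumed heuristic: too little text means the method did not really work.
        if len(text.strip()) >= min_chars:
            return name, text
        errors.append(f"{name}: returned too little text")
    raise RuntimeError("all extraction methods failed: " + "; ".join(errors))
```

Treating a near-empty result as a failure matters for scanned PDFs: a page with no text layer can make a text-based extractor return nothing without raising an error, and the chain should still reach OCR in that case.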

Quality Comparison (tested on 46MB salary guide)

| Method | Output size | Table quality | Speed |
|---|---|---|---|
| LlamaParse | 189K chars | Proper markdown tables | ~30s |
| Local OCR | 140K chars | Plain text, some errors | ~2min |

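In this test LlamaParse recovered about 35% more text than local OCR (189K vs 140K chars), ran roughly four times faster, and preserved tables as markdown, so it is the better choice for structured documents with tables.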

Implementation

Script: pdf_extract.py in this directory.

uv run ~/skills/pdf-extract/pdf_extract.py <pdf-path-or-url> [output-path] [--local]
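
The command line above has one required positional argument, one optional positional argument, and one flag. A sketch of how that interface could be declared with `argparse`; the argument names `source` and `output` and the help strings are illustrative assumptions, and the real script may parse its arguments differently:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser matching: <pdf-path-or-url> [output-path] [--local]."""
    parser = argparse.ArgumentParser(
        prog="pdf_extract.py",
        description="Extract text from a PDF and write it out as markdown.",
    )
    parser.add_argument(
        "source", metavar="pdf-path-or-url", help="local PDF path or http(s) URL"
    )
    parser.add_argument(
        "output",
        nargs="?",
        default=None,
        metavar="output-path",
        help="where to write the markdown (optional)",
    )
    parser.add_argument(
        "--local",
        action="store_true",
        help="skip the API and force local processing",
    )
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```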

API Key

LlamaParse requires an API key. Set the environment variable:

export LLAMA_CLOUD_API_KEY=your-key-here

Without the API key, the script falls back to local extraction (pymupdf4llm → OCR).
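
Put together with the --local flag, the rule above decides which methods are attempted and in what order. A small sketch of that decision, assuming the method labels `"llamaparse"`, `"pymupdf4llm"`, and `"ocr"` (these strings and the function name `plan_methods` are made up for illustration):

```python
import os
from typing import List, Mapping, Optional


def plan_methods(
    force_local: bool = False, env: Optional[Mapping[str, str]] = None
) -> List[str]:
    """Return the extraction methods to try, in order."""
    env = os.environ if env is None else env
    local_chain = ["pymupdf4llm", "ocr"]
    # No key, an empty key, or --local all mean the cloud API is skipped.
    if force_local or not env.get("LLAMA_CLOUD_API_KEY"):
        return local_chain
    return ["llamaparse"] + local_chain
```

Passing `env` explicitly keeps the function easy to test without touching the real environment.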

Free tier: 1000 pages/day. Get a key at https://cloud.llamaindex.ai

Requirements

For local fallback (OCR):
- tesseract: brew install tesseract (macOS) or apt install tesseract-ocr (Ubuntu)

All Python dependencies are declared in the script's inline metadata and installed automatically by uv run.
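
Inline metadata here means a PEP 723 comment block at the top of the script, which uv run reads to build a throwaway environment. The sketch below shows what such a header might look like, followed by a check for tesseract, the one dependency uv cannot install. The dependency list and Python version are guesses based on the libraries named in this document, not copied from pdf_extract.py, and `tesseract_available` is a hypothetical helper.

```python
# Assumed header: package names and Python version are illustrative guesses.
# /// script
# requires-python = ">=3.10"
# dependencies = [
#     "llama-parse",
#     "pymupdf4llm",
#     "pytesseract",
# ]
# ///
import shutil


def tesseract_available() -> bool:
    """Return True if a `tesseract` executable is on PATH."""
    return shutil.which("tesseract") is not None


if __name__ == "__main__":
    if not tesseract_available():
        print("tesseract not found: local OCR fallback will not work")
```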

Output

- Markdown file at /tmp/<original-filename>.md
- Or specify a custom output path as the second argument
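
The default output location can be derived from the input by keeping the file name, swapping the extension for .md, and placing the result in /tmp. A sketch under the assumption that URLs are named after the last segment of their path; `default_output_path` and the `"output"` fallback name are hypothetical:

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse


def default_output_path(source: str, output: Optional[str] = None) -> str:
    """Return the explicit output path if given, else /tmp/<original-filename>.md."""
    if output:
        return output
    parsed = urlparse(source)
    # For URLs use only the path component, so query strings do not leak into the name.
    raw = parsed.path if parsed.scheme in ("http", "https") else source
    stem = PurePosixPath(raw).stem or "output"
    return f"/tmp/{stem}.md"
```

For the examples in the Usage section this gives /tmp/salary_guide.md and /tmp/report.md respectively.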

Notes

- LlamaParse handles tables, forms, and structured docs very well
- Use the --local flag to skip the API and force local processing
- Local OCR is slower but works offline and doesn't use API credits

# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents:

⚡ Amp 🚀 Antigravity 🤖 Claude Code 🦀 Clawdbot 📝 Codex ▶️ Cursor 🤖 Droid 💎 Gemini CLI 🐙 GitHub Copilot 🪿 Goose 📊 Kilo Code 🔧 Kiro CLI 💻 OpenCode 🦘 Roo Code 🌲 Trae 🏄 Windsurf
