pdf-skill

by @404kidwiz in Productivity

# Install this skill:

npx skills add 404kidwiz/claude-supercode-skills --skill "pdf-skill"

This installs a single skill from a multi-skill repository.

# Description

Expert in generating, parsing, and manipulating PDF documents using tools like PDFKit, PDF.js, and Puppeteer. Use when creating PDFs, extracting content, merging documents, or filling forms. Triggers include "PDF", "generate PDF", "parse PDF", "extract PDF", "merge PDF", "PDF form", "PDFKit".

# SKILL.md

name: pdf-skill
description: Expert in generating, parsing, and manipulating PDF documents using tools like PDFKit, PDF.js, and Puppeteer. Use when creating PDFs, extracting content, merging documents, or filling forms. Triggers include "PDF", "generate PDF", "parse PDF", "extract PDF", "merge PDF", "PDF form", "PDFKit".

PDF Skill

Purpose

Provides expertise in programmatic PDF generation, parsing, and manipulation. Specializes in creating PDFs from scratch, extracting content, merging/splitting documents, and handling forms using PDFKit, PDF.js, Puppeteer, and similar tools.

When to Use

Generating PDFs programmatically
Extracting text or data from PDFs
Merging or splitting PDF documents
Filling PDF forms programmatically
Converting HTML to PDF
Adding watermarks or annotations
Parsing PDF structure and metadata
Building PDF report generators

Quick Start

Invoke this skill when:
- Generating PDFs from code or data
- Extracting content from PDF files
- Merging, splitting, or manipulating PDFs
- Filling or creating PDF forms
- Converting HTML/web pages to PDF

Do NOT invoke when:
- Word document creation → use /docx-skill
- Excel/spreadsheet work → use /xlsx-skill
- PowerPoint creation → use /pptx-skill
- General file operations → use Bash or file tools

Decision Framework

PDF Operation?
├── Generate from scratch
│   ├── Simple → PDFKit (Node) / ReportLab (Python)
│   └── Complex layouts → Puppeteer/Playwright + HTML
├── Parse/Extract
│   ├── Text extraction → pdf-parse / pypdf (formerly PyPDF2)
│   └── Table extraction → Camelot / Tabula
├── Manipulate
│   └── pdf-lib (merge, split, edit)
└── Forms
    └── pdf-lib (fill) / PDFtk (advanced)
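
The tree above can also be read as a lookup table. The following is a minimal sketch of that idea; the dictionary keys and the `recommend` function are hypothetical names chosen for illustration and are not part of any of the listed libraries.

```python
# Hypothetical lookup table mirroring the decision tree above.
# The keys and the function name are illustrative assumptions.
RECOMMENDED_TOOLS = {
    ("generate", "simple"): ["PDFKit (Node)", "ReportLab (Python)"],
    ("generate", "complex"): ["Puppeteer/Playwright + HTML"],
    # PyPDF2 is now maintained under the name pypdf.
    ("extract", "text"): ["pdf-parse", "pypdf (formerly PyPDF2)"],
    ("extract", "table"): ["Camelot", "Tabula"],
    ("manipulate", "any"): ["pdf-lib"],
    ("forms", "fill"): ["pdf-lib"],
    ("forms", "advanced"): ["PDFtk"],
}


def recommend(operation: str, variant: str = "any") -> list[str]:
    """Return the tools the tree suggests, or an empty list if no branch matches."""
    key = (operation.lower(), variant.lower())
    return RECOMMENDED_TOOLS.get(key, [])
```

For example, `recommend("extract", "table")` returns the two table-extraction tools, while an operation the tree does not cover returns an empty list instead of guessing.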

Core Workflows

1. PDF Generation with PDFKit

Install PDFKit (npm install pdfkit)
Create a new PDFDocument
Add content (text, images, graphics)
Style with fonts and colors
Add pages as needed
Pipe to file or response
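
The steps above are written for PDFKit on Node. As a rough Python counterpart, here is a sketch using ReportLab, which the decision framework names for simple generation. It assumes ReportLab is installed (`pip install reportlab`); `paginate` and `write_report` are hypothetical helper names, and the margin and line-height values are arbitrary choices.

```python
def paginate(lines: list[str], lines_per_page: int) -> list[list[str]]:
    """Split lines into page-sized chunks (pure helper, no PDF library needed)."""
    if lines_per_page < 1:
        raise ValueError("lines_per_page must be at least 1")
    return [lines[i:i + lines_per_page] for i in range(0, len(lines), lines_per_page)]


def write_report(path: str, title: str, lines: list[str]) -> int:
    """Write a simple multi-page text PDF and return the number of pages."""
    # Imported lazily so the helper above works without ReportLab installed.
    from reportlab.lib.pagesizes import A4
    from reportlab.pdfgen import canvas

    width, height = A4
    margin, line_height = 72, 16  # points; 72 pt = 1 inch (assumed layout values)
    # Reserve room for the title line at the top of every page.
    per_page = int((height - 2 * margin - 2 * line_height) // line_height)
    pages = paginate(lines, per_page) or [[]]

    pdf = canvas.Canvas(path, pagesize=A4)
    for page_lines in pages:
        y = height - margin
        pdf.setFont("Helvetica-Bold", 16)  # a standard built-in font
        pdf.drawString(margin, y, title)
        y -= 2 * line_height
        pdf.setFont("Helvetica", 11)
        for line in page_lines:
            pdf.drawString(margin, y, line)
            y -= line_height
        pdf.showPage()  # finish this page and start the next
    pdf.save()  # write the finished document to `path`
    return len(pages)
```

A call such as `write_report("report.pdf", "Monthly Report", rows)` would repeat the title on each page. The font is set again after every `showPage()` because ReportLab resets the font state for each new page.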

2. HTML to PDF Conversion

Set up Puppeteer/Playwright
Navigate to HTML content or URL
Configure page size and margins
Set print options (headers, footers)
Generate PDF buffer
Save or stream result
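
The conversion steps above might look like the following with Playwright's Python sync API. This is a sketch, not a definitive setup: it assumes `pip install playwright` followed by `playwright install chromium`, and `build_pdf_options` and `html_to_pdf` are hypothetical helper names. The default margin and page format are arbitrary.

```python
from typing import Optional


def build_pdf_options(page_format: str = "A4", margin_mm: int = 20,
                      footer_html: Optional[str] = None) -> dict:
    """Assemble keyword arguments for Playwright's page.pdf()."""
    side = f"{margin_mm}mm"
    options = {
        "format": page_format,
        "margin": {"top": side, "right": side, "bottom": side, "left": side},
        "print_background": True,  # keep CSS background colors and images
    }
    if footer_html is not None:
        options["display_header_footer"] = True
        options["header_template"] = "<span></span>"  # intentionally empty header
        options["footer_template"] = footer_html
    return options


def html_to_pdf(html: str, **option_overrides) -> bytes:
    """Render an HTML string in headless Chromium and return the PDF bytes."""
    # Imported lazily so build_pdf_options() is usable without Playwright.
    from playwright.sync_api import sync_playwright

    options = {**build_pdf_options(), **option_overrides}
    with sync_playwright() as p:
        browser = p.chromium.launch()
        try:
            page = browser.new_page()
            # Wait for network activity to settle so late-loading assets render.
            page.set_content(html, wait_until="networkidle")
            return page.pdf(**options)
        finally:
            browser.close()
```

The returned bytes can be written to a file or streamed in an HTTP response. Keeping the options in a separate function makes the page size, margins, and footer easy to check without launching a browser.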

3. PDF Parsing and Extraction

Choose a parser (pdf-parse, pypdf/PyPDF2, pdfplumber)
Load PDF file
Extract text or structured data
Handle multi-page documents
Clean and normalize extracted text
Output in desired format
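
A minimal sketch of the extraction and cleanup steps, assuming pypdf (the maintained successor to PyPDF2) is installed via `pip install pypdf`. The names `normalize_text` and `extract_pages` are hypothetical, and the cleanup rules are one reasonable choice, not a standard.

```python
import re


def normalize_text(raw: str) -> str:
    """Clean text extracted from a PDF page."""
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)  # rejoin words hyphenated across lines
    text = re.sub(r"[ \t]+", " ", text)           # collapse runs of spaces and tabs
    text = re.sub(r" ?\n ?", "\n", text)          # trim spaces around line breaks
    text = re.sub(r"\n{3,}", "\n\n", text)        # keep at most one blank line
    return text.strip()


def extract_pages(path: str) -> list[str]:
    """Return normalized text for each page of the PDF at `path`."""
    from pypdf import PdfReader  # lazy import; assumes pypdf is installed

    reader = PdfReader(path)
    # extract_text() can yield an empty result for image-only pages.
    return [normalize_text(page.extract_text() or "") for page in reader.pages]
```

Returning one string per page keeps multi-page documents easy to index. Note that rejoining hyphenated line breaks also merges genuine compounds that happen to break at the hyphen, so that rule may need adjusting for some documents.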

Best Practices

Use vector graphics over raster when possible
Embed fonts for consistent rendering
Test PDF output across different readers
Handle large PDFs with streaming
Use appropriate library for task complexity
Consider accessibility (tagged PDFs)

Anti-Patterns

| Anti-Pattern | Problem | Correct Approach |
|---|---|---|
| Image-only PDFs | Not searchable or accessible | Use real text set in embedded fonts |
| No font embedding | Inconsistent rendering across readers | Embed the required fonts |
| Loading large PDFs fully into memory | Crashes | Use stream processing |
| Ignoring encryption | Security and access issues | Detect and handle encrypted PDFs |
| Wrong tool for the job | Over-engineering | Match the tool to the task's complexity |
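
As one illustration of handling encrypted PDFs, the sketch below checks for encryption before reading. It assumes pypdf's `PdfReader`, which exposes `is_encrypted` and a `decrypt(password)` method that returns a falsy value when the password is rejected. `unlock` and `open_pdf` are hypothetical helper names.

```python
def unlock(reader, password=None):
    """Return `reader` ready for use, decrypting it first if necessary.

    `reader` is expected to behave like pypdf's PdfReader: an `is_encrypted`
    attribute and a `decrypt(password)` method that returns a falsy value
    when the password is wrong.
    """
    if not reader.is_encrypted:
        return reader
    if password is None:
        raise PermissionError("PDF is encrypted and no password was supplied")
    if not reader.decrypt(password):
        raise PermissionError("PDF password was rejected")
    return reader


def open_pdf(path: str, password=None):
    """Open a PDF with pypdf and unlock it if it is encrypted."""
    from pypdf import PdfReader  # lazy import; assumes pypdf is installed

    return unlock(PdfReader(path), password)
```

Failing early with a clear error avoids the harder-to-diagnose failures that appear later when pages of a locked document are read. AES-encrypted files may also require an additional cryptography dependency for pypdf.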

# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents:

⚡ Amp 🚀 Antigravity 🤖 Claude Code 🦀 Clawdbot 📝 Codex ▶️ Cursor 🤖 Droid 💎 Gemini CLI 🐙 GitHub Copilot 🪿 Goose 📊 Kilo Code 🔧 Kiro CLI 💻 OpenCode 🦘 Roo Code 🌲 Trae 🏄 Windsurf
