# Install this skill:
npx skills add edxeth/superlight-firecrawl-skill

Or install directly from the repository URL: npx add-skill https://github.com/edxeth/superlight-firecrawl-skill

# Description

Scrapes and crawls web pages, converting them to clean markdown or structured JSON for LLM consumption. Use when needing to extract content from URLs, crawl entire websites, map site structure, search the web with scraping, or extract structured data from pages. Best for web scraping, site crawling, URL discovery, and converting web content to LLM-ready formats.

# SKILL.md


---
name: firecrawl
description: Scrapes and crawls web pages, converting them to clean markdown or structured JSON for LLM consumption. Use when needing to extract content from URLs, crawl entire websites, map site structure, search the web with scraping, or extract structured data from pages. Best for web scraping, site crawling, URL discovery, and converting web content to LLM-ready formats.
---

## Firecrawl Web Scraping

Converts web pages into clean, LLM-ready markdown or structured data. Handles JavaScript rendering, anti-bot measures, and complex sites.

## When to Use

Use Firecrawl when you need to:
- Scrape a specific URL and get its content as markdown/HTML
- Crawl an entire website or section recursively
- Map a website to discover all its URLs
- Search the web AND scrape the results in one operation
- Extract structured JSON data from web pages
- Handle JavaScript-rendered or dynamic content
- Get screenshots of web pages

## Protocol

### Step 1: Scrape a Single URL

scripts/firecrawl.sh scrape "<url>" [format]

Formats: `markdown` (default), `html`, `links`, `screenshot`

Example:

scripts/firecrawl.sh scrape "https://docs.firecrawl.dev/introduction"
scripts/firecrawl.sh scrape "https://example.com" "html"

### Step 2: Search Web + Scrape Results

scripts/firecrawl.sh search "<query>" [limit]

Example:

scripts/firecrawl.sh search "firecrawl web scraping API" 5

### Step 3: Map Website URLs

scripts/firecrawl.sh map "<url>" [limit] [search]

Example:

scripts/firecrawl.sh map "https://firecrawl.dev" 50
scripts/firecrawl.sh map "https://docs.firecrawl.dev" 100 "api reference"

### Step 4: Extract Structured JSON (Single Page)

scripts/firecrawl.sh extract "<url>" "<prompt>"

Uses Firecrawl's LLM extraction to return structured JSON from a single page.

Example:

scripts/firecrawl.sh extract "https://firecrawl.dev" "Extract company name, mission, and pricing tiers"
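Since extract returns JSON, its output can be piped through jq (already a dependency of this skill) to pull out individual fields. A minimal sketch, assuming a hypothetical response shape; the real fields depend entirely on your prompt:

```shell
# Hypothetical extract response; the actual shape depends on the prompt.
response='{"company":"Firecrawl","pricing_tiers":["Free","Hobby","Standard"]}'

# Pull a single scalar field.
printf '%s' "$response" | jq -r '.company'

# Iterate over an array field, one item per line.
printf '%s' "$response" | jq -r '.pricing_tiers[]'
```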

### Step 5: Crawl Entire Site

scripts/firecrawl.sh crawl "<url>" [limit] [depth]

Example:

scripts/firecrawl.sh crawl "https://docs.firecrawl.dev" 20 2

## Critical Rules

  1. Scrape for single pages - Use scrape when you have specific URLs
  2. Map before crawl - Use map to discover URLs, then scrape specific ones
  3. Search for discovery - Use search to find relevant pages when you don't know URLs
  4. Extract for structure - Use extract when you need JSON, not markdown
  5. Respect rate limits - Script auto-retries on 429 with key rotation
  6. Current year is 2026 - Use this when recency matters; omit for timeless topics or use older years when historically relevant
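Rules 1 and 2 combine into a common workflow: map to discover URLs, filter, then scrape only the matches. A sketch under two assumptions: `map` prints one URL per line, and `FIRECRAWL_CMD` is a hypothetical stand-in for scripts/firecrawl.sh so the snippet stays self-contained:

```shell
# FIRECRAWL_CMD is a hypothetical indirection over scripts/firecrawl.sh.
FIRECRAWL_CMD="${FIRECRAWL_CMD:-scripts/firecrawl.sh}"

# Map a site, keep only URLs matching a pattern, scrape each one.
# Assumes `map` prints one URL per line.
map_then_scrape() {
  local base="$1" pattern="$2" limit="${3:-50}"
  "$FIRECRAWL_CMD" map "$base" "$limit" |
    grep -- "$pattern" |
    while read -r url; do
      "$FIRECRAWL_CMD" scrape "$url"
    done
}
```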

## Resources

See reference/troubleshooting.md for error handling, configuration, and common issues.

# README.md

## Superlight Firecrawl Skill

Scrape and crawl web pages via the Firecrawl v2 REST API. A superlight agent skill for AI coding assistants: minimal tokens, maximum web data extraction. Supports multiple API keys with round-robin rotation and automatic 429 failover.

## Features

- **Web scraping** – Convert any URL to clean markdown, HTML, or structured JSON
- **Site crawling** – Recursively crawl entire websites or sections
- **URL discovery** – Map all URLs on a website instantly
- **Search + scrape** – Web search with automatic content extraction
- **Structured extraction** – Extract JSON data from pages using natural language prompts
- **JavaScript rendering** – Handle dynamic/JS-rendered content
- **Token-efficient** – Minimal context overhead with progressive disclosure
- **Multi-key rotation** – Round-robin distribution with automatic 429 failover

## Why Use This Over Firecrawl MCP?

| Aspect | MCP Server | This Skill |
| --- | --- | --- |
| Context cost | ~700+ tokens always¹ | ~86 tokens always + ~748 on-demand |
| Tool schemas | Always in context | None (progressive disclosure) |
| Setup | Requires MCP configuration | Drop-in skill directory |
| Dependencies | Node.js runtime | bash, curl, jq (Linux/macOS) |

¹ Estimated from multi-tool MCP measurements (~14k tokens for 20 tools).

Best for: Users who need web scraping on-demand without persistent context overhead.

## Token Budget

Uses Claude's progressive disclosure architecture:

| Level | When Loaded | Content | Tokens |
| --- | --- | --- | --- |
| Metadata | Always (startup) | Skill description | ~86 |
| Instructions | When triggered | SKILL.md protocol | ~748 |
| Resources | As needed | troubleshooting.md | ~817 |

Token counts measured with claudetokenizer.com (Claude Sonnet 4.5).

## Installation

npx skills add edxeth/superlight-firecrawl-skill

The installer will prompt you to select which agents to install to (Claude Code, Cursor, OpenCode, Codex, Antigravity, etc.).

### Manual Installation

Clone directly to your agent's skills directory:

# Claude Code
git clone https://github.com/edxeth/superlight-firecrawl-skill.git ~/.claude/skills/firecrawl

# OpenCode
git clone https://github.com/edxeth/superlight-firecrawl-skill.git ~/.opencode/skill/firecrawl

Directory structure:

firecrawl/
├── SKILL.md
├── reference/
│   └── troubleshooting.md
└── scripts/
    └── firecrawl.sh

## Usage

The skill triggers automatically when scraping or crawling websites:

"Scrape the content from this URL"
"Crawl the documentation site and extract all pages"
"Map all URLs on this website"
"Extract the pricing information from this page"

### Manual Invocation

# Scrape a single URL
./scripts/firecrawl.sh scrape "https://example.com"
./scripts/firecrawl.sh scrape "https://example.com" "html"

# Search web + scrape results
./scripts/firecrawl.sh search "firecrawl web scraping API" 5

# Map website URLs
./scripts/firecrawl.sh map "https://firecrawl.dev" 50
./scripts/firecrawl.sh map "https://docs.firecrawl.dev" 100 "api reference"

# Extract structured data
./scripts/firecrawl.sh extract "https://firecrawl.dev" "Extract pricing tiers"

# Crawl entire site
./scripts/firecrawl.sh crawl "https://docs.firecrawl.dev" 20 2

## API Endpoints

Uses the Firecrawl v2 REST API:

| Endpoint | Purpose | Rate Limit (Standard) |
| --- | --- | --- |
| POST /v2/scrape | Scrape single URL | 500/min |
| POST /v2/search | Web search + scrape | 250/min |
| POST /v2/map | Discover site URLs | 500/min |
| POST /v2/crawl | Recursive crawl | 50/min |
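When batching many calls, a crude client-side throttle avoids tripping these limits in the first place. A sketch, assuming the Standard-plan numbers above; the interval is just ceiling division of 60 seconds by the per-minute limit:

```shell
# Seconds to sleep between requests to stay under a per-minute limit.
# Uses ceiling division so the interval rounds up, never down.
throttle_interval() {
  local per_min="$1"
  echo $(( (60 + per_min - 1) / per_min ))
}

# e.g. sleep "$(throttle_interval 50)" between successive /v2/crawl calls.
```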

## Configuration

An API key is required.

# Single API key
export FIRECRAWL_API_KEY="fc-your-key-here"

# Multiple API keys for load distribution
export FIRECRAWL_API_KEY="fc-key1,fc-key2,fc-key3"

When multiple keys are provided (comma-separated), the script rotates through them in round-robin order, distributing requests evenly. If a key hits a rate limit (429), the script automatically fails over to the next key and retries, failing only after all keys have been exhausted across multiple retry rounds.
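The rotation logic can be sketched in a few lines of bash: split the comma-separated list into an array and advance an index on every request. A simplified sketch using example keys, not the script's actual implementation:

```shell
# Example comma-separated key list, in the format the skill expects.
FIRECRAWL_API_KEY="fc-key1,fc-key2,fc-key3"

# Split on commas into a bash array.
IFS=',' read -r -a KEYS <<< "$FIRECRAWL_API_KEY"
KEY_INDEX=0

# Select the next key in round-robin order into CURRENT_KEY.
next_key() {
  CURRENT_KEY="${KEYS[KEY_INDEX]}"
  KEY_INDEX=$(( (KEY_INDEX + 1) % ${#KEYS[@]} ))
}

# On a 429, a caller invokes next_key again and retries, giving up only
# once every key has returned 429 in the current round.
```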

Get an API key at firecrawl.dev.

## Requirements

  • Platforms: Linux, macOS
  • Dependencies: bash, curl, jq

## Skill Metadata

name: firecrawl
description: Scrapes and crawls web pages, converting them to clean markdown or structured JSON for LLM consumption. Use when needing to extract content from URLs, crawl entire websites, map site structure, search the web with scraping, or extract structured data from pages.

## License

MIT License

# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents.

Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.