Install a specific skill from a multi-skill repository:
npx skills add Anshin-Health-Solutions/superpai --skill "firecrawl"
# Description
Web scraping and research automation using the Firecrawl API. Converts any URL into LLM-optimized markdown, supports crawling entire sites, full-text search, and site mapping. Integrates with the research skill for deep information gathering.
# SKILL.md
name: firecrawl
description: Web scraping and research automation using the Firecrawl API. Converts any URL into LLM-optimized markdown, supports crawling entire sites, full-text search, and site mapping. Integrates with the research skill for deep information gathering.
triggers:
- /firecrawl
- "scrape this url"
- "crawl this site"
- "firecrawl search"
- "get content from"
- "scrape and summarize"
Firecrawl
Purpose
Firecrawl is the authoritative tool for extracting web content in a form LLMs can reason over. Use it any time you need to read a live webpage, crawl a documentation site, search the web for specific content, or map a site's URL structure. Raw HTML is not acceptable input — always pass Firecrawl markdown to the model.
Installation
# Install the Firecrawl SDK (TypeScript preferred)
bun add @mendable/firecrawl-js
# Or via npm if required
npm install @mendable/firecrawl-js
# Python fallback (only when TypeScript is not an option)
pip install firecrawl-py
Set your API key in the environment before any call:
export FIRECRAWL_API_KEY="fc-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
Obtain your key at https://firecrawl.dev. The free tier supports 500 credits/month. Each scrape costs 1 credit. Crawl jobs cost 1 credit per page.
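Under the pricing above (1 credit per scrape, 1 credit per crawled page, 500 free credits/month), a quick budget check before a large job can be sketched as follows; the helper names are illustrative, only the numbers come from this section:

```typescript
// Rough credit budget: 1 credit per single scrape, 1 per crawled page.
function estimateCredits(scrapes: number, crawlPages: number): number {
  return scrapes + crawlPages;
}

// monthlyBudget defaults to the free tier's 500 credits/month.
function fitsBudget(scrapes: number, crawlPages: number, monthlyBudget = 500): boolean {
  return estimateCredits(scrapes, crawlPages) <= monthlyBudget;
}
```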
Core API Patterns
Pattern 1: Single-Page Scrape
Use this when you have a specific URL and need its content.
import FirecrawlApp from "@mendable/firecrawl-js";
const app = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });
const result = await app.scrapeUrl("https://example.com/docs/intro", {
formats: ["markdown"], // Always request markdown, never html
onlyMainContent: true, // Strip nav, footer, sidebar boilerplate
waitFor: 2000, // ms to wait for JS-rendered content
timeout: 30000, // Hard timeout in ms
});
if (result.success) {
console.log(result.markdown); // LLM-ready content
console.log(result.metadata); // title, description, ogImage, etc.
}
Options reference:
- formats: ["markdown"] | ["html"] | ["rawHtml"] | ["screenshot"]
- onlyMainContent: boolean — removes headers, footers, nav elements
- includeTags: ["article", "main"] — include only these HTML tags
- excludeTags: ["nav", "footer", "aside"] — strip these HTML tags
- waitFor: milliseconds to wait for dynamic content
- headers: custom HTTP headers (for auth-gated pages)
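As a sketch of how these options compose for a JS-heavy, auth-gated page — the cookie value and wait time are illustrative assumptions, not Firecrawl defaults:

```typescript
// Illustrative scrape options for a page behind an auth cookie that
// renders content client-side. Field names mirror the reference above.
interface ScrapeOptions {
  formats: string[];
  onlyMainContent: boolean;
  excludeTags?: string[];
  waitFor?: number;
  headers?: Record<string, string>;
}

function authGatedOptions(cookie: string): ScrapeOptions {
  return {
    formats: ["markdown"],       // always markdown for LLM input
    onlyMainContent: true,       // strip nav/footer boilerplate
    excludeTags: ["nav", "footer", "aside"],
    waitFor: 3000,               // allow client-side rendering to finish
    headers: { Cookie: cookie }, // pass the session cookie through
  };
}
```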
Pattern 2: Site Crawl
Use this when you need content from multiple pages of a site (e.g., full documentation).
const crawlJob = await app.crawlUrl("https://docs.example.com", {
limit: 50, // Max pages to crawl
maxDepth: 3, // Link depth from start URL
scrapeOptions: {
formats: ["markdown"],
onlyMainContent: true,
},
allowBackwardLinks: false, // Stay within the subtree
allowExternalLinks: false, // Do not follow external links
});
// crawlUrl starts an async job — poll checkCrawlStatus until it reaches a terminal state
if (crawlJob.success) {
  let results = await app.checkCrawlStatus(crawlJob.id);
  while (results.status !== "completed" && results.status !== "failed") {
    await new Promise((r) => setTimeout(r, 2000)); // poll every 2s
    results = await app.checkCrawlStatus(crawlJob.id);
  }
  for (const page of results.data) {
    console.log(page.url, page.markdown);
  }
}
Crawl job lifecycle: pending -> running -> completed | failed
Always check results.status === "completed" before consuming data.
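A minimal decision helper for that lifecycle can be sketched as follows; any status that is not terminal is treated as still in flight:

```typescript
// Decide what to do with a crawl job status string.
// Terminal states stop polling; anything else means poll again.
type PollAction = "wait" | "consume" | "abort";

function nextAction(status: string): PollAction {
  switch (status) {
    case "completed":
      return "consume"; // safe to read results.data
    case "failed":
      return "abort";   // surface the error to the caller
    default:
      return "wait";    // pending/running: poll checkCrawlStatus again
  }
}
```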
Pattern 3: Search
Use this to find pages matching a query without knowing the URL in advance.
const searchResult = await app.search("Claude Code custom keybindings site:docs.anthropic.com", {
limit: 10, // Number of results
lang: "en",
country: "us",
scrapeOptions: {
formats: ["markdown"],
onlyMainContent: true,
},
});
for (const item of searchResult.data) {
console.log(item.url);
console.log(item.markdown); // Full page content, not just snippet
}
Search uses Firecrawl's own index. For real-time results include the current year in your query. Combine with the research skill to rank and synthesize results.
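The freshness tip above (include the current year in the query) can be automated with a tiny helper — purely illustrative:

```typescript
// Append the current year to a search query unless a 4-digit year
// is already present, to bias results toward recent pages.
function withCurrentYear(query: string, now: Date = new Date()): string {
  const year = String(now.getFullYear());
  return /\b(19|20)\d{2}\b/.test(query) ? query : `${query} ${year}`;
}
```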
Pattern 4: Site Map
Use this to discover all URLs on a site before deciding what to scrape.
const mapResult = await app.mapUrl("https://docs.example.com", {
search: "authentication", // Optional: filter URLs by keyword
limit: 200,
});
console.log(mapResult.links); // Array of discovered URLs
Map is cheap (1 credit per call regardless of site size). Always map before crawling large sites so you can filter to relevant sections.
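Mapping first also lets you filter client-side before spending crawl credits. A sketch of narrowing the mapped links to one subtree (the path-prefix filter is local logic, not a Firecrawl option):

```typescript
// Narrow a mapUrl result to a documentation subtree before crawling,
// so the crawl's `limit` budget is spent only on relevant pages.
function filterLinks(links: string[], pathPrefix: string, limit: number): string[] {
  return links
    .filter((url) => {
      try {
        return new URL(url).pathname.startsWith(pathPrefix);
      } catch {
        return false; // drop malformed URLs
      }
    })
    .slice(0, limit);
}
```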
Rate Limiting
Firecrawl enforces per-minute rate limits based on your plan:
| Plan | Requests/min | Concurrent |
|---|---|---|
| Free | 10 | 2 |
| Hobby | 60 | 5 |
| Pro | 300 | 20 |
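The per-minute limits translate directly into a minimum spacing between calls; a sketch using the table's numbers:

```typescript
// Minimum milliseconds between requests for a given requests/min limit.
function minDelayMs(requestsPerMinute: number): number {
  return Math.ceil(60_000 / requestsPerMinute);
}
// Free plan: 10 req/min -> 6000 ms between calls; Pro: 300 -> 200 ms.
```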
Implement backoff for 429 responses:
async function scrapeWithRetry(url: string, maxRetries = 3): Promise<string> {
for (let attempt = 0; attempt < maxRetries; attempt++) {
const result = await app.scrapeUrl(url, { formats: ["markdown"] });
if (result.success) return result.markdown ?? "";
if (result.error?.includes("429")) {
const delay = Math.pow(2, attempt) * 1000; // 1s, 2s, 4s
await new Promise(r => setTimeout(r, delay));
continue;
}
throw new Error(result.error);
}
throw new Error("Max retries exceeded");
}
Batch Operations
When scraping more than 5 URLs, use batch scrape instead of sequential calls:
const batchResult = await app.batchScrapeUrls(
[
"https://example.com/page-1",
"https://example.com/page-2",
"https://example.com/page-3",
],
{ formats: ["markdown"], onlyMainContent: true }
);
// Poll until the batch reaches a terminal state
let status = await app.checkBatchScrapeStatus(batchResult.id);
while (status.status !== "completed" && status.status !== "failed") {
  await new Promise((r) => setTimeout(r, 2000)); // poll every 2s
  status = await app.checkBatchScrapeStatus(batchResult.id);
}
for (const page of status.data) {
  console.log(page.url, page.markdown);
}
Batch scrape runs pages in parallel on Firecrawl's infrastructure, reducing wall-clock time significantly versus sequential scraping.
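The five-URL threshold above can be encoded as a small dispatch decision — a sketch, with the threshold parameterized:

```typescript
// Pick batch scraping once the URL count exceeds the rule-of-thumb
// threshold above; stay sequential for small sets.
function transport(urlCount: number, threshold = 5): "batch" | "sequential" {
  return urlCount > threshold ? "batch" : "sequential";
}
```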
LLM-Optimized Output Guidelines
Always request markdown, not HTML. Reasons:
- Markdown is 60-80% smaller than equivalent HTML
- Navigation, ads, and boilerplate are stripped
- Code blocks are preserved with language hints
- Tables are converted to markdown table syntax
- Links are preserved in [text](url) format
Pass the raw result.markdown string directly into your prompt. Do not post-process or summarize before passing to the model — let the model reason over the full content.
For very long pages (>50k tokens), chunk by heading sections:
function chunkByHeadings(markdown: string): string[] {
return markdown.split(/\n(?=#{1,3} )/);
}
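A rough token estimate (about 4 characters per token — an approximation, not a real tokenizer) decides whether chunking is needed at all:

```typescript
// Split markdown at h1-h3 headings, as in the snippet above.
function chunkByHeadings(markdown: string): string[] {
  return markdown.split(/\n(?=#{1,3} )/);
}

// Approximate token count (~4 chars/token) and chunk only when a page
// exceeds the 50k-token guideline above.
function chunkIfLong(markdown: string, maxTokens = 50_000): string[] {
  const approxTokens = Math.ceil(markdown.length / 4);
  return approxTokens > maxTokens ? chunkByHeadings(markdown) : [markdown];
}
```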
Integration with the Research Skill
When invoked alongside /research, Firecrawl serves as the data-collection layer:
- Research skill generates a list of target URLs and search queries
- Firecrawl scrapes and maps those URLs
- Research skill synthesizes the markdown into a structured report
Invocation pattern:
/research topic="Claude Code plugin architecture"
-> internally calls /firecrawl search "Claude Code plugin SKILL.md format"
-> scrapes top 5 results
-> synthesizes findings
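The three-step flow above can be sketched as a pure pipeline with the search, scrape, and synthesis steps injected; none of these function names are real SDK or skill calls:

```typescript
// Data-collection layer sketch: the research skill supplies the query,
// Firecrawl-backed functions gather markdown, and the research-side
// synthesizer turns it into a report. All three are injected stubs.
function researchPipeline(
  topic: string,
  searchUrls: (query: string) => string[],
  scrape: (url: string) => string,
  synthesize: (docs: string[]) => string,
  topN = 5
): string {
  const urls = searchUrls(topic).slice(0, topN); // top N search hits
  const docs = urls.map(scrape);                 // Firecrawl markdown per page
  return synthesize(docs);                       // structured report
}
```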
Output Format
When this skill completes, output:
FIRECRAWL RESULT
URL: <scraped url>
Pages: <count>
Total characters: <count>
Status: SUCCESS | PARTIAL | FAILED
--- CONTENT ---
<markdown content>
--- END CONTENT ---
For crawl jobs, list each page URL and character count before the combined content block.
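The output contract above can be rendered by a small formatter sketch:

```typescript
// Render the FIRECRAWL RESULT block in the shape specified above.
type ResultStatus = "SUCCESS" | "PARTIAL" | "FAILED";

function formatResult(url: string, pages: number, markdown: string, status: ResultStatus): string {
  return [
    "FIRECRAWL RESULT",
    `URL: ${url}`,
    `Pages: ${pages}`,
    `Total characters: ${markdown.length}`,
    `Status: ${status}`,
    "--- CONTENT ---",
    markdown,
    "--- END CONTENT ---",
  ].join("\n");
}
```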
Error Handling
| Error Code | Meaning | Action |
|---|---|---|
| 400 | Invalid URL or parameters | Validate URL format, check opts |
| 401 | Invalid API key | Check FIRECRAWL_API_KEY env var |
| 403 | Site blocks scraping | Try with custom headers |
| 404 | Page not found | Verify URL, try site map first |
| 429 | Rate limit exceeded | Exponential backoff |
| 500 | Firecrawl server error | Retry after 5 seconds |
| timeout | JS render took too long | Increase waitFor or use rawHtml |
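The HTTP rows of the table map onto a simple retry policy; a sketch (the timeout row needs option changes rather than a delay, so it is out of scope here):

```typescript
// Classify an HTTP status from the error table: retry transient
// failures, fail fast on caller errors (400/401/403/404).
function isRetryable(code: number): boolean {
  return code === 429 || code === 500;
}

function retryDelayMs(code: number, attempt: number): number {
  if (code === 429) return 2 ** attempt * 1000; // exponential backoff
  if (code === 500) return 5000;                // fixed 5 s, per the table
  return 0;                                     // not retryable
}
```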