ilyasfit

smart-fetch

0
0
# Install this skill:
npx skills add ilyasfit/smart-fetch

Or install specific skill: npx add-skill https://github.com/ilyasfit/smart-fetch

# Description

>

# SKILL.md


name: smart-fetch
description: >
Fetch URLs and extract only relevant content via intent.
Triggers: URL OR URL + intent, smart-fetch, what does this page say about.


smart-fetch

Fetch URLs and extract only the relevant information. Uses playbooks (free) with Firecrawl fallback, then distills via Gemini 2.5 Flash.

When to Use

  • User provides URL + specific question about its content
  • "smart-fetch this page"
  • "get the relevant parts from..."
  • "what does this documentation say about..."
  • Any URL where you only need specific information, not the full page

When NOT to Use

  • User wants full page content (use WebFetch or firecrawl_scrape)
  • Need structured data extraction (use firecrawl_extract)

Installation

Prerequisites

Install globally

bun install -g github:celebi/agent-web-suite/smart-fetch

Or clone and install

git clone https://github.com/celebi/agent-web-suite.git
cd agent-web-suite/smart-fetch
bun install

Environment Variables

Set these in your shell profile (~/.zshrc or ~/.bashrc):

# Required - for content distillation
export GEMINI_API_KEY="your-gemini-api-key"

# Optional - fallback for JS-heavy pages
export FIRECRAWL_API_KEY="your-firecrawl-api-key"

After adding, restart your terminal or run source ~/.zshrc.

Cursor users: If you launched Cursor from Dock/Spotlight, restart Cursor to pick up new env vars.

Usage

smart-fetch "<url>" "<intent>"

Parameters:

  • [ ] url: The webpage to fetch
  • [ ] intent: What information you need (be specific for best results)

Output: Crystallized markdown to stdout. Status messages go to stderr.

Benchmarking:

smart-fetch --metrics "<url>" "<intent>"

Returns JSON with timing and size metrics.

Examples

# Extract auth setup from API docs
smart-fetch "https://docs.stripe.com/api/authentication" \
  "how to authenticate API requests with examples"

# Get specific function documentation
smart-fetch "https://lodash.com/docs/4.17.15" \
  "debounce function parameters and usage example"

# Extract installation steps
smart-fetch "https://bun.sh/docs/installation" \
  "installation commands for macOS"

Intent Tips

Be specific with intent for best results:

  • Bad: "API docs"
  • Good: "authentication methods and code examples for API key setup"
  • Bad: "how to use"
  • Good: "installation steps and configuration options for macOS"

Follow-Up Fetching

When the distilled response contains links to related pages (e.g., "[see X for details]"),
you can call smart-fetch again on those URLs if deeper context is needed.

Pattern:

  1. First call: Get overview for user's question
  2. Review response β€” are there linked URLs pointing to required details?
  3. If yes β†’ call smart-fetch on the specific link(s)
  4. If no β†’ respond to user

This is expected behavior. Do not hesitate to chain calls when the user's question
requires information from linked pages.

Cost

  • Playbooks fetch: FREE
  • Firecrawl fallback: ~$0.01-0.02 (only when needed)
  • Gemini 2.5 Flash: ~$0.003 per 10KB page

Typical: ~$3/month at 500 fetches.

Troubleshooting

Error Cause Fix
GEMINI_API_KEY not set Env var missing Add to ~/.zshrc, restart terminal/Cursor
FIRECRAWL_API_KEY not set - cannot use fallback JS-heavy page, no fallback configured Add Firecrawl key or use different URL
Sparse content warning Page requires JS rendering Firecrawl fallback will handle it

# README.md

smart-fetch

Fetch URLs and extract only what matters. For AI agents.

57KB page β†’ 10KB. 81% reduction. $0.01 vs $0.07.

Why

AI agents waste tokens on irrelevant web content. The OpenAI chat completions docs are 57KB of markdownβ€”but you only need the messages parameter format.

smart-fetch distills pages to just the content matching your intent:

  • 77% average context reduction
  • 50-84% cheaper than feeding raw markdown to Claude Opus
  • Free fetching via playbooks, Firecrawl fallback for JS-heavy sites
  • Preserves code verbatim xno hallucinated examples

Quick Start

# Install
bun install -g github:celebi/agent-web-suite/smart-fetch

# Use
smart-fetch "https://docs.stripe.com/api/authentication" "how to authenticate with API keys"

Benchmarks

Real results from production documentation sites:

Page Raw Distilled Reduction Time
OpenAI chat completions 56.9KB 10.7KB 81% 11.2s
React useState 30.8KB 5.5KB 82% 7.1s
GitHub REST auth 12.5KB 4.1KB 67% 14.3s
Stripe API auth 10.4KB 1.4KB 86% 5.6s
Bun installation 4.3KB 263B 94% 5.2s

Average: 77% reduction in 7.4s

See BENCHMARK.md for full analysis including cost breakdowns.

Cost

Per-fetch breakdown

Component Cost
playbooks fetch Free
Firecrawl fallback (JS-heavy pages) ~$0.015
Gemini 2.5 Flash distillation ~$0.003 per 10KB

500 pages/month comparison

Based on benchmark averages: 17.3KB raw β†’ 3.5KB distilled (77% reduction)

Without smart-fetch (full markdown β†’ Opus):

500 pages Γ— 4,325 tokens avg = 2.16M input tokens
2.16M Γ— $15/1M (Opus) = $32.44

With smart-fetch (distilled β†’ Opus):

Gemini input  (reads full):     2.16M Γ— $0.30/1M  = $0.65
Gemini output (distilled):      437K  Γ— $2.50/1M  = $1.09
Opus input    (reads distilled): 437K  Γ— $15/1M   = $6.56
                                          Total   = $8.30
Approach Monthly cost Tokens to Opus
Full markdown β†’ Opus $32.44 2.16M
smart-fetch $8.30 437K
Savings $24.14 (74%) 1.72M tokens

The context window savings compound: those 1.72M tokens saved are tokens you're not paying for on every subsequent turn in your conversation.

How It Works

URL + Intent
     ↓
playbooks fetch (free)
     ↓
[sparse content?] β†’ Firecrawl fallback ($0.015)
     ↓
Gemini 2.5 Flash distillation (~$0.003/10KB)
     ↓
Crystallized markdown (stdout)

Installation

Prerequisites

Install

bun install -g github:celebi/agent-web-suite/smart-fetch

Or clone:

git clone https://github.com/celebi/agent-web-suite.git
cd agent-web-suite/smart-fetch
bun install

Environment Variables

# Required - for content distillation
export GEMINI_API_KEY="your-gemini-api-key"

# Optional - fallback for JS-heavy pages
export FIRECRAWL_API_KEY="your-firecrawl-api-key"

Usage

smart-fetch "<url>" "<intent>"

Output goes to stdout. Status messages go to stderr.

Examples

# Extract auth setup from API docs
smart-fetch "https://docs.stripe.com/api/authentication" \
  "how to authenticate API requests with examples"

# Get specific function documentation
smart-fetch "https://react.dev/reference/react/useState" \
  "useState parameters, return value, and usage example"

# Extract installation steps
smart-fetch "https://bun.sh/docs/installation" \
  "installation commands for macOS"

Intent Tips

Specificity drives reduction quality:

Bad Good
"API docs" "authentication methods and code examples for API key setup"
"how to use" "installation steps and configuration options for macOS"

Benchmarking

smart-fetch --metrics "<url>" "<intent>"

Returns JSON with timing, size metrics, and noRelevantContent flag.

Agent Integration

For Cursor or Claude Desktop, see SKILL.md for agent skill configuration that auto-triggers on URL + intent patterns.

Troubleshooting

Error Cause Fix
GEMINI_API_KEY not set Missing env var Add to ~/.zshrc, restart terminal
FIRECRAWL_API_KEY not set - cannot use fallback JS-heavy page, no fallback Add Firecrawl key or try different URL
Sparse content warning Page requires JS rendering Firecrawl fallback handles automatically

License

MIT

# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents:

Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.