ilyasfit

smart-fetch

# Install this skill:
npx skills add ilyasfit/smart-fetch

Or install specific skill: npx add-skill https://github.com/ilyasfit/smart-fetch


# SKILL.md


---
name: smart-fetch
description: >
  Fetch URLs and extract only relevant content via intent.
  Triggers: URL OR URL + intent, smart-fetch, what does this page say about.
---


smart-fetch

Fetch URLs and extract only the relevant information. Uses playbooks (free) with Firecrawl fallback, then distills via Gemini 2.5 Flash.

When to Use

  • User provides URL + specific question about its content
  • "smart-fetch this page"
  • "get the relevant parts from..."
  • "what does this documentation say about..."
  • Any URL where you only need specific information, not the full page

When NOT to Use

  • User wants full page content (use WebFetch or firecrawl_scrape)
  • Need structured data extraction (use firecrawl_extract)

Installation

Prerequisites

Install globally

bun install -g github:celebi/agent-web-suite/smart-fetch

Or clone and install

git clone https://github.com/celebi/agent-web-suite.git
cd agent-web-suite/smart-fetch
bun install

Environment Variables

Set these in your shell profile (~/.zshrc or ~/.bashrc):

# Required - for content distillation
export GEMINI_API_KEY="your-gemini-api-key"

# Optional - fallback for JS-heavy pages
export FIRECRAWL_API_KEY="your-firecrawl-api-key"

After adding, restart your terminal or run source ~/.zshrc.

Cursor users: If you launched Cursor from Dock/Spotlight, restart Cursor to pick up new env vars.
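
A missing key otherwise only shows up when a fetch fails, so a small preflight check can catch it earlier. The following is a minimal sketch: `check_smart_fetch_env` is a hypothetical helper name, not something smart-fetch ships, and the message wording is illustrative.

```shell
# Hypothetical preflight helper (not part of smart-fetch).
# Fails when the required key is missing; only warns about the optional one.
check_smart_fetch_env() {
  if [ -z "${GEMINI_API_KEY:-}" ]; then
    echo "GEMINI_API_KEY not set (required for distillation)" >&2
    return 1
  fi
  if [ -z "${FIRECRAWL_API_KEY:-}" ]; then
    echo "FIRECRAWL_API_KEY not set (fallback for JS-heavy pages unavailable)" >&2
  fi
  return 0
}
```

It could be chained in front of a call, for example `check_smart_fetch_env && smart-fetch "<url>" "<intent>"`.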

Usage

smart-fetch "<url>" "<intent>"

Parameters:

  • url: The webpage to fetch
  • intent: What information you need (be specific for best results)

Output: Crystallized markdown to stdout. Status messages go to stderr.
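
Because the distilled markdown and the status messages use different streams, they can be redirected independently. A minimal sketch, assuming `smart-fetch` is on the PATH: `fetch_to_files` and the `SMART_FETCH_CMD` variable are hypothetical names introduced here, and the variable exists only so a stub can stand in for the real CLI.

```shell
# Hypothetical wrapper: keeps distilled markdown and status messages apart.
# SMART_FETCH_CMD is an assumption of this sketch, not a documented variable;
# it defaults to the real CLI and only allows substituting a stub.
fetch_to_files() {
  url=$1
  intent=$2
  out_file=$3   # receives stdout (distilled markdown)
  log_file=$4   # receives stderr (status messages)
  "${SMART_FETCH_CMD:-smart-fetch}" "$url" "$intent" >"$out_file" 2>"$log_file"
}
```

For instance, `fetch_to_files "<url>" "<intent>" distilled.md status.log` would leave only the markdown in `distilled.md`.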

Benchmarking:

smart-fetch --metrics "<url>" "<intent>"

Returns JSON with timing and size metrics.

Examples

# Extract auth setup from API docs
smart-fetch "https://docs.stripe.com/api/authentication" \
  "how to authenticate API requests with examples"

# Get specific function documentation
smart-fetch "https://lodash.com/docs/4.17.15" \
  "debounce function parameters and usage example"

# Extract installation steps
smart-fetch "https://bun.sh/docs/installation" \
  "installation commands for macOS"

Intent Tips

Be specific with intent for best results:

  • Bad: "API docs"
  • Good: "authentication methods and code examples for API key setup"
  • Bad: "how to use"
  • Good: "installation steps and configuration options for macOS"

Follow-Up Fetching

When the distilled response contains links to related pages (e.g., "[see X for details]"),
you can call smart-fetch again on those URLs if deeper context is needed.

Pattern:

  1. First call: Get overview for user's question
  2. Review response — are there linked URLs pointing to required details?
  3. If yes → call smart-fetch on the specific link(s)
  4. If no → respond to user

This is expected behavior. Do not hesitate to chain calls when the user's question
requires information from linked pages.
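
Step 2 of the pattern, finding linked URLs in a response, can be approximated with standard tools. This is a sketch only: `extract_links` is a hypothetical helper, it assumes the distilled output uses ordinary markdown link syntax, and it does not handle URLs that contain a closing parenthesis.

```shell
# Prints each distinct http(s) URL used as a markdown link target
# ([text](url)) in the text read from stdin, one per line, sorted.
extract_links() {
  grep -oE '\]\(https?://[^)[:space:]]+\)' \
    | sed -E 's/^\]\((.*)\)$/\1/' \
    | sort -u
}
```

Piping a first call through it, as in `smart-fetch "<url>" "<intent>" | extract_links`, would list candidate pages for a follow-up fetch; deciding which of them actually hold the required details is still a judgment call.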

Cost

  • Playbooks fetch: FREE
  • Firecrawl fallback: ~$0.01-0.02 (only when needed)
  • Gemini 2.5 Flash: ~$0.003 per 10KB page

Typical: ~$3/month at 500 fetches.
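
The per-fetch figures above can be combined into a rough monthly estimate. The sketch below is not the author's calculation: `estimate_monthly_cost` is a hypothetical helper, the fallback price uses $0.015 as the midpoint of the quoted range, and the share of fetches needing the fallback is an input you supply.

```shell
# Rough monthly estimate from the per-fetch figures listed above.
# Arguments: fetch count, average page size in KB, and the share of
# fetches (0-1) that need the Firecrawl fallback.
estimate_monthly_cost() {
  awk -v n="$1" -v kb="$2" -v share="$3" 'BEGIN {
    gemini = n * (kb / 10) * 0.003    # about 0.003 USD per 10KB page
    firecrawl = n * share * 0.015     # assumed midpoint of the 0.01-0.02 range
    printf "%.2f\n", gemini + firecrawl
  }'
}
```

With illustrative inputs, `estimate_monthly_cost 500 10 0.2` (500 fetches of 10KB pages, one in five hitting the fallback) prints `3.00`; the 20% share is an example value, not a measured rate.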

Troubleshooting

| Error | Cause | Fix |
| --- | --- | --- |
| GEMINI_API_KEY not set | Env var missing | Add to ~/.zshrc, restart terminal/Cursor |
| FIRECRAWL_API_KEY not set - cannot use fallback | JS-heavy page, no fallback configured | Add Firecrawl key or use different URL |
| Sparse content warning | Page requires JS rendering | Firecrawl fallback will handle it |

# README.md

smart-fetch

Fetch URLs and extract only what matters. For AI agents.

57KB page → 10KB. 81% reduction. $0.01 vs $0.07.

Why

AI agents waste tokens on irrelevant web content. The OpenAI chat completions docs are 57KB of markdown—but you only need the messages parameter format.

smart-fetch distills pages to just the content matching your intent:

  • 77% average context reduction
  • 50-84% cheaper than feeding raw markdown to Claude Opus
  • Free fetching via playbooks, Firecrawl fallback for JS-heavy sites
  • Preserves code verbatim, with no hallucinated examples

Quick Start

# Install
bun install -g github:celebi/agent-web-suite/smart-fetch

# Use
smart-fetch "https://docs.stripe.com/api/authentication" "how to authenticate with API keys"

Benchmarks

Real results from production documentation sites:

| Page | Raw | Distilled | Reduction | Time |
| --- | --- | --- | --- | --- |
| OpenAI chat completions | 56.9KB | 10.7KB | 81% | 11.2s |
| React useState | 30.8KB | 5.5KB | 82% | 7.1s |
| GitHub REST auth | 12.5KB | 4.1KB | 67% | 14.3s |
| Stripe API auth | 10.4KB | 1.4KB | 86% | 5.6s |
| Bun installation | 4.3KB | 263B | 94% | 5.2s |

Average: 77% reduction in 7.4s

See BENCHMARK.md for full analysis including cost breakdowns.

Cost

Per-fetch breakdown

| Component | Cost |
| --- | --- |
| playbooks fetch | Free |
| Firecrawl fallback (JS-heavy pages) | ~$0.015 |
| Gemini 2.5 Flash distillation | ~$0.003 per 10KB |

500 pages/month comparison

Based on benchmark averages: 17.3KB raw → 3.5KB distilled (77% reduction)

Without smart-fetch (full markdown → Opus):

500 pages × 4,325 tokens avg = 2.16M input tokens
2.16M × $15/1M (Opus) = $32.44

With smart-fetch (distilled → Opus):

Gemini input  (reads full):     2.16M × $0.30/1M  = $0.65
Gemini output (distilled):      437K  × $2.50/1M  = $1.09
Opus input    (reads distilled): 437K  × $15/1M   = $6.56
                                          Total   = $8.30

| Approach | Monthly cost | Tokens to Opus |
| --- | --- | --- |
| Full markdown → Opus | $32.44 | 2.16M |
| smart-fetch | $8.30 | 437K |
| Savings | $24.14 (74%) | 1.72M tokens |

The context-window savings compound: the 1.72M tokens saved would otherwise be billed again on every subsequent turn of the conversation.
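
To rerun the comparison with different volumes or page sizes, the same arithmetic can be scripted. A minimal sketch: `compare_costs` is a hypothetical helper, the prices are the per-1M-token figures quoted above, and 874 tokens per distilled page is simply 437K divided by 500.

```shell
# Recomputes the monthly comparison from per-page token counts.
# Arguments: pages per month, raw tokens per page, distilled tokens per page.
# Prices per 1M tokens: Opus input 15, Gemini input 0.30, Gemini output 2.50.
compare_costs() {
  awk -v pages="$1" -v raw="$2" -v distilled="$3" 'BEGIN {
    raw_m = pages * raw / 1000000          # millions of raw tokens
    dist_m = pages * distilled / 1000000   # millions of distilled tokens
    without = raw_m * 15
    with_sf = raw_m * 0.30 + dist_m * 2.50 + dist_m * 15
    saved = without - with_sf
    pct = saved / without * 100
    printf "without=%.2f with=%.2f savings=%.2f (%.0f%%)\n", without, with_sf, saved, pct
  }'
}
```

Running `compare_costs 500 4325 874` prints `without=32.44 with=8.30 savings=24.14 (74%)`, matching the table.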

How It Works

URL + Intent
     ↓
playbooks fetch (free)
     ↓
[sparse content?] → Firecrawl fallback ($0.015)
     ↓
Gemini 2.5 Flash distillation (~$0.003/10KB)
     ↓
Crystallized markdown (stdout)
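
The control flow in the diagram can be sketched with each stage passed in as a command name. This is an illustration under stated assumptions, not the tool's implementation: `run_pipeline`, `is_sparse`, and `SPARSE_MIN_CHARS` are hypothetical names, and the 200-character threshold is a placeholder because the real sparse-content rule is not described here.

```shell
# Placeholder sparse check: treats content shorter than a threshold as sparse.
is_sparse() {
  [ "${#1}" -lt "${SPARSE_MIN_CHARS:-200}" ]
}

# Arguments: url, intent, then the commands for the free fetch,
# the paid fallback fetch, and the distillation stage.
run_pipeline() {
  url=$1; intent=$2
  fetch_cmd=$3; fallback_cmd=$4; distill_cmd=$5
  content=$("$fetch_cmd" "$url")
  if is_sparse "$content"; then
    echo "sparse content, using fallback" >&2   # status goes to stderr
    content=$("$fallback_cmd" "$url")
  fi
  # The distill stage reads the page on stdin and gets the intent as its argument.
  printf '%s\n' "$content" | "$distill_cmd" "$intent"
}
```

Passing the stages as arguments keeps the paid fallback on the rare path and makes each stage easy to replace with a stub when experimenting.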

Installation

Prerequisites

Install

bun install -g github:celebi/agent-web-suite/smart-fetch

Or clone:

git clone https://github.com/celebi/agent-web-suite.git
cd agent-web-suite/smart-fetch
bun install

Environment Variables

# Required - for content distillation
export GEMINI_API_KEY="your-gemini-api-key"

# Optional - fallback for JS-heavy pages
export FIRECRAWL_API_KEY="your-firecrawl-api-key"

Usage

smart-fetch "<url>" "<intent>"

Output goes to stdout. Status messages go to stderr.

Examples

# Extract auth setup from API docs
smart-fetch "https://docs.stripe.com/api/authentication" \
  "how to authenticate API requests with examples"

# Get specific function documentation
smart-fetch "https://react.dev/reference/react/useState" \
  "useState parameters, return value, and usage example"

# Extract installation steps
smart-fetch "https://bun.sh/docs/installation" \
  "installation commands for macOS"

Intent Tips

Specificity drives reduction quality:

| Bad | Good |
| --- | --- |
| "API docs" | "authentication methods and code examples for API key setup" |
| "how to use" | "installation steps and configuration options for macOS" |

Benchmarking

smart-fetch --metrics "<url>" "<intent>"

Returns JSON with timing and size metrics, plus a noRelevantContent flag.
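
A script can use the flag to decide whether a retry with a different intent is worthwhile. The sketch below assumes `noRelevantContent` appears as a JSON boolean; the surrounding JSON layout is not documented here, so it uses a plain text match instead of a JSON parser, and `no_relevant_content` is a hypothetical helper name.

```shell
# Succeeds when the metrics JSON on stdin reports noRelevantContent as true.
# Text match only: assumes the flag is serialized as a JSON boolean.
no_relevant_content() {
  grep -Eq '"noRelevantContent"[[:space:]]*:[[:space:]]*true'
}
```

For example, `smart-fetch --metrics "<url>" "<intent>" | no_relevant_content && echo "try a more specific intent" >&2`.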

Agent Integration

For Cursor or Claude Desktop, see SKILL.md for agent skill configuration that auto-triggers on URL + intent patterns.

Troubleshooting

| Error | Cause | Fix |
| --- | --- | --- |
| GEMINI_API_KEY not set | Missing env var | Add to ~/.zshrc, restart terminal |
| FIRECRAWL_API_KEY not set - cannot use fallback | JS-heavy page, no fallback | Add Firecrawl key or try different URL |
| Sparse content warning | Page requires JS rendering | Firecrawl fallback handles automatically |

License

MIT
