AmanSikarwar

agent-skills-generator

# Install this skill:
npx skills add AmanSikarwar/agent-skills-generator

Or install specific skill: npx add-skill https://github.com/AmanSikarwar/agent-skills-generator

# Description

Transform any documentation website into AI-ready skill files. High-performance CLI tool built in Rust for crawling docs and generating structured SKILL.md files optimized for LLM agents.

# README.md

# Agent Skills Generator

**Transform any documentation website into AI-ready skill files**

[![CI](https://github.com/AmanSikarwar/agent-skills-generator/actions/workflows/ci.yml/badge.svg)](https://github.com/AmanSikarwar/agent-skills-generator/actions/workflows/ci.yml) [![Release](https://github.com/AmanSikarwar/agent-skills-generator/actions/workflows/release.yml/badge.svg)](https://github.com/AmanSikarwar/agent-skills-generator/releases) [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE) [![Rust](https://img.shields.io/badge/rust-1.75%2B-orange.svg)](https://www.rust-lang.org/)

[Installation](#installation) • [Quick Start](#quick-start) • [Configuration](#configuration) • [Examples](#examples) • [Contributing](#contributing)

What is Agent Skills Generator?

Agent Skills Generator is a high-performance CLI tool that crawls documentation websites and generates structured SKILL.md files optimized for AI agents. These skill files enable LLMs to access up-to-date documentation directly, enhancing their capabilities with domain-specific knowledge.

Key Features

  • Blazing Fast: Built in Rust for maximum performance with async crawling
  • Smart Content Extraction: Automatically removes navigation, ads, and noise
  • Token Optimized: Clean markdown output minimizes token usage
  • Configurable: Fine-grained control over crawling behavior with YAML config
  • Polite Crawling: Respects robots.txt and implements request delays
  • Cross-Platform: Works on Linux, macOS, and Windows

Installation

One-Line Install

Linux / macOS:

curl -fsSL https://raw.githubusercontent.com/AmanSikarwar/agent-skills-generator/master/install.sh | bash

Windows (PowerShell):

iwr -useb https://raw.githubusercontent.com/AmanSikarwar/agent-skills-generator/master/install.ps1 | iex

Using Cargo

If you have Rust installed:

cargo install --git https://github.com/AmanSikarwar/agent-skills-generator

From Source

git clone https://github.com/AmanSikarwar/agent-skills-generator
cd agent-skills-generator
cargo build --release

The binary will be at target/release/agent-skills-generator.

Download Binary

Download pre-built binaries from the Releases page.

| Platform | Architecture | Download |
|----------|--------------|----------|
| Linux | x86_64 | Download |
| Linux | ARM64 | Download |
| macOS | Intel | Download |
| macOS | Apple Silicon | Download |
| Windows | x86_64 | Download |

Quick Start

1. Initialize Configuration

agent-skills-generator init

This launches an interactive wizard that guides you through setting up:
- Target IDE/agent (Cursor, Claude Code, GitHub Copilot, etc.)
- Installation scope (project or user level)
- Crawl settings (delay, depth, concurrency)

The wizard creates a skills.yaml configuration file.

Tip: Use --no-interactive to skip prompts and create a default config:

agent-skills-generator init --no-interactive

2. Crawl a Website

agent-skills-generator crawl https://docs.example.com

3. Find Your Skills

Generated skills are saved to .agent/skills/ by default:

.agent/skills/
├── getting-started/
│   └── SKILL.md
├── api-reference/
│   └── SKILL.md
└── tutorials-basics/
    └── SKILL.md

Each SKILL.md contains:

---
name: getting-started
description: Learn how to get started with our platform
metadata:
  url: https://docs.example.com/getting-started
---

# Getting Started

[Full documentation content converted to clean markdown...]
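
To consume a generated file programmatically, the front matter can be separated from the markdown body. The following is a minimal Python sketch, assuming every file opens with a `---` delimited block shaped like the example above. `split_skill` is a hypothetical helper, not part of the tool, and it handles only flat `key: value` pairs with one level of nesting rather than full YAML.

```python
def split_skill(text: str) -> tuple[dict, str]:
    """Split SKILL.md text into (front matter dict, markdown body).

    Handles only flat `key: value` pairs plus one level of indented
    nesting, which is all the example above uses. Not a full YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no front matter block
    try:
        end = lines.index("---", 1)
    except ValueError:
        return {}, text  # unterminated block: treat everything as body

    meta: dict = {}
    parent = None
    for line in lines[1:end]:
        if not line.strip():
            continue
        key, _, value = line.strip().partition(":")
        value = value.strip()
        if line.startswith((" ", "\t")) and parent is not None:
            meta[parent][key] = value  # nested under the last parent key
        elif value:
            meta[key] = value          # plain top-level pair
        else:
            meta[key] = {}             # key that introduces nested children
            parent = key
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


example = """---
name: getting-started
description: Learn how to get started with our platform
metadata:
  url: https://docs.example.com/getting-started
---

# Getting Started
"""
meta, body = split_skill(example)
print(meta["name"], meta["metadata"]["url"])
```

For files with richer front matter, a real YAML library would be the safer choice; this sketch only avoids a dependency.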

Configuration

Create a skills.yaml file to customize crawling behavior:

# Output directory for generated skills
output: .agent/skills

# Crawl settings
delay_ms: 100           # Delay between requests
max_depth: 25           # Maximum crawl depth
request_timeout_secs: 30
respect_robots_txt: true
subdomains: false
concurrency: 4          # Parallel page processing

# URL filtering rules
rules:
  # Only crawl documentation pages
  - url: "*/docs/*"
    action: allow

  # Ignore authentication pages
  - url: "*/login*"
    action: ignore
  - url: "*/auth/*"
    action: ignore

  # Ignore API internals
  - url: "*/api/internal/*"
    action: ignore

# CSS selectors for elements to remove
remove_selectors:
  - ".advertisement"
  - "#cookie-banner"
  - ".feedback-widget"
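
To get a feel for how the URL rules above might behave, the sketch below evaluates them in Python. It rests on assumptions the README does not confirm: rules are checked in order, the first match wins, `*` matches any run of characters including `/`, and a URL that matches no rule is allowed. `decide` is a hypothetical helper, not the tool's own matcher.

```python
from fnmatch import fnmatchcase

# Rules copied from the sample skills.yaml above.
RULES = [
    ("*/docs/*", "allow"),
    ("*/login*", "ignore"),
    ("*/auth/*", "ignore"),
    ("*/api/internal/*", "ignore"),
]


def decide(url: str, rules=RULES, default: str = "allow") -> str:
    """Return the action of the first rule whose pattern matches `url`.

    Assumptions (not confirmed by the tool's docs): first match wins,
    `*` matches any run of characters including `/`, and URLs that match
    no rule fall back to `default`.
    """
    for pattern, action in rules:
        if fnmatchcase(url, pattern):
            return action
    return default


for url in (
    "https://example.com/docs/intro",
    "https://example.com/login?next=/",
    "https://example.com/api/internal/health",
    "https://example.com/blog/post",
):
    print(f"{decide(url):6} {url}")
```

Under a first-match-wins reading, a URL such as `https://example.com/docs/login` would be allowed by the first rule before the login rule is reached, so rule order matters. Running `validate --show` is the way to check how the tool itself interprets a config.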

Validate Configuration

agent-skills-generator validate --show

Commands

| Command | Description |
|---------|-------------|
| `crawl <url>` | Crawl a website and generate skill files |
| `single <url>` | Process a single URL |
| `clean` | Remove generated skill files |
| `validate` | Validate configuration file |
| `init` | Create configuration (interactive wizard) |
| `init --no-interactive` | Create default configuration |

Common Options

# Verbose output
agent-skills-generator -v crawl https://docs.example.com

# Very verbose (debug)
agent-skills-generator -vv crawl https://docs.example.com

# Custom config file
agent-skills-generator -c my-config.yaml crawl https://docs.example.com

# Custom output directory
agent-skills-generator -o ./my-skills crawl https://docs.example.com

# Limit pages crawled
agent-skills-generator crawl https://docs.example.com --max-pages 50

# Dry run (don't write files)
agent-skills-generator crawl https://docs.example.com --dry-run

Multi-IDE Target Support

Generate skills for specific AI coding assistants:

# Generate for Cursor (outputs to .cursor/skills/)
agent-skills-generator --target cursor crawl https://docs.example.com

# Generate for Claude Code (outputs to .claude/skills/)
agent-skills-generator --target claude-code crawl https://docs.example.com

# Generate for VS Code Copilot (outputs to .github/skills/)
agent-skills-generator --target github-copilot crawl https://docs.example.com

# Install at user level (~/.cursor/skills/)
agent-skills-generator --target cursor --user crawl https://docs.example.com

Supported Targets:

| Target | Project Directory | User Directory |
|--------|-------------------|----------------|
| github-copilot | .github/skills/ | ~/.copilot/skills/ |
| claude-code | .claude/skills/ | ~/.claude/skills/ |
| cursor | .cursor/skills/ | ~/.cursor/skills/ |
| antigravity | .gemini/skills/ | ~/.gemini/skills/ |
| openai-codex | .codex/skills/ | ~/.codex/skills/ |
| opencode | .opencode/skills/ | ~/.config/opencode/skills/ |
| custom | Uses output field | Uses output field |

You can also set the target in skills.yaml:

# Target IDE/agent
target: cursor

# Scope: "project" or "user"
scope: project
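
The target and scope settings amount to a lookup from (target, scope) to a directory. The Python sketch below mirrors the Supported Targets table for scripts that need to locate generated skills. `skills_dir` is a hypothetical helper, and the `.agent/skills` default for `custom` is an assumption taken from the sample config's `output` value.

```python
from pathlib import Path

# (project directory, user directory) per target, copied from the table above.
TARGET_DIRS = {
    "github-copilot": (".github/skills", "~/.copilot/skills"),
    "claude-code": (".claude/skills", "~/.claude/skills"),
    "cursor": (".cursor/skills", "~/.cursor/skills"),
    "antigravity": (".gemini/skills", "~/.gemini/skills"),
    "openai-codex": (".codex/skills", "~/.codex/skills"),
    "opencode": (".opencode/skills", "~/.config/opencode/skills"),
}


def skills_dir(target: str, scope: str = "project",
               output: str = ".agent/skills") -> Path:
    """Resolve the directory skills would be written to (hypothetical helper)."""
    if target == "custom":
        return Path(output)  # `custom` uses the `output` field for both scopes
    if scope not in ("project", "user"):
        raise ValueError(f"unknown scope: {scope!r}")
    project_dir, user_dir = TARGET_DIRS[target]
    # User-level paths start with `~`, so expand them to the home directory.
    return Path(user_dir).expanduser() if scope == "user" else Path(project_dir)


print(skills_dir("cursor"))
print(skills_dir("opencode", scope="user"))
```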

Examples

Crawl Flutter Documentation

agent-skills-generator crawl https://docs.flutter.dev/ui

Crawl Multiple URLs

agent-skills-generator crawl \
  https://docs.example.com/getting-started \
  https://docs.example.com/tutorials \
  https://docs.example.com/api

Process Single Page

agent-skills-generator single https://docs.example.com/quick-start --stdout

Resume Interrupted Crawl

agent-skills-generator crawl https://docs.example.com --resume

How It Works

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Web Crawler   │────▶│ Content Cleaner │────▶│ Skill Generator │
│                 │     │                 │     │                 │
│ • Async crawl   │     │ • Remove nav    │     │ • YAML metadata │
│ • Robots.txt    │     │ • Remove ads    │     │ • Clean markdown│
│ • URL filtering │     │ • Remove noise  │     │ • Organized dirs│
└─────────────────┘     └─────────────────┘     └─────────────────┘

  1. Crawl: Discovers and fetches pages, respecting robots.txt and rate limits
  2. Clean: Removes navigation, scripts, styles, ads, and other noise
  3. Convert: Transforms HTML to clean, token-efficient markdown
  4. Generate: Creates structured SKILL.md files with metadata
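
Steps 2 to 4 can be illustrated for a single, already-fetched page. The Python sketch below uses only the standard library and is not the tool's Rust implementation: the set of noise tags, the handful of supported elements, and the names `MarkdownExtractor` and `html_to_skill` are all assumptions made for illustration. Crawling (step 1) is left out so the example runs offline.

```python
from html.parser import HTMLParser

NOISE_TAGS = {"script", "style", "nav", "header", "footer", "aside"}
HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}
BLOCKS = set(HEADINGS) | {"p", "li"}


class MarkdownExtractor(HTMLParser):
    """Collect heading, paragraph, and list-item text outside noise tags."""

    def __init__(self) -> None:
        super().__init__()
        self.skip_depth = 0   # > 0 while inside a noise element
        self.prefix = None    # markdown prefix of the open block, if any
        self.buffer: list[str] = []
        self.blocks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in NOISE_TAGS:
            self.skip_depth += 1
        elif tag in BLOCKS and self.skip_depth == 0:
            self.prefix = HEADINGS.get(tag, "- " if tag == "li" else "")
            self.buffer = []

    def handle_endtag(self, tag):
        if tag in NOISE_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in BLOCKS and self.prefix is not None:
            # Collapse whitespace so the output stays compact.
            text = " ".join("".join(self.buffer).split())
            if text:
                self.blocks.append(self.prefix + text)
            self.prefix = None

    def handle_data(self, data):
        if self.skip_depth == 0 and self.prefix is not None:
            self.buffer.append(data)


def html_to_skill(html: str, name: str, description: str, url: str) -> str:
    """Clean, convert, and generate SKILL.md text for one page."""
    parser = MarkdownExtractor()
    parser.feed(html)
    parser.close()
    front_matter = (
        f"---\nname: {name}\ndescription: {description}\n"
        f"metadata:\n  url: {url}\n---\n"
    )
    return front_matter + "\n" + "\n\n".join(parser.blocks) + "\n"


page = """<html><body>
<nav><a href="/">Home</a></nav>
<h1>Getting Started</h1>
<p>Install the <b>CLI</b>, then run it.</p>
<script>track()</script>
<footer><p>Copyright</p></footer>
</body></html>"""
print(html_to_skill(page, "getting-started", "Intro page",
                    "https://docs.example.com/getting-started"))
```

The navigation link, the script, and the footer are dropped, while the heading and paragraph survive as markdown under a front matter block like the one shown in Quick Start.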

Use Cases

AI Agent Enhancement

Give your AI agents access to up-to-date documentation:

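# Load skills into your agent's context
# (`agent` stands for your own agent object; `add_context` is illustrative)
from pathlib import Path

skills_dir = Path(".agent/skills")
for skill_file in sorted(skills_dir.glob("*/SKILL.md")):
    agent.add_context(skill_file.read_text(encoding="utf-8"))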

Documentation Indexing

Create searchable documentation archives:

agent-skills-generator crawl https://docs.company.com -o ./docs-archive

Knowledge Base Generation

Build knowledge bases for RAG systems:

agent-skills-generator crawl https://wiki.example.com --max-depth 10

Performance

Agent Skills Generator is built for speed:

| Metric | Value |
|--------|-------|
| Concurrent requests | Configurable (default: 4) |
| Memory usage | ~50MB typical |
| Pages/second | 10-50 (network dependent) |

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Development

# Run tests
cargo test

# Run with debug logging
RUST_LOG=debug cargo run -- crawl https://example.com

# Check formatting
cargo fmt --check

# Run clippy
cargo clippy --all-targets --all-features -- -D warnings

License

This project is licensed under the MIT License - see the LICENSE file for details.


**Built with Rust**

[Report Bug](https://github.com/AmanSikarwar/agent-skills-generator/issues) • [Request Feature](https://github.com/AmanSikarwar/agent-skills-generator/issues)
