# Install this skill:
npx skills add Anshin-Health-Solutions/superpai --skill "apify"

This command installs a single skill from a multi-skill repository.

# Description

Social media scraping and business data extraction via Apify actors.

# SKILL.md


---
name: apify
description: "Social media scraping and business data extraction via Apify actors."
triggers:
- Twitter scraping
- Instagram scraping
- LinkedIn scraping
- TikTok scraping
- YouTube scraping
- Google Maps scraping
- Amazon scraping
- Apify
---


Apify Skill

Data extraction from social media and business platforms using Apify's actor marketplace. Actors are pre-built scrapers that run on Apify infrastructure, returning structured datasets.

Requirements

  • Apify API Token: Set as environment variable APIFY_TOKEN or pass directly in API calls
  • Account Tier Awareness: Free tier provides $5/month compute; monitor usage at console.apify.com
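
A minimal sketch of the token requirement, assuming the token lives in `APIFY_TOKEN` as described. The helper name `get_apify_token` is hypothetical; failing early with a clear message avoids sending unauthenticated requests.

```python
import os


def get_apify_token(env=None):
    """Return the Apify API token from APIFY_TOKEN, or raise a clear error.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    token = env.get("APIFY_TOKEN", "").strip()
    if not token:
        raise RuntimeError("APIFY_TOKEN is not set; export it before calling the Apify API")
    return token
```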

Detailed Process

  1. Identify Target Platform -- Determine which platform the user wants to scrape (Twitter, LinkedIn, Google Maps, etc.).
  2. Select Actor -- Choose the correct actor ID from the table below based on platform and data type.
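  3. Configure Run Input -- Build the JSON input payload with search terms, URLs, result limits, and filters. Field names vary by actor, so check the actor's input schema.
  4. Execute Actor Run -- POST to the Apify API to start the actor run. In the URL path, write the actor ID with ~ in place of / (for example, apidojo~tweet-scraper).
  5. Poll for Completion -- Check run status until it reaches a terminal state: SUCCEEDED, FAILED, TIMED-OUT, or ABORTED.
  6. Download Dataset -- Fetch results from the dataset endpoint using the run's defaultDatasetId, handling pagination if needed.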
  7. Format Output -- Return structured JSON to the user.

Actor Reference Table

| Platform | Actor ID | Data Type | Cost Estimate |
|---|---|---|---|
| Twitter/X | apidojo/tweet-scraper | Tweets, profiles, followers | ~$0.50/1K tweets |
| LinkedIn | anchor/linkedin-people-search | People profiles, companies | ~$2.00/1K profiles |
| Google Maps | compass/crawler-google-places | Business listings, reviews | ~$1.00/1K places |
| Instagram | apify/instagram-scraper | Posts, profiles, hashtags | ~$0.80/1K posts |
| YouTube | bernardo/youtube-scraper | Videos, channels, comments | ~$0.30/1K videos |
| Amazon | junglee/amazon-crawler | Products, reviews, prices | ~$1.50/1K products |
| TikTok | clockworks/tiktok-scraper | Videos, profiles, hashtags | ~$0.60/1K videos |
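
The lookup in step 2 could be sketched as a small dictionary keyed by platform. The actor IDs and per-1K prices are copied from the table above and are estimates only; the function names `select_actor` and `estimate_cost_usd` are hypothetical.

```python
# Actor IDs and approximate USD cost per 1,000 items, copied from the table above.
ACTORS = {
    "twitter": ("apidojo/tweet-scraper", 0.50),
    "linkedin": ("anchor/linkedin-people-search", 2.00),
    "google maps": ("compass/crawler-google-places", 1.00),
    "instagram": ("apify/instagram-scraper", 0.80),
    "youtube": ("bernardo/youtube-scraper", 0.30),
    "amazon": ("junglee/amazon-crawler", 1.50),
    "tiktok": ("clockworks/tiktok-scraper", 0.60),
}


def _key(platform):
    """Normalize a platform name; treat "X" as an alias for Twitter."""
    key = platform.strip().lower()
    return "twitter" if key in ("x", "twitter/x") else key


def select_actor(platform):
    """Return the actor ID for a platform, or raise KeyError if unsupported."""
    return ACTORS[_key(platform)][0]


def estimate_cost_usd(platform, max_items):
    """Rough cost estimate for a run capped at max_items results."""
    per_thousand = ACTORS[_key(platform)][1]
    return round(per_thousand * max_items / 1000, 2)
```

Checking the estimate before a run pairs naturally with the advice to verify actual pricing on the actor's page, since the table values are approximations.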

API Invocation Pattern

Start an Actor Run

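# In the URL path, {ACTOR_ID} uses "~" instead of "/", e.g. apidojo~tweet-scraper.
# The input fields below are illustrative; each actor defines its own input schema.
curl -X POST "https://api.apify.com/v2/acts/{ACTOR_ID}/runs?token=${APIFY_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "searchTerms": ["query here"],
    "maxItems": 100,
    "proxy": { "useApifyProxy": true }
  }'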
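
The same request can be sketched in Python using only the standard library. Two details are worth noting: the API path expects the actor ID in `username~actor-name` form, so the `/` from the table is replaced with `~`, and the run object comes back under the response's `data` key. The function names are hypothetical, and the input payload is the illustrative one from the curl example.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id, run_input, token):
    """Build (but do not send) the POST request that starts an actor run."""
    # The URL path uses "username~actor-name", so swap the "/" for "~".
    path_id = urllib.parse.quote(actor_id.replace("/", "~"), safe="~")
    url = f"{API_BASE}/acts/{path_id}/runs?" + urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def start_run(actor_id, run_input, token):
    """Send the request and return the run object (includes id and defaultDatasetId)."""
    with urllib.request.urlopen(build_run_request(actor_id, run_input, token)) as resp:
        return json.load(resp)["data"]
```

Splitting request construction from sending keeps the URL and payload logic checkable without network access.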

Check Run Status

curl "https://api.apify.com/v2/actor-runs/{RUN_ID}?token=${APIFY_TOKEN}"
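
A polling loop around this endpoint might look like the sketch below. It assumes a caller-supplied `fetch_run` callable (hypothetical) that performs the GET above and returns the run object from the response's `data` field. Besides SUCCEEDED and FAILED, Apify runs can also end as TIMED-OUT or ABORTED, so the loop treats all four as terminal to avoid polling forever.

```python
import time

# Statuses after which a run will not change any further.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "TIMED-OUT", "ABORTED"}


def wait_for_run(fetch_run, run_id, interval_secs=5.0, max_wait_secs=3600.0, sleep=time.sleep):
    """Poll fetch_run(run_id) until the run reaches a terminal status.

    Returns the final run object, or raises TimeoutError once max_wait_secs
    of sleeping has elapsed without a terminal status.
    """
    waited = 0.0
    while True:
        run = fetch_run(run_id)
        if run["status"] in TERMINAL_STATUSES:
            return run
        if waited >= max_wait_secs:
            raise TimeoutError(f"run {run_id} still {run['status']} after {waited:.0f}s")
        sleep(interval_secs)
        waited += interval_secs
```

The `interval_secs` and `max_wait_secs` defaults are arbitrary starting points, not Apify recommendations.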

Download Dataset Results

curl "https://api.apify.com/v2/datasets/{DATASET_ID}/items?token=${APIFY_TOKEN}&format=json&limit=1000&offset=0"
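
Paging through a larger dataset can be sketched as follows. The dataset ID is the `defaultDatasetId` field of the run object. `fetch_page` is a hypothetical callable that requests the URL built by `dataset_items_url` and returns the decoded JSON array; the loop stops at the first page shorter than the page size.

```python
import urllib.parse

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id, token, offset=0, limit=1000):
    """Build the dataset items URL for one page of results."""
    query = urllib.parse.urlencode(
        {"token": token, "format": "json", "limit": limit, "offset": offset}
    )
    return f"{API_BASE}/datasets/{dataset_id}/items?{query}"


def iter_dataset_items(fetch_page, dataset_id, page_size=1000):
    """Yield every item, calling fetch_page(dataset_id, offset, limit) per page."""
    offset = 0
    while True:
        page = fetch_page(dataset_id, offset, page_size)
        yield from page
        if len(page) < page_size:
            break  # a short (or empty) page means there is nothing left
        offset += page_size
```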

Rate Limiting and Best Practices

  • Concurrency: Free tier allows 1 concurrent run; paid tiers allow more. Do not start multiple runs simultaneously on free tier.
  • Max Items: Always set maxItems to avoid runaway costs. Start with 100, increase only if needed.
  • Proxy Usage: Always set "useApifyProxy": true to avoid IP bans on target platforms.
  • Pagination: Dataset results return max 1000 items per request. Use offset parameter to paginate through larger datasets.
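  • Timeouts: Actor runs have a default 1-hour timeout. Set timeoutSecs for shorter runs; when starting a run through the REST API, the equivalent is the timeout query parameter, in seconds.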
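
The max-items and proxy rules above could be enforced in one place before any run is started. This is a sketch under stated assumptions: the field names `maxItems` and `proxy` come from this guide's example payload and may differ per actor, and `hard_cap` is a hypothetical safety limit, not an Apify setting.

```python
def apply_run_defaults(run_input, default_max_items=100, hard_cap=1000):
    """Return a copy of run_input with a bounded maxItems and Apify Proxy enabled."""
    safe = dict(run_input)  # shallow copy so the caller's dict is not modified

    # Default to a small result limit and never exceed the hard cap.
    requested = int(safe.get("maxItems", default_max_items))
    safe["maxItems"] = max(1, min(requested, hard_cap))

    # Keep any existing proxy settings but always enable Apify Proxy.
    proxy = dict(safe.get("proxy") or {})
    proxy["useApifyProxy"] = True
    safe["proxy"] = proxy
    return safe
```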

Output Handling

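Actor runs produce datasets. Each dataset item is a JSON object with platform-specific fields. The dataset items endpoint returns these objects as a plain JSON array, so a common pattern is to wrap them with paging metadata before returning them to the user: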

{
  "items": [
    {
      "id": "platform-specific-id",
      "text": "Content text",
      "author": "Username or profile",
      "date": "ISO 8601 timestamp",
      "metrics": { "likes": 42, "shares": 7, "comments": 3 },
      "url": "Direct link to content"
    }
  ],
  "total": 100,
  "offset": 0,
  "limit": 1000
}
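
One way to produce an envelope of this shape from an already downloaded list of items is sketched below. The function name `page_envelope` is hypothetical, and the items are passed through unchanged because their fields depend on the actor.

```python
def page_envelope(all_items, offset=0, limit=1000):
    """Wrap one page of items with total/offset/limit metadata."""
    items = list(all_items)
    return {
        "items": items[offset:offset + limit],
        "total": len(items),
        "offset": offset,
        "limit": limit,
    }
```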

Cost Awareness

  • Always check actor pricing before running (visible on actor page)
  • Set maxItems conservatively -- you can always run again for more
  • Monitor usage at https://console.apify.com/billing
  • Free tier resets monthly; paid compute units do not roll over

When to Use

  • User asks to "scrape Twitter", "get LinkedIn profiles", "find Google Maps businesses"
  • Large-scale data collection from social platforms (beyond what WebFetch can handle)
  • Structured data extraction with specific field requirements (metrics, dates, engagement)
  • Recurring data collection tasks that benefit from Apify's scheduling features
  • When direct API access to a platform is unavailable or rate-limited

# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents.
