Install a specific skill from a multi-skill repository:

```bash
npx skills add 404kidwiz/agent-skills-backup --skill "daily-news-report"
```
# Description
Scrapes content based on a preset URL list, filters high-quality technical information, and generates daily Markdown reports.
# SKILL.md
```yaml
name: daily-news-report
description: Scrapes content based on a preset URL list, filters high-quality technical information, and generates daily Markdown reports.
argument-hint: [optional: date]
disable-model-invocation: false
user-invocable: true
allowed-tools: Task, WebFetch, Read, Write, Bash(mkdir), Bash(date), Bash(ls), mcp__chrome-devtools__
```
# Daily News Report v3.0

**Architecture Upgrade**: Main Agent Orchestration + SubAgent Execution + Browser Scraping + Smart Caching

## Core Architecture
```text
┌─ Main Agent (Orchestrator) ─────────────────────────────────────────
│  Role: Scheduling, Monitoring, Evaluation, Decision, Aggregation
│
│   1. Init           2. Dispatch       3. Monitor        4. Evaluate
│   Read Config       Assign Tasks      Collect Res       Filter/Sort
│      ↓                 ↓                 ↓                 ↓
│   5. Decision       Enough 20?        6. Generate       7. Update
│   Cont/Stop         Y/N               Report File       Cache Stats
└─────────────────────────────────────────────────────────────────────
           ↓ Dispatch                           ↑ Return Results
┌─ SubAgent Execution Layer ──────────────────────────────────────────
│
│   Worker A          Worker B          Browser
│   (WebFetch)        (WebFetch)        (Headless)
│   Tier1 Batch       Tier2 Batch       JS Render
│        └─────────────────┴─────────────────┘
│                          ↓
│   Structured Result Return:
│   { status, data: [...], errors: [...], metadata: {...} }
└─────────────────────────────────────────────────────────────────────
```
## Configuration Files
This skill uses the following configuration files:
| File | Purpose |
|---|---|
| `sources.json` | Source configuration, priorities, scrape methods |
| `cache.json` | Cached data, historical stats, deduplication fingerprints |
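Neither file's schema is fixed elsewhere in this skill, so the shape below is only an illustrative assumption (field names such as `tier`, `method`, and `enabled` are invented here, chosen to be consistent with how Phases 2, 4, and 8 use the data):

```json
{
  "sources": [
    { "id": "hn", "name": "Hacker News", "url": "https://news.ycombinator.com", "tier": 1, "method": "webfetch", "enabled": true },
    { "id": "hf_papers", "name": "HuggingFace Papers", "url": "https://huggingface.co/papers", "tier": 1, "method": "webfetch", "enabled": true },
    { "id": "producthunt", "name": "Product Hunt", "url": "https://www.producthunt.com", "tier": 3, "method": "browser", "enabled": true }
  ]
}
```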
## Execution Process Details

### Phase 1: Initialization
Steps:
1. Determine date (user argument or current date)
2. Read sources.json for source configurations
3. Read cache.json for historical data
4. Create output directory NewsReport/
5. Check if a partial report exists for today (append mode)
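A minimal sketch of what these steps amount to, written as Python for readability; the function and key names are illustrative, not part of the skill definition:

```python
import json
from datetime import date
from pathlib import Path

def initialize(user_date: str | None = None) -> dict:
    report_date = user_date or date.today().isoformat()       # 1. determine date
    sources = json.loads(Path("sources.json").read_text())    # 2. source configurations
    cache = json.loads(Path("cache.json").read_text())        # 3. historical data
    out_dir = Path("NewsReport")
    out_dir.mkdir(exist_ok=True)                               # 4. output directory
    report_path = out_dir / f"{report_date}-news-report.md"
    append_mode = report_path.exists()                         # 5. partial report for today?
    return {
        "date": report_date,
        "sources": sources,
        "cache": cache,
        "report_path": report_path,
        "append_mode": append_mode,
    }
```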
### Phase 2: Dispatch SubAgents

**Strategy**: Parallel dispatch, batch execution, early-stopping mechanism
```text
Wave 1 (Parallel):
  - Worker A: Tier1 Batch A (HN, HuggingFace Papers)
  - Worker B: Tier1 Batch B (OneUsefulThing, Paul Graham)

Wait for results → Evaluate count

If < 15 high-quality items:
  Wave 2 (Parallel):
    - Worker C: Tier2 Batch A (James Clear, FS Blog)
    - Worker D: Tier2 Batch B (HackerNoon, Scott Young)

If still < 20 items:
  Wave 3 (Browser):
    - Browser Worker: ProductHunt, Latent Space (require JS rendering)
```
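The wave logic as a sketch; `dispatch_wave` stands in for the actual parallel Task calls, and the `quality_score >= 4` cutoff for "high-quality" is an assumption:

```python
WAVES = [
    ["Tier1 Batch A (HN, HuggingFace Papers)", "Tier1 Batch B (OneUsefulThing, Paul Graham)"],
    ["Tier2 Batch A (James Clear, FS Blog)", "Tier2 Batch B (HackerNoon, Scott Young)"],
    ["Browser Worker (ProductHunt, Latent Space)"],
]

def high_quality(items: list[dict]) -> int:
    # Assumed definition of "high-quality": quality_score of 4 or above.
    return sum(1 for item in items if item["quality_score"] >= 4)

def collect(dispatch_wave) -> list[dict]:
    items = dispatch_wave(WAVES[0])          # Wave 1: Tier1 batches run in parallel
    if high_quality(items) < 15:
        items += dispatch_wave(WAVES[1])     # Wave 2: supplement with Tier2 sources
    if len(items) < 20:
        items += dispatch_wave(WAVES[2])     # Wave 3: browser-rendered sources
    return items
```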
### Phase 3: SubAgent Task Format
Task format received by each SubAgent:
```yaml
task: fetch_and_extract
sources:
  - id: hn
    url: https://news.ycombinator.com
    extract: top_10
  - id: hf_papers
    url: https://huggingface.co/papers
    extract: top_voted
output_schema:
  items:
    - source_id: string      # Source identifier
      title: string          # Title
      summary: string        # 2-4 sentence summary
      key_points: string[]   # Max 3 key points
      url: string            # Original URL
      keywords: string[]     # Keywords
      quality_score: 1-5     # Quality score
constraints:
  filter: "Cutting-edge Tech/Deep Tech/Productivity/Practical Info"
  exclude: "General Science/Marketing Puff/Overly Academic/Job Posts"
  max_items_per_source: 10
  skip_on_error: true
return_format: JSON
```
### Phase 4: Main Agent Monitoring & Feedback
Main Agent Responsibilities:
Monitoring:
- Check SubAgent return status (success/partial/failed)
- Count collected items
- Record success rate per source
Feedback Loop:
- If a SubAgent fails, decide whether to retry or skip
- If a source fails persistently, mark as disabled
- Dynamically adjust source selection for subsequent batches
Decision:
- Items >= 25 AND HighQuality >= 20 → Stop scraping
- Items < 15 → Continue to next batch
- All batches done but still < 20 → Generate with available content (Quality over Quantity)
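A compact sketch of the monitoring and decision rules above; the `source_stats` structure and the failure threshold of 3 are illustrative assumptions:

```python
def record_result(source_stats: dict, source_id: str, ok: bool) -> None:
    stats = source_stats.setdefault(source_id, {"success": 0, "failed": 0, "disabled": False})
    stats["success" if ok else "failed"] += 1
    if not ok and stats["failed"] >= 3:      # assumed threshold for "fails persistently"
        stats["disabled"] = True             # skip this source in subsequent batches/runs

def decide(items: list[dict], batches_remaining: int) -> str:
    high_quality = sum(1 for i in items if i["quality_score"] >= 4)  # assumed cutoff
    if len(items) >= 25 and high_quality >= 20:
        return "stop"                        # enough material, stop scraping
    if batches_remaining == 0:
        return "generate"                    # quality over quantity: report what we have
    return "continue"                        # dispatch the next batch
```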
### Phase 5: Evaluation & Filtering
Deduplication:
- Exact URL match
- Title similarity (>80% considered duplicate)
- Check cache.json to avoid history duplicates
Score Calibration:
- Unify scoring standards across SubAgents
- Adjust weights based on source credibility
- Bonus points for manually curated high-quality sources
Sorting:
- Descending order by quality_score
- Sort by source priority if scores are equal
- Take Top 20
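A sketch of deduplication and ranking; the `difflib`-based title similarity is an assumption standing in for whatever similarity measure the Main Agent actually applies, and `seen_urls` would come from `cache.json`:

```python
from difflib import SequenceMatcher

def dedup_and_rank(items: list[dict], seen_urls: set[str], priority: dict[str, int]) -> list[dict]:
    kept: list[dict] = []
    for item in items:
        if item["url"] in seen_urls:                                 # exact URL match
            continue
        if any(SequenceMatcher(None, item["title"].lower(),
                               k["title"].lower()).ratio() > 0.8     # >80% title similarity
               for k in kept):
            continue
        kept.append(item)
        seen_urls.add(item["url"])
    kept.sort(key=lambda i: (-i["quality_score"],                    # score descending
                             priority.get(i["source_id"], 99)))      # then source priority
    return kept[:20]                                                 # take Top 20
```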
### Phase 6: Browser Scraping (MCP Chrome DevTools)
For pages requiring JS rendering, use a headless browser:
Process:
1. Call `mcp__chrome-devtools__new_page` to open the page
2. Call `mcp__chrome-devtools__wait_for` to wait for content to load
3. Call `mcp__chrome-devtools__take_snapshot` to get the page structure
4. Parse the snapshot to extract the required content
5. Call `mcp__chrome-devtools__close_page` to close the page
Applicable Scenarios:
- ProductHunt (403 on WebFetch)
- Latent Space (Substack JS rendering)
- Other SPA applications
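The same sequence as pseudo-code; `mcp_call` and `parse_snapshot` are hypothetical helpers standing in for the actual MCP tool invocations, and the parameter names passed to each tool are assumptions, not the tools' documented signatures:

```python
def scrape_with_browser(mcp_call, parse_snapshot, url: str, wait_text: str) -> list[dict]:
    page = mcp_call("mcp__chrome-devtools__new_page", {"url": url})       # 1. open the page
    mcp_call("mcp__chrome-devtools__wait_for", {"text": wait_text})       # 2. wait for content to load
    snapshot = mcp_call("mcp__chrome-devtools__take_snapshot", {})        # 3. get the page structure
    items = parse_snapshot(snapshot)                                      # 4. extract required content
    mcp_call("mcp__chrome-devtools__close_page", {"pageId": page["id"]})  # 5. close the page
    return items
```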
### Phase 7: Generate Report
Output:
- Directory: NewsReport/
- Filename: YYYY-MM-DD-news-report.md
- Format: Standard Markdown
Content Structure:
- Title + Date
- Statistical Summary (Source count, items collected)
- 20 High-Quality Items (Template based)
- Generation Info (Version, Timestamps)
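A minimal sketch of the file-writing step; `render_item` is a placeholder for filling in the per-item layout shown under Output Template below:

```python
from pathlib import Path

def write_report(report_date: str, items: list[dict], render_item,
                 source_count: int, elapsed_min: float) -> Path:
    header = (
        f"# Daily News Report ({report_date})\n\n"
        f"> Curated from {source_count} sources today, containing {len(items)} high-quality items\n"
        f"> Generation Time: {elapsed_min:.0f} min | Version: v3.0\n\n---\n\n"
    )
    body = "\n---\n\n".join(render_item(item, n + 1) for n, item in enumerate(items))
    path = Path("NewsReport") / f"{report_date}-news-report.md"
    path.write_text(header + body + "\n\n---\n\n*Generated by Daily News Report v3.0*\n")
    return path
```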
### Phase 8: Update Cache
Update cache.json:
- last_run: Record this run info
- source_stats: Update stats per source
- url_cache: Add processed URLs
- content_hashes: Add content fingerprints
- article_history: Record included articles
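The cache schema is likewise not pinned down here; an illustrative shape consistent with the fields listed above (placeholder values, not real data) could be:

```json
{
  "last_run": { "date": "YYYY-MM-DD", "items_collected": 20, "duration_min": 3 },
  "source_stats": { "hn": { "success": 12, "failed": 1, "disabled": false } },
  "url_cache": ["https://example.com/article-1"],
  "content_hashes": ["sha256:..."],
  "article_history": [
    { "date": "YYYY-MM-DD", "title": "...", "url": "https://example.com/article-1", "quality_score": 4 }
  ]
}
```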
## SubAgent Call Examples

### Using the general-purpose agent

Since custom agents require a session restart before they are discovered, use the `general-purpose` agent type and inject the worker prompt:
Task Call:
```yaml
subagent_type: general-purpose
model: haiku
prompt: |
  You are a stateless execution unit. Only do the assigned task and return structured JSON.

  Task: Scrape the following URLs and extract content
  URLs:
  - https://news.ycombinator.com (Extract Top 10)
  - https://huggingface.co/papers (Extract top voted papers)

  Output Format:
  {
    "status": "success" | "partial" | "failed",
    "data": [
      {
        "source_id": "hn",
        "title": "...",
        "summary": "...",
        "key_points": ["...", "...", "..."],
        "url": "...",
        "keywords": ["...", "..."],
        "quality_score": 4
      }
    ],
    "errors": [],
    "metadata": { "processed": 2, "failed": 0 }
  }

  Filter Criteria:
  - Keep: Cutting-edge Tech/Deep Tech/Productivity/Practical Info
  - Exclude: General Science/Marketing Puff/Overly Academic/Job Posts

  Return JSON directly, no explanation.
```
### Using the worker agent (requires session restart)
Task Call:
```yaml
subagent_type: worker
prompt: |
  task: fetch_and_extract
  input:
    urls:
      - https://news.ycombinator.com
      - https://huggingface.co/papers
  output_schema:
    - source_id: string
    - title: string
    - summary: string
    - key_points: string[]
    - url: string
    - keywords: string[]
    - quality_score: 1-5
  constraints:
    filter: Cutting-edge Tech/Deep Tech/Productivity/Practical Info
    exclude: General Science/Marketing Puff/Overly Academic
```
## Output Template
```markdown
# Daily News Report (YYYY-MM-DD)

> Curated from N sources today, containing 20 high-quality items
> Generation Time: X min | Version: v3.0
>
> **Warning**: Sub-agent 'worker' not detected. Running in generic mode (Serial Execution). Performance might be degraded.

---

## 1. Title

- **Summary**: 2-4 lines overview
- **Key Points**:
  1. Point one
  2. Point two
  3. Point three
- **Source**: [Link](URL)
- **Keywords**: `keyword1` `keyword2` `keyword3`
- **Score**: ⭐⭐⭐⭐⭐ (5/5)

---

## 2. Title

...

---

*Generated by Daily News Report v3.0*
*Sources: HN, HuggingFace, OneUsefulThing, ...*
```
## Constraints & Principles
- Quality over Quantity: Low-quality content does not enter the report.
- Early Stop: Stop scraping once 20 high-quality items are reached.
- Parallel First: SubAgents in the same batch execute in parallel.
- Fault Tolerance: Failure of a single source does not affect the whole process.
- Cache Reuse: Avoid re-scraping the same content.
- Main Agent Control: All decisions are made by the Main Agent.
- Fallback Awareness: Detect sub-agent availability, gracefully degrade if unavailable.
## Expected Performance
| Scenario | Expected Time | Note |
|---|---|---|
| Optimal | ~2 mins | Tier1 sufficient, no browser needed |
| Normal | ~3-4 mins | Requires Tier2 supplement |
| Browser Needed | ~5-6 mins | Includes JS rendered pages |
## Error Handling
| Error Type | Handling |
|---|---|
| SubAgent Timeout | Log error, continue to next |
| Source 403/404 | Mark disabled, update sources.json |
| Extraction Failed | Return raw content, Main Agent decides |
| Browser Crash | Skip source, log entry |
## Compatibility & Fallback

To ensure usability across different agent environments, the following checks must be performed:
- **Environment Check**:
  - During Phase 1 initialization, attempt to detect whether the `worker` sub-agent exists.
  - If it does not exist (or the plugin is not installed), automatically switch to Serial Execution Mode.
- **Serial Execution Mode**:
  - Do not use parallel dispatch.
  - The Main Agent executes the scraping tasks for each source sequentially.
  - Slower, but guarantees basic functionality.
- **User Alert**:
  - MUST include a clear warning in the generated report header indicating the current degraded mode (see the warning line in the Output Template above), as sketched below.
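A sketch of the fallback decision; `list_available_subagents` is a stand-in for however the host agent environment exposes its sub-agent types, not a real API:

```python
def choose_execution_mode(list_available_subagents) -> str:
    if "worker" in list_available_subagents():
        return "parallel"   # dedicated worker sub-agent available: wave-based dispatch
    return "serial"         # degrade gracefully: Main Agent scrapes each source in turn
```

In serial mode, the generated report header carries the warning line shown in the Output Template above.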
# Supported AI Coding Agents
This skill is compatible with the SKILL.md standard and works with all major AI coding agents. Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.