oberon-mini

threads-archiver

0
0
# Install this skill:
npx skills add oberon-mini/oberon-skills --skill "threads-archiver"

Install specific skill from multi-skill repository

# Description

Archive and maintain Meta Threads post history into structured Markdown (index + per-post records) with stable URLs, dates, and content. Use when asked to scrape/export a Threads profile, backfill missing posts, deduplicate archives, convert raw post data to note format, or update a knowledge base (for example Obsidian) with chronological Threads records.

# SKILL.md


name: threads-archiver
description: Archive and maintain Meta Threads post history into structured Markdown (index + per-post records) with stable URLs, dates, and content. Use when asked to scrape/export a Threads profile, backfill missing posts, deduplicate archives, convert raw post data to note format, or update a knowledge base (for example Obsidian) with chronological Threads records.


Threads Archiver

Archive Threads data in a repeatable, reviewable format.

Core Workflow

  1. Identify archive scope.
  2. Collect post data.
  3. Normalize records.
  4. Merge and deduplicate.
  5. Render/update markdown output.
  6. Validate ordering and coverage.

1) Identify archive scope

Capture these inputs before processing:

  • Target account handle (for example @macau.drive.exam)
  • Output note path
  • Mode: full, incremental, or repair
  • Required fields: date, url, content (minimum)

If scope is unclear, ask one short clarification question, then proceed.

2) Collect post data

Prefer one of these sources:

  • Existing exported list (CSV/JSON/JSONL)
  • Previously archived markdown note
  • Browser-captured rows copied by the user

When using browser capture, gather at least:

  • Canonical post URL
  • Post date/time (or best available date)
  • Full visible text content (or best-effort text)

3) Normalize records

Normalize each record into this logical schema:

  • date: ISO date (YYYY-MM-DD) when possible
  • url: canonical Threads URL
  • content: post text (trimmed, preserve line breaks where meaningful)

Use scripts/build_threads_archive.py to normalize CSV/JSON/JSONL inputs and emit markdown-ready entries.

4) Merge and deduplicate

Deduplicate by URL first.

If URL is missing, use fallback key:

  • date + first 80 chars of content

When duplicates conflict:

  • Keep the record with richer content (longer non-whitespace text)
  • Keep canonical URL variant

5) Render/update markdown output

Use the format in references/archive-format.md.

Default output sections:

  • Metadata header (account + generated time)
  • Chronological index (newest first)
  • Detailed per-post entries (newest first)

For incremental updates:

  • Insert only new posts
  • Preserve existing manually edited commentary blocks if present

6) Validate ordering and coverage

Validate before finalizing:

  • Dates sorted newest β†’ oldest
  • Every detailed post has a matching index row
  • No duplicate URLs
  • No empty content unless explicitly unavailable

Report summary:

  • Total posts
  • New posts added
  • Duplicates removed
  • Earliest and latest post date in archive

Resources

scripts/

  • build_threads_archive.py: Convert raw post exports (CSV/JSON/JSONL) into normalized records and markdown sections.

references/

  • archive-format.md: Canonical markdown template and formatting rules for index + detailed entries.

# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents:

Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.