npx skills add greendaygh/my-claude-skills --skill "prophage-miner"
Install specific skill from multi-skill repository
# SKILL.md
name: prophage-miner
description: >
This skill should be used when the user asks for "prophage paper analysis",
"prophage literature mining", "prophage gene extraction",
"bacteriophage prophage research", "prophage knowledge graph",
"prophage 10 repeated runs", "prophage continuous run",
or needs automated prophage-related literature collection and analysis.
user_invocable: true
Prophage Miner
Automatically searches and selects prophage-related papers from PubMed, extracts gene, protein, and host infection information from PMC full text, and builds a knowledge graph. Pydantic v2 validation plus consensus from a three-expert panel safeguards data quality. On repeated runs, new results are added incrementally to the existing data.
Output directory: ~/dev/phage/
Skill location: ~/.claude/skills/prophage-miner/
Prerequisites
Check that the dependencies are installed (first run only):
pip install -r ~/.claude/skills/prophage-miner/requirements.txt
Orchestration
Single run
When the user asks for something like "run prophage analysis", execute Phases 1-6 in order.
N repeated runs
When the user asks for something like "repeat 10 times", repeat Phases 1-6 in a for loop:
for i in 1..N:
1. Phase 1: search_papers.py (exclude existing PMIDs, auto-create the run and register papers) + Pydantic validation
2. Phase 2: fetch_fulltext.py (not-yet-downloaded papers only)
3. Phase 3: delegate to subagents (unextracted papers only, 4 in parallel) + Pydantic validation
4. Phase 4: Expert Panel (Full Panel if i <= 4, Quick Panel if i >= 5)
5. Phase 5: build_graph.py (rebuild from approved extractions only) + Pydantic validation
6. Phase 6: generate_report.py
7. run_tracker.complete_run(run_id) # run_id is returned by search_papers
8. Report: "Run {i}/{N} | cumulative {total} papers | graph {nodes} nodes {edges} edges"
Finally: print run_tracker.summary()
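The loop above can be sketched in Python as follows. This is only an illustration of the control flow: `run_once` is a hypothetical callable standing in for Phases 1-6 of one run, and it is assumed to return the cumulative counts needed for the progress line. The real orchestration is performed by the agent, not by a script like this.

```python
from typing import Callable, Dict, List


def panel_for_run(i: int) -> str:
    """Full Panel for runs 1-4, Quick Panel from run 5 onward."""
    return "full" if i <= 4 else "quick"


def progress_line(i: int, n: int, total: int, nodes: int, edges: int) -> str:
    """Format the per-run report line."""
    return f"Run {i}/{n} | cumulative {total} papers | graph {nodes} nodes {edges} edges"


def run_loop(n: int, run_once: Callable[[int, str], Dict[str, int]]) -> List[str]:
    """Call run_once (hypothetical: Phases 1-6 for one run) n times.

    run_once receives the run index and the panel mode and must return
    a dict with the keys "total", "nodes" and "edges".
    """
    lines = []
    for i in range(1, n + 1):
        stats = run_once(i, panel_for_run(i))
        lines.append(progress_line(i, n, **stats))
    return lines
```

Passing the phases in as a callable keeps the sketch independent of how each phase is actually invoked.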
Output directory initialization
On the first run, create the following directory structure:
mkdir -p ~/dev/phage/{00_config,01_papers/full_texts,02_extractions/per_paper,03_graph/exports,04_analysis,05_reports}
cp ~/.claude/skills/prophage-miner/assets/prophage_schema.json ~/dev/phage/00_config/schema.json
Phase 1: PubMed Search (automatic, incremental)
Search PubMed with the predefined keywords, exclude PMIDs that have already been collected, then randomly select about 20 papers.
Run:
cd ~/.claude/skills/prophage-miner
python -m scripts.search_papers \
--output ~/dev/phage \
--exclude-file ~/dev/phage/00_config/run_registry.json \
--select-n 20
# search_papers automatically creates the run and registers the papers in run_registry.
# If the orchestrator has already created the run: --run-id run_002
Pydantic validation:
python -m scripts.validate_data --papers ~/dev/phage/01_papers/paper_list.json
If validation fails, print the list of errors and attempt an automatic fix. On a serious violation, re-run Phase 1.
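The incremental selection in Phase 1 (drop PMIDs that are already collected, then sample about 20) could look roughly like the sketch below. It is not the code of search_papers.py, which is not shown here. The function name, the `seed` parameter, and the use of plain PMID strings instead of the run_registry.json structure are all assumptions made for illustration.

```python
import random
from typing import Iterable, List, Optional


def select_new_pmids(
    candidates: Iterable[str],
    already_collected: Iterable[str],
    n: int = 20,
    seed: Optional[int] = None,
) -> List[str]:
    """Return up to n randomly chosen PMIDs that were not collected before."""
    seen = set(already_collected)
    # dict.fromkeys removes duplicate search hits while keeping their order.
    fresh = [pmid for pmid in dict.fromkeys(candidates) if pmid not in seen]
    rng = random.Random(seed)  # a seed makes the selection reproducible
    return rng.sample(fresh, min(n, len(fresh)))
```

When fewer than `n` new papers remain, the sketch returns all of them rather than failing.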
Phase 2: Full Text Download (automatic, incremental)
Download full text from PMC/Europe PMC only for papers with has_full_text: false.
Run:
python -m scripts.fetch_fulltext \
--input ~/dev/phage/01_papers/paper_list.json \
--output ~/dev/phage \
--pending-only
If the download fails, proceed with the abstract only (a confidence penalty is applied in Phase 3).
Phase 3: Prophage Extraction (delegated to subagents)
Key point: in this phase the main agent does not extract anything itself. It delegates to subagents to protect its context window.
Procedure
- Look up the list of papers that have not been extracted yet:
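import sys
from pathlib import Path  # required by the Path.home() calls below
# Make the skill's scripts package importable.
sys.path.insert(0, str(Path.home() / ".claude/skills/prophage-miner"))
from scripts.run_tracker import RunTracker
tracker = RunTracker(Path.home() / "dev/phage")
pending = tracker.get_pending_extractions()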
- Group the pending papers into batches of 4 and delegate them to parallel subagents (Task tool, subagent_type="generalPurpose"):
Prompt to pass to each subagent:
You are a prophage biology extraction specialist.
1. Read ~/.claude/skills/prophage-miner/references/extraction_prompts.md for extraction guidelines.
2. Read ~/.claude/skills/prophage-miner/references/prophage_biology.md for domain context.
3. Read ~/dev/phage/00_config/schema.json for the entity/relationship schema.
4. Read ~/dev/phage/01_papers/full_texts/{paper_id}.txt for the full text.
5. Extract ALL prophage-related entities and relationships following the schema.
6. Apply section-based confidence weights:
- Results: 0.9, Methods: 0.85, Abstract: 0.85, Introduction: 0.7, Discussion: 0.6
- Abstract-only papers: apply -0.2 penalty
7. Save the extraction result:
python -m scripts.extract_prophage save \
--paper-id {paper_id} \
--output ~/dev/phage/02_extractions/per_paper/
Provide the extraction JSON via stdin.
8. Return a brief summary ONLY: entity count, relationship count, key prophage names found.
- After a subagent finishes, the main agent updates the status:
tracker.mark_extracted(paper_id)
# or, on failure:
tracker.mark_extract_failed(paper_id, "reason")
- Pydantic validation (for each extraction result):
python -m scripts.validate_data --extraction ~/dev/phage/02_extractions/per_paper/{paper_id}_extraction.json
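Two small pieces of the procedure above lend themselves to a sketch: batching the pending papers in groups of 4, and the section-based confidence weights with the abstract-only penalty. The helper names are hypothetical. The sketch assumes the 0.2 penalty is subtracted from the section weight and that the result is clamped at 0, neither of which the prompt states explicitly.

```python
from typing import List, Sequence

# Weights and penalty as listed in the subagent prompt.
SECTION_WEIGHTS = {
    "results": 0.9,
    "methods": 0.85,
    "abstract": 0.85,
    "introduction": 0.7,
    "discussion": 0.6,
}
ABSTRACT_ONLY_PENALTY = 0.2


def batches(items: Sequence[str], size: int = 4) -> List[List[str]]:
    """Split items into consecutive groups of at most `size` elements."""
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def section_confidence(section: str, abstract_only: bool = False) -> float:
    """Confidence weight for a section (assumption: penalty is subtractive)."""
    weight = SECTION_WEIGHTS[section.lower()]  # KeyError for unknown sections
    if abstract_only:
        weight -= ABSTRACT_ONLY_PENALTY
    return round(max(weight, 0.0), 2)
```

For example, `batches(pending)` would give the groups to hand to the parallel subagents, assuming `pending` is a list of paper IDs.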
Phase 4: Expert Panel Review (three-expert roundtable discussion + consensus)
Read ~/.claude/skills/prophage-miner/references/panel_protocol.md for the full protocol.
Read ~/.claude/skills/prophage-miner/assets/panel_config.json for panel configuration.
Full Panel (default, first 4 runs)
Round 1 - Independent review (3 parallel subagents):
Input passed to each expert:
- Extraction summary for this run (counts per entity/relationship type + representative examples)
- Confidence distribution (min, max, mean)
- List of unschemaed findings
Prompt for each expert (subagent):
You are {expert_name}, {persona_description}.
Review the following extraction summary and evaluate:
- Entity accuracy (accept/flag with reason)
- Relationship plausibility (accept/flag with reason)
- Missing entities or relationships
- Schema improvement suggestions
Input: {extraction_summary}
Output your assessment as JSON with: assessments (per paper), schema_suggestions, overall_quality.
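Round 2 - Roundtable discussion (sequential subagents):
- Share all three Round 1 opinions with every expert
- Each expert re-evaluates in light of the other opinions
- At most 2 rounds of back-and-forth
Round 3 - Consensus vote:
- Final verdict from each expert: accept / flag_recheck / flag_reextract / reject
- A verdict is adopted as the consensus when at least 2/3 agree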
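The Round 3 vote can be illustrated with a small tally function. This is a sketch, not the panel protocol itself: the function name is hypothetical, and returning `None` when no verdict reaches 2/3 is an assumption, since the text above does not say what happens without a consensus.

```python
from collections import Counter
from typing import List, Optional

VERDICTS = {"accept", "flag_recheck", "flag_reextract", "reject"}


def consensus(votes: List[str]) -> Optional[str]:
    """Return the verdict backed by at least 2/3 of the votes, else None."""
    unknown = set(votes) - VERDICTS
    if unknown:
        raise ValueError(f"unknown verdicts: {sorted(unknown)}")
    if not votes:
        return None
    verdict, count = Counter(votes).most_common(1)[0]
    # Integer comparison of count/len(votes) >= 2/3 avoids float rounding.
    return verdict if 3 * count >= 2 * len(votes) else None
```

With three experts this means two matching verdicts are enough, while a three-way split yields no consensus.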
Quick Panel (5 or more consecutive runs)
Condition: 5 or more consecutive runs + mean panel_confidence >= 0.8
- Perform Round 1 only
- Approve automatically if there are no serious flags
- If a serious flag is found, return to the Full Panel immediately
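The Quick Panel condition can be written as a single decision function. The sketch below is one reading of the rules above: the function and parameter names are hypothetical, and "5 or more consecutive runs" is interpreted as the count of consecutive runs so far.

```python
def choose_panel(
    consecutive_runs: int,
    mean_panel_confidence: float,
    serious_flag: bool = False,
) -> str:
    """Return "quick" only when every Quick Panel condition holds."""
    if serious_flag:
        # A serious flag always sends the review back to the Full Panel.
        return "full"
    if consecutive_runs >= 5 and mean_panel_confidence >= 0.8:
        return "quick"
    return "full"
```

Defaulting to the Full Panel keeps the stricter review in place whenever any condition is unmet.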
Verdict handling
- accept: include in the graph
- flag_reextract: restore extraction_status to "pending" → re-extract in Phase 3
- flag_recheck: the main agent corrects the extraction directly, then re-validates it
- reject: delete the extraction, set extraction_status to "rejected", and exclude it from the graph
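The four verdicts map naturally onto a lookup table. The sketch below is illustrative only: `VerdictAction` and `action_for` are hypothetical names, and two details are assumptions, namely that accept and flag_recheck leave extraction_status unchanged and that a flag_recheck extraction stays out of the graph until it has been re-validated.

```python
from typing import NamedTuple, Optional


class VerdictAction(NamedTuple):
    include_in_graph: bool
    new_status: Optional[str]  # None means: leave extraction_status as it is
    delete_extraction: bool
    follow_up: str


VERDICT_ACTIONS = {
    "accept": VerdictAction(True, None, False, "none"),
    "flag_reextract": VerdictAction(False, "pending", False, "re-extract in Phase 3"),
    # Assumption: held back from the graph until re-validation succeeds.
    "flag_recheck": VerdictAction(False, None, False, "main agent corrects and re-validates"),
    "reject": VerdictAction(False, "rejected", True, "none"),
}


def action_for(verdict: str) -> VerdictAction:
    """Look up what to do with an extraction after the panel verdict."""
    try:
        return VERDICT_ACTIONS[verdict]
    except KeyError:
        raise ValueError(f"unknown verdict: {verdict!r}") from None
```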
Phase 5: Knowledge Graph Construction (automatic, idempotent)
Rebuild the graph by merging only the approved extraction results.
Run:
python -m scripts.build_graph \
--input ~/dev/phage/02_extractions/per_paper \
--output ~/dev/phage/03_graph \
--registry ~/dev/phage/00_config/run_registry.json
Pydantic validation:
python -m scripts.validate_data --graph ~/dev/phage/03_graph/
Referential integrity check: confirm that the from_id/to_id of every edge points to an existing node.
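A minimal sketch of such a referential integrity check is shown below. It is not the code of validate_data. The `from_id` and `to_id` field names come from the text above, while the node key `id` and the plain-dict representation are assumptions.

```python
from typing import Dict, List


def dangling_edges(nodes: List[Dict], edges: List[Dict]) -> List[Dict]:
    """Return every edge whose from_id or to_id matches no node id."""
    node_ids = {node["id"] for node in nodes}  # "id" is an assumed key name
    return [
        edge
        for edge in edges
        if edge["from_id"] not in node_ids or edge["to_id"] not in node_ids
    ]
```

An empty result means every edge points at existing nodes.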
Phase 6: Analysis & Report (automatic, idempotent)
Generate the analysis catalogs and the report from the graph data.
Run:
python -m scripts.generate_report \
--input ~/dev/phage/03_graph \
--output ~/dev/phage
Output files:
- 04_analysis/prophage_catalog.json: catalog of discovered prophages
- 04_analysis/host_range_matrix.json: host-phage infection range
- 04_analysis/gene_inventory.json: gene/protein inventory
- 05_reports/research_report.md: research report in English
Run Completion
At the end of every run:
tracker.complete_run(run_id)
s = tracker.summary()
Print the cumulative statistics:
Run {i}/{N} completed
Total: {total_papers} papers | Extracted: {extracted} | Failed: {failed}
Graph: {nodes} nodes, {edges} edges
Panel confidence: {confidence}
Schema Reference
Schema file: ~/dev/phage/00_config/schema.json
Entity types (8)
Prophage, Gene, Protein, Host, IntegrationSite, Receptor, InductionCondition, Paper
Relationship types (10)
ENCODES, TRANSLATES_TO, INTEGRATES_INTO, INFECTS, BINDS, REPRESSES, INDUCES, HOMOLOGOUS_TO, LYSIS_COMPONENT, EXTRACTED_FROM
See ~/.claude/skills/prophage-miner/assets/prophage_schema.json for the detailed definitions.
Stability Notes
- Context protection: the main agent never loads full text directly. It only runs scripts and delegates to subagents
- File-based state: all state is saved atomically to run_registry.json
- Failure isolation: if a subagent fails, only that paper is marked failed and the rest continue
- Retry: failed papers are retried automatically in the next run (restored to pending)
- Idempotent Phase 5/6: the graph and the report are always rebuilt in full, so re-running them is always safe
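One common way to save a JSON file atomically, as the file-based state note describes, is to write a temporary file in the same directory and then rename it over the target. The sketch below shows that pattern. It is not taken from run_tracker, whose implementation is not shown here, and the function name is hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path


def save_registry_atomic(path, registry: dict) -> None:
    """Write registry as JSON so readers never see a half-written file."""
    path = Path(path)
    # The temp file must live in the same directory for the rename to be atomic.
    fd, tmp_name = tempfile.mkstemp(
        dir=path.parent, prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(registry, fh, ensure_ascii=False, indent=2)
            fh.flush()
            os.fsync(fh.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_name, path)  # atomically swaps in the new file
    except BaseException:
        # Leave the old registry untouched and clean up the partial file.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

If the process dies before `os.replace`, the previous run_registry.json is still intact, which is what allows a later run to resume from the last saved state.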