npx skills add Nebu1eto/skills --skill "epub-translator"
Install specific skill from multi-skill repository
# Description
Translates EPUB ebook files between languages with parallel processing. Supports Japanese, English, Chinese, and other languages. Handles large files by splitting into sections, manages multiple volumes simultaneously, and preserves EPUB structure and formatting. Includes translation quality validation. Use when translating novels, books, or any EPUB content.
# SKILL.md
name: epub-translator
description: Translates EPUB ebook files between languages with parallel processing. Supports Japanese, English, Chinese, and other languages. Handles large files by splitting into sections, manages multiple volumes simultaneously, and preserves EPUB structure and formatting. Includes translation quality validation. Use when translating novels, books, or any EPUB content.
compatibility: Requires Python 3.8+, zip/unzip commands. Optional epubcheck for validation.
allowed-tools: Read Write Edit Bash(python3:*) Bash(bash:*) Bash(mkdir:*) Bash(find:*) Bash(echo:*) Bash(wc:*) Bash(cat:*) Bash(ls:*)
metadata:
author: Haze Lee
version: "1.0.0"
category: translation
EPUB Translation Skill
Translate EPUB files between any language pair, with optimized support for Japanese-to-Korean and English-to-Korean translation.
When to Use This Skill
Use this skill when:
- User wants to translate an EPUB ebook to another language
- User mentions translating Japanese/English/Chinese novels or books
- User has multiple EPUB files to translate in batch
- User needs to preserve EPUB formatting and structure during translation
Usage
/epub-translator <epub_path> [options]
Arguments
<epub_path>: EPUB file or directory containing EPUBs
Options
| Option | Description | Default |
|---|---|---|
| --source-lang | Source language code | ja |
| --target-lang | Target language code | ko |
| --dict | Custom dictionary (JSON) | none |
| --output-dir | Output directory | ./translated |
| --parallel | Concurrent agents | 5 |
| --split-threshold | File size for splitting (KB) | 30 |
| --split-parts | Parts to split large files | 4 |
| --high-quality | Use Opus model for translation | false |
| --vertical | Output vertical writing (ja/zh only) | false |
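The skill is invoked as a slash command and its arguments are interpreted by the agent, so there is no parser to show from the source. As a rough illustration of how the options and defaults in the table fit together, here is a minimal sketch using Python's `argparse`. The names `build_parser`, `parse_options`, and `dict_path` are hypothetical and not part of the skill.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the option table; not the skill's own code."""
    parser = argparse.ArgumentParser(prog="epub-translator")
    parser.add_argument("epub_path", help="EPUB file or directory containing EPUBs")
    parser.add_argument("--source-lang", default="ja")
    parser.add_argument("--target-lang", default="ko")
    parser.add_argument("--dict", dest="dict_path", default=None)
    parser.add_argument("--output-dir", default="./translated")
    parser.add_argument("--parallel", type=int, default=5)
    parser.add_argument("--split-threshold", type=int, default=30)  # KB
    parser.add_argument("--split-parts", type=int, default=4)
    parser.add_argument("--high-quality", action="store_true")
    parser.add_argument("--vertical", action="store_true")
    return parser


def parse_options(argv: List[str]) -> argparse.Namespace:
    args = build_parser().parse_args(argv)
    # --vertical is only valid for ja/zh targets and is ignored otherwise.
    args.vertical = args.vertical and args.target_lang in {"ja", "zh"}
    return args
```

For example, `parse_options(["/books/novel.epub"])` yields the documented defaults (ja to ko, 5 concurrent agents, 30 KB threshold).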
Language Codes
ja (Japanese), en (English), ko (Korean), zh (Chinese), es (Spanish), fr (French), de (German), ru (Russian), ar (Arabic), or any ISO 639-1 code.
Examples
# Japanese novel to Korean (default)
/epub-translator "/books/novel.epub"
# English to Korean
/epub-translator "/books/english.epub" --source-lang en
# Japanese to English
/epub-translator "/books/jp_novel.epub" --source-lang ja --target-lang en
# High-quality translation using Opus model
/epub-translator "/books/important.epub" --high-quality
# Batch with larger split threshold (less splitting)
/epub-translator "/books/" --split-threshold 50 --parallel 10
# More aggressive splitting for slower connections
/epub-translator "/books/large.epub" --split-threshold 20 --split-parts 6
# English to Japanese with vertical writing (우종서/縦書き)
/epub-translator "/books/novel.epub" --source-lang en --target-lang ja --vertical
# Korean to Chinese with vertical writing
/epub-translator "/books/korean.epub" --source-lang ko --target-lang zh --vertical
Architecture
graph TB
O["ORCHESTRATOR<br/>• Analyzes EPUBs and creates task manifest<br/>• Spawns parallel translator agents (foreground)<br/>• Collects results directly from agent responses<br/>• Validates translation quality<br/>• Handles retries and error recovery"]
T1["Translator<br/>Agent 1"]
T2["Translator<br/>Agent 2"]
TN["Translator<br/>Agent N"]
O --> T1
O --> T2
O --> TN
Key Constraint: Sub-agents do NOT have Task tool access. They use only Read, Edit, Write, and Bash.
Execution Model: All Task agents run in foreground mode (not background). Multiple Tasks can be spawned in a single message for parallel execution, but results are collected synchronously.
Model Selection
Default (no flags)
| Task | Model |
|---|---|
| Content translation | Sonnet |
| Metadata/TOC | Haiku |
| Validation | Haiku |
With --high-quality
| Task | Model |
|---|---|
| Content translation | Opus |
| Metadata/TOC | Sonnet |
| Validation | Sonnet |
Automatic Upgrade
- If quality score < 70: Re-translate flagged files with Opus
- If translation fails: Retry with upgraded model
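The two model tables and the upgrade rule can be read as a simple lookup. The sketch below encodes them as dictionaries; the task keys (`content`, `metadata`, `validation`) and function names are illustrative labels chosen here, not identifiers from the skill.

```python
# Model tables from the section above; keys are illustrative labels.
DEFAULT_MODELS = {"content": "sonnet", "metadata": "haiku", "validation": "haiku"}
HIGH_QUALITY_MODELS = {"content": "opus", "metadata": "sonnet", "validation": "sonnet"}

UPGRADE_THRESHOLD = 70  # quality score below which files are re-translated with Opus


def select_model(task: str, high_quality: bool = False) -> str:
    """Return the model name for a task type, honoring --high-quality."""
    table = HIGH_QUALITY_MODELS if high_quality else DEFAULT_MODELS
    return table[task]


def needs_opus_retranslation(quality_score: float) -> bool:
    """Automatic upgrade rule: a score under 70 triggers re-translation with Opus."""
    return quality_score < UPGRADE_THRESHOLD
```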
Execution Workflow
Phase 1: Analysis
- Create the work directory:

      WORK_DIR="/tmp/epub_translate_$(date +%s)"
      mkdir -p "$WORK_DIR"/{extracted,sections,translated,status,logs}

- Analyze EPUBs (with configurable split threshold):

      python3 scripts/analyze_epub.py \
        --epub "{EPUB_PATH}" \
        --work-dir "$WORK_DIR" \
        --source-lang "{SOURCE_LANG}" \
        --target-lang "{TARGET_LANG}" \
        --split-threshold 30 \
        --split-parts 4

- Review $WORK_DIR/manifest.json for the task count.
Phase 2: Translation
- Select translator prompt from references/:
  - Japanese: translator_ja.md
  - English: translator_en.md
  - Other: translator_generic.md
- Spawn Task agents in foreground mode (batched):
  - model: "sonnet"
  - CRITICAL: Multiple Tasks in single message for parallel execution
  - Process in batches of --parallel count (default: 5)
  - Results are returned directly - no status file monitoring needed
- Batch execution pattern:
```
For each batch of N tasks:
- Spawn N Task agents in a single message (foreground, parallel)
- Collect results directly from agent responses
- Track completed/failed tasks
- Proceed to next batch
```
- Retry failed tasks (max 2 attempts, upgrade to opus if persistent)
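The batch-and-retry pattern of Phase 2 can be sketched in plain Python. This is only an illustration: the real orchestrator spawns Task agents in a single message, whereas this loop is sequential. `run_task` is a hypothetical callable standing in for one agent invocation, and reading "upgrade to opus if persistent" as "use Opus on the final retry" is an assumption.

```python
from typing import Callable, Dict, Iterator, List, Sequence


def batches(tasks: Sequence[str], size: int = 5) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` tasks (the --parallel count)."""
    for start in range(0, len(tasks), size):
        yield list(tasks[start:start + size])


def run_all(
    tasks: Sequence[str],
    run_task: Callable[[str, str], bool],  # hypothetical: (task_id, model) -> success
    parallel: int = 5,
    max_retries: int = 2,
) -> Dict[str, str]:
    status: Dict[str, str] = {}
    for batch in batches(tasks, parallel):
        # Stand-in for spawning the whole batch in one message and collecting results.
        for task_id in batch:
            status[task_id] = "completed" if run_task(task_id, "sonnet") else "failed"

    for attempt in range(1, max_retries + 1):
        failed = [task_id for task_id, state in status.items() if state == "failed"]
        # ASSUMPTION: the last retry is the one that upgrades to Opus.
        model = "opus" if attempt == max_retries else "sonnet"
        for task_id in failed:
            if run_task(task_id, model):
                status[task_id] = "completed"
    return status
```

The returned mapping uses the same `completed` and `failed` labels as the status codes listed later in this document.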
Phase 3: Finalization
- Merge split files:

      python3 scripts/merge_xhtml.py --work-dir "$WORK_DIR" --manifest manifest.json

- Translate metadata and navigation (LLM-based):
  - Spawn metadata translation agent using translator_metadata.md
  - Translate: toc.ncx, nav.xhtml, content.opf (title, author, description)
  - Translate: cover.xhtml, titlepage.xhtml (if present)
  - Ensure TOC entries match translated chapter headings
- Apply layout conversion (CRITICAL - must be done before packaging):

Determine conversion type based on target language and --vertical option:

| Target Language | --vertical | Result |
|---|---|---|
| ko, en, etc. | (ignored) | horizontal-tb, ltr |
| ja, zh | false (default) | horizontal-tb, ltr |
| ja, zh | true | vertical-rl, rtl (우종서/縦書き) |
| ar, he, fa | (ignored) | horizontal-tb, rtl |
A. Horizontal output (default for all languages):
```bash
TRANSLATED_DIR="$WORK_DIR/translated/{VOLUME_ID}"
# Convert CSS files: vertical-rl → horizontal-tb
# (sed -i '' is the BSD/macOS form; with GNU sed, use plain -i)
find "$TRANSLATED_DIR" -name "*.css" -exec sed -i '' \
  -e 's/writing-mode:[[:space:]]*vertical-rl/writing-mode: horizontal-tb/g' \
  -e 's/-webkit-writing-mode:[[:space:]]*vertical-rl/-webkit-writing-mode: horizontal-tb/g' \
  -e 's/-epub-writing-mode:[[:space:]]*vertical-rl/-epub-writing-mode: horizontal-tb/g' \
  {} \;
# Convert content.opf: page direction and writing mode
find "$TRANSLATED_DIR" -name "content.opf" -exec sed -i '' \
  -e 's/page-progression-direction="rtl"/page-progression-direction="ltr"/g' \
  -e 's/primary-writing-mode" content="vertical-rl"/primary-writing-mode" content="horizontal-tb"/g' \
  {} \;
# Convert XHTML inline styles if present
find "$TRANSLATED_DIR" -name "*.xhtml" -exec sed -i '' \
  -e 's/writing-mode:[[:space:]]*vertical-rl/writing-mode: horizontal-tb/g' \
  {} \;
```
B. Vertical output (only when --vertical AND target is ja/zh):
```bash
TRANSLATED_DIR="$WORK_DIR/translated/{VOLUME_ID}"
# Convert CSS files: horizontal-tb → vertical-rl
# (sed -i '' is the BSD/macOS form; with GNU sed, use plain -i)
find "$TRANSLATED_DIR" -name "*.css" -exec sed -i '' \
  -e 's/writing-mode:[[:space:]]*horizontal-tb/writing-mode: vertical-rl/g' \
  -e 's/-webkit-writing-mode:[[:space:]]*horizontal-tb/-webkit-writing-mode: vertical-rl/g' \
  -e 's/-epub-writing-mode:[[:space:]]*horizontal-tb/-epub-writing-mode: vertical-rl/g' \
  {} \;
# Convert content.opf: page direction and writing mode for vertical
find "$TRANSLATED_DIR" -name "content.opf" -exec sed -i '' \
  -e 's/page-progression-direction="ltr"/page-progression-direction="rtl"/g' \
  -e 's/primary-writing-mode" content="horizontal-tb"/primary-writing-mode" content="vertical-rl"/g' \
  {} \;
# Convert XHTML inline styles if present
find "$TRANSLATED_DIR" -name "*.xhtml" -exec sed -i '' \
  -e 's/writing-mode:[[:space:]]*horizontal-tb/writing-mode: vertical-rl/g' \
  {} \;
```
C. RTL output (for ar/he/fa targets):
```bash
# Convert page direction
# (sed -i '' is the BSD/macOS form; with GNU sed, use plain -i)
sed -i '' 's/page-progression-direction="ltr"/page-progression-direction="rtl"/g' "$TRANSLATED_DIR"/content.opf
# Convert CSS direction
find "$TRANSLATED_DIR" -name "*.css" -exec sed -i '' \
  -e 's/direction:[[:space:]]*ltr/direction: rtl/g' \
  {} \;
```
Note: If source is already vertical and --vertical is set, skip CSS conversion (keep existing vertical layout).
See references/layout_conversion.md for complete conversion patterns.
- Verify source text removed:

      python3 scripts/verify.py --work-dir "$WORK_DIR" --source-lang "{SOURCE_LANG}"
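The contents of scripts/verify.py are not shown here, so the following is only a guess at the kind of check such a step might perform: scanning translated XHTML for characters from the source script. The Unicode ranges, the tag-stripping regex, and the function name `leftover_source_lines` are all assumptions made for illustration.

```python
import re
from typing import List, Tuple

# Rough per-language signals of untranslated text (assumed, not from verify.py).
SOURCE_PATTERNS = {
    "ja": re.compile(r"[\u3040-\u309F\u30A0-\u30FF]"),  # hiragana and katakana
    "zh": re.compile(r"[\u4E00-\u9FFF]"),                # CJK unified ideographs
    "ko": re.compile(r"[\uAC00-\uD7A3]"),                # hangul syllables
}
TAG = re.compile(r"<[^>]+>")


def leftover_source_lines(xhtml: str, source_lang: str) -> List[Tuple[int, str]]:
    """Return (line number, text) pairs that still contain source-script characters."""
    pattern = SOURCE_PATTERNS.get(source_lang)
    if pattern is None:
        return []  # no script-based heuristic for this language
    hits = []
    for number, line in enumerate(xhtml.splitlines(), start=1):
        text = TAG.sub("", line)  # ignore markup, inspect visible text only
        if pattern.search(text):
            hits.append((number, text.strip()))
    return hits
```

A script-based heuristic like this cannot help for source languages that share a script with the target (for example English to French), which is one reason the skill also relies on LLM-based validation.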
Phase 4: Quality Validation (LLM-Based)
- Extract text for validation (token-efficient format):

      python3 scripts/extract_for_validation.py \
        --dir "$WORK_DIR/translated" \
        --output-dir "$WORK_DIR/validation" \
        --max-tokens 8000

- Select validator prompt from references/:
  - Korean target: validator_ko.md (extends validator_generic.md)
  - Other targets: validator_generic.md
- Spawn validation Task agents in foreground mode (batched):
  - Read $WORK_DIR/validation/validation_manifest.json
  - For each chunk, spawn a validator agent with model: "haiku" (sufficient for validation)
  - CRITICAL: Multiple Tasks in single message for parallel execution
  - Process in batches, collect results directly
- Aggregate results:
  - Collect validation results from agent responses
  - Calculate average quality score
  - Identify files flagged for re-translation
- If average score < 70: Re-translate flagged files with model: "opus"
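The aggregation step above might look roughly like the following. The result shape (a list of dictionaries with `file` and `score` keys) is a hypothetical format chosen for this sketch; the actual structure of the validator output is not specified in this document.

```python
from statistics import mean
from typing import Dict, List


def aggregate(results: List[Dict], threshold: float = 70) -> Dict:
    """Average the chunk scores and list files that fall below the threshold."""
    scores = [result["score"] for result in results]
    average = mean(scores) if scores else None
    flagged = sorted({r["file"] for r in results if r["score"] < threshold})
    return {
        "average": average,
        "flagged": flagged,
        # Re-translate flagged files with Opus when the average is under 70.
        "retranslate": average is not None and average < threshold,
    }
```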
Phase 5: Packaging
- Package EPUB:

      bash scripts/package_epub.sh "$WORK_DIR" "{OUTPUT_DIR}"

- Generate final report with quality metrics
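The packaging script itself is a shell script whose contents are not reproduced here. As an illustration of the one constraint any EPUB packager has to respect (the `mimetype` file must be the first entry in the archive and must be stored uncompressed), here is a minimal Python sketch using the standard `zipfile` module. The function name `package_epub` and its arguments are hypothetical.

```python
import zipfile
from pathlib import Path


def package_epub(source_dir: str, output_path: str) -> None:
    """Zip an extracted EPUB tree so that `mimetype` comes first, uncompressed."""
    root = Path(source_dir)
    with zipfile.ZipFile(output_path, "w") as archive:
        # First entry: mimetype, stored without compression.
        archive.write(root / "mimetype", "mimetype", compress_type=zipfile.ZIP_STORED)
        for path in sorted(root.rglob("*")):
            if not path.is_file():
                continue
            relative = path.relative_to(root).as_posix()
            if relative == "mimetype":
                continue  # already written as the first entry
            archive.write(path, relative, compress_type=zipfile.ZIP_DEFLATED)
```

Write the output file outside `source_dir`, otherwise the archive would try to include itself.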
File Splitting Configuration
Conservative defaults prevent context overflow in translation agents:
| Setting | Default | Description |
|---|---|---|
| split-threshold | 30 KB | Files larger than this are split |
| split-parts | 4 | Number of sections per large file |
Tuning Guidelines
- Slow connection / Timeouts: Lower threshold (20 KB), more parts (6)
- Fast connection / Large context: Higher threshold (50 KB), fewer parts (3)
- Very large files (100KB+): Will be split into more parts automatically
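To make the splitting rule concrete, the sketch below decides how many parts a file would be split into. The first two branches follow the table above. The scaling rule for files of 100 KB or more is not documented here, so the `ceil(size / threshold)` formula is purely an assumption used for illustration, as is the function name `plan_parts`.

```python
import math


def plan_parts(size_kb: float, threshold_kb: int = 30, split_parts: int = 4) -> int:
    """Return how many sections a file of `size_kb` would be split into."""
    if size_kb <= threshold_kb:
        return 1  # small enough to translate in one piece
    if size_kb >= 100:
        # ASSUMPTION: "more parts automatically" is modeled as one part per
        # threshold-sized chunk, never fewer than the configured split_parts.
        return max(split_parts, math.ceil(size_kb / threshold_kb))
    return split_parts
```

With the defaults, a 45 KB chapter is split into 4 parts, while a 240 KB chapter would be split into 8 under the assumed rule.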
Quality Validation (LLM-Based)
Translation quality is validated by LLM sub-agents, not regex patterns. This provides:
- Context-aware naturalness assessment
- Understanding of literary style and tone
- Detection of subtle translation issues
Validator Instructions
| Target Language | Primary Instruction | Base Instruction |
|---|---|---|
| Korean | validator_ko.md | validator_generic.md |
| Other | validator_generic.md | - |
Korean-Specific Checks
- Translationese (번역투): ~하는 것이다, ~라고 하는, etc.
- Pronoun overuse: Excessive 그녀는, 그는
- Particle chains: Awkward 의의의 patterns
- Honorific consistency: Speech level matching
Quality Score
- 90-100: Excellent - reads naturally
- 75-89: Good - minor issues
- 60-74: Acceptable - review recommended
- <60: Poor - re-translation needed
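The score bands above map directly onto a small classification function. The label strings and the function name `rating` are chosen for this sketch only.

```python
def rating(score: float) -> str:
    """Map a 0-100 quality score to the bands listed above."""
    if score >= 90:
        return "excellent"   # reads naturally
    if score >= 75:
        return "good"        # minor issues
    if score >= 60:
        return "acceptable"  # review recommended
    return "poor"            # re-translation needed
```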
Validation Workflow
- Text extracted in token-efficient format
- Chunked for parallel validation (8000 tokens each)
- LLM validators spawned in foreground batches
- Results collected directly from agent responses
- Results aggregated into final report
Language-Specific Processing
Source Language Handling
| Source | Special Handling |
|---|---|
| Japanese | Remove ruby tags, handle vertical writing |
| Chinese | Handle traditional/simplified, remove pinyin |
| Arabic/Hebrew | Handle RTL text direction |
| English | Standard processing |
Layout Conversion (Target-Based)
Key Principle: All languages default to horizontal LTR (except RTL languages).
| Target Language | Page Direction | Writing Mode | Text Direction | Notes |
|---|---|---|---|---|
| Korean (ko) | ltr | horizontal-tb | ltr | |
| English (en) | ltr | horizontal-tb | ltr | |
| Japanese (ja) | ltr | horizontal-tb | ltr | Default |
| Japanese (ja) + --vertical | rtl | vertical-rl | ltr | 縦書き (우종서) |
| Chinese (zh) | ltr | horizontal-tb | ltr | Default |
| Chinese (zh) + --vertical | rtl | vertical-rl | ltr | 縱排 (우종서) |
| Arabic (ar) | rtl | horizontal-tb | rtl | |
| Hebrew (he) | rtl | horizontal-tb | rtl | |
Note: --vertical option is only valid for Japanese (ja) and Chinese (zh) targets. It will be ignored for other languages.
See references/layout_conversion.md for complete conversion scripts.
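The target-based layout rules can be expressed as one lookup plus one regex rewrite. The sketch below is a portable Python alternative to the `sed` commands in Phase 3, not the skill's own implementation; `Layout`, `resolve_layout`, and `set_writing_mode` are hypothetical names. The RTL set includes `fa` because the Phase 3 decision table lists it alongside `ar` and `he`.

```python
import re
from typing import NamedTuple


class Layout(NamedTuple):
    page_direction: str
    writing_mode: str
    text_direction: str


RTL_LANGS = {"ar", "he", "fa"}
VERTICAL_LANGS = {"ja", "zh"}


def resolve_layout(target_lang: str, vertical: bool = False) -> Layout:
    """Pick the layout for a target language; --vertical only applies to ja/zh."""
    if target_lang in RTL_LANGS:
        return Layout("rtl", "horizontal-tb", "rtl")
    if vertical and target_lang in VERTICAL_LANGS:
        return Layout("rtl", "vertical-rl", "ltr")
    return Layout("ltr", "horizontal-tb", "ltr")


# Matches writing-mode declarations, including -webkit- and -epub- prefixed forms.
WRITING_MODE = re.compile(
    r"((?:-webkit-|-epub-)?writing-mode:\s*)(?:vertical-rl|horizontal-tb)"
)


def set_writing_mode(css: str, layout: Layout) -> str:
    """Rewrite every writing-mode declaration in `css` to the layout's mode."""
    return WRITING_MODE.sub(lambda match: match.group(1) + layout.writing_mode, css)
```

Unlike the `sed` version, this rewrite preserves whatever whitespace followed the colon, and it behaves the same on macOS and Linux.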
Custom Dictionary (Optional)
The translator works without external dictionary files. It naturally translates based on context.
Use custom dictionaries ONLY for:
- Proper nouns: names, places, organizations, brands
- Document-specific terms: proprietary terms unique to this document
Do NOT add common words - let the translator handle them naturally.
Creating a Custom Dictionary
See assets/template.json for format:
{
  "proper_nouns": { "names": { "田中太郎": "Tanaka Taro" } },
  "domain_terms": { "ProprietaryTech": "고유기술명" }
}
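How the skill consumes the dictionary is not described here. One plausible approach, shown as a sketch, is to flatten the nested categories into a single term-to-translation mapping that can be pasted into a translator prompt. The function name `flatten_glossary` and the flattening strategy are assumptions; the sample mirrors the two-category structure of the template.

```python
import json
from typing import Dict, Optional

SAMPLE = """
{
  "proper_nouns": { "names": { "田中太郎": "Tanaka Taro" } },
  "domain_terms": { "ProprietaryTech": "고유기술명" }
}
"""


def flatten_glossary(node: Dict, out: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Collapse nested category objects into one {source term: translation} dict."""
    out = {} if out is None else out
    for key, value in node.items():
        if isinstance(value, dict):
            flatten_glossary(value, out)  # descend into a category
        else:
            out[key] = value              # leaf entry: term -> translation
    return out


glossary = flatten_glossary(json.loads(SAMPLE))
```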
Academic/Technical Template
For academic or technical documents, use assets/template_academic.json.
Work Directory Structure
$WORK_DIR/
├── manifest.json           # Task manifest
├── extracted/              # Extracted EPUB contents
├── sections/               # Split large files
├── translated/             # Translated files
├── validation/             # Validation input/output files
│   ├── validation_manifest.json
│   ├── validate_001_input.txt
│   ├── validate_001_result.json
│   └── ...
├── status/                 # Task status files
└── logs/                   # Log files
Status Codes
| Status | Meaning |
|---|---|
| pending | Not started |
| in_progress | Being translated |
| completed | Done |
| failed | Error occurred |
Error Handling
| Error | Action |
|---|---|
| Extraction failure | Skip corrupted file |
| Translation timeout | Split further, retry |
| XML error | Attempt fix, report |
| Remaining source text | Re-translate or manual review |
| Low quality score | Review samples, re-translate if needed |
File Reference
| Path | Description |
|---|---|
| SKILL.md | This file |
| references/orchestrator.md | Detailed orchestrator instructions |
| references/translator_*.md | Language-specific translator prompts |
| references/translator_metadata.md | Metadata and TOC translation instruction |
| references/layout_conversion.md | Writing direction and layout conversion guide |
| references/validator_generic.md | Generic validation instruction |
| references/validator_ko.md | Korean-specific validation instruction |
| scripts/analyze_epub.py | EPUB analysis (configurable splitting) |
| scripts/split_xhtml.py | File splitting |
| scripts/merge_xhtml.py | Section merging |
| scripts/verify.py | Source text verification |
| scripts/extract_for_validation.py | Token-efficient text extraction for LLM validation |
| scripts/package_epub.sh | EPUB packaging |
| assets/template.json | Dictionary template |
| assets/template_academic.json | Academic dictionary template |