npx skills add linxule/interpretive-orchestration --skill "document-conversion"
Install specific skill from multi-skill repository
# Description
This skill should be used when users need to convert PDFs (especially ones with tables or figures), mention 'convert', 'PDF', or 'document processing', have complex academic papers to import, or ask about MinerU vs Markdownify.
# SKILL.md
name: document-conversion
description: "This skill should be used when users need to convert PDFs (especially ones with tables or figures), mention 'convert', 'PDF', or 'document processing', have complex academic papers to import, or ask about MinerU vs Markdownify."
document-conversion
Robust PDF and document conversion with intelligent tool selection. Chooses the best available conversion method based on document complexity and MCP availability.
When to Use
Use this skill when:
- User needs to convert PDFs, especially with tables or figures
- User mentions "convert", "PDF", "document processing"
- User has complex academic papers to import
- User asks about MinerU vs Markdownify
MCP Comparison
| Feature | MinerU (Optional) | Markdownify (Bundled) |
|---|---|---|
| API Key Required | Yes (MINERU_API_KEY) | No |
| PDF Accuracy | 90%+ (VLM mode) | Good |
| Table Extraction | Excellent | Basic |
| Figure Handling | Extracts + describes | Basic |
| Formula Recognition | Yes | Limited |
| Multi-column | Excellent | Good |
| Audio Transcription | No | Yes |
| Cost | Pay per page | Free |
When to Use Which
Use MinerU When:
- PDF has complex tables with merged cells
- Document has multi-column layouts
- Figures/charts need extraction
- Mathematical formulas present
- Academic paper with structured formatting
- Accuracy is critical
Use Markdownify When:
- Simple text-based documents
- Audio files need transcription
- No API key available
- Cost is a concern
- Document is straightforward
Tool Selection Logic
Is the document a PDF with tables/figures?
├── Yes, complex tables
│   └── MinerU available?
│       ├── Yes → Use MinerU (vlm mode)
│       └── No → Markdownify + manual review
├── Yes, simple formatting
│   └── Markdownify (good enough)
└── No, other format
    └── Is it audio?
        ├── Yes → Markdownify
        └── No → Markdownify (supports many formats)
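The decision tree above can be sketched as a small function. This is only an illustration under stated assumptions: the name `choose_tool`, its boolean parameters, and the returned strings are hypothetical and not part of the skill or of either MCP.

```python
def choose_tool(is_pdf: bool, has_complex_tables: bool, mineru_available: bool) -> str:
    """Return a tool recommendation that follows the decision tree (hypothetical helper)."""
    if is_pdf and has_complex_tables:
        # Complex PDFs go to MinerU when it is configured.
        if mineru_available:
            return "MinerU (vlm mode)"
        # Without MinerU, fall back and flag the output for manual review.
        return "Markdownify + manual review"
    # Simple PDFs, audio, and all other formats fall through to Markdownify.
    return "Markdownify"
```

For example, `choose_tool(True, True, False)` yields the fallback branch, which pairs Markdownify with a manual review of the extracted tables.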
Usage Examples
MinerU (Complex PDF)
Use mineru_parse to convert this academic paper:
- URL: https://example.com/paper.pdf
- Model: vlm (for 90%+ accuracy)
- Enable: formula, table recognition
Markdownify (Simple Document)
Use markdownify pdf-to-markdown for this interview guide
Batch Processing
For multiple PDFs:
1. Check which have complex tables (use MinerU)
2. Process simple ones with Markdownify
3. Queue complex ones for MinerU batch
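The triage in the steps above might look like the following sketch. It assumes each PDF has already been labeled as having complex tables or not (by hand or by some earlier check); the function name `split_batch` and the tuple layout are hypothetical.

```python
def split_batch(pdfs: list[tuple[str, bool]]) -> tuple[list[str], list[str]]:
    """Split (url, has_complex_tables) pairs into (mineru_queue, markdownify_queue)."""
    mineru_queue = [url for url, is_complex in pdfs if is_complex]
    markdownify_queue = [url for url, is_complex in pdfs if not is_complex]
    return mineru_queue, markdownify_queue


# Example: two complex papers are queued for MinerU, one simple file for Markdownify.
mineru_queue, markdownify_queue = split_batch(
    [("paper-a.pdf", True), ("notes.pdf", False), ("paper-b.pdf", True)]
)
```

Keeping the two queues separate makes it easy to process the simple documents right away while the complex ones wait for a single MinerU batch call.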
MinerU Specific Features
VLM vs Pipeline Mode
- VLM Mode: Uses vision-language model, 90%+ accuracy, slower
- Pipeline Mode: Traditional parsing, faster, lower accuracy
Page Selection
Parse only specific pages:
mineru_parse({
  url: "https://...",
  pages: "1-10,15,20-25"
})
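To make the page string concrete, here is a small sketch that expands such a spec into individual page numbers. It assumes ranges are inclusive and pages are 1-based; the source does not state how MinerU itself interprets the string, and `expand_pages` is a hypothetical helper, not part of the MinerU MCP.

```python
def expand_pages(spec: str) -> list[int]:
    """Expand a page spec such as "1-10,15,20-25" into a sorted list of page numbers."""
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"invalid range: {part!r}")
            pages.update(range(start, end + 1))  # assumed inclusive on both ends
        else:
            pages.add(int(part))
    return sorted(pages)
```

Under these assumptions, the spec in the example above covers 17 pages: 1 through 10, page 15, and 20 through 25.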
Batch Processing
Process multiple documents:
mineru_batch({
  urls: ["url1", "url2", "url3"],
  model: "vlm"
})
Output Quality Checklist
After conversion, verify:
- [ ] Text is accurately extracted
- [ ] Tables maintain structure
- [ ] Headers/sections are correct
- [ ] Figures have descriptions (if MinerU)
- [ ] Formulas are readable (if MinerU)
- [ ] No garbled text from OCR errors
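Two of the checklist items can be partly automated. The sketch below is a rough heuristic, not a substitute for reading the output: it checks that pipe-table rows in the converted Markdown have a uniform cell count, and looks for the Unicode replacement character as one sign of garbled text. Both function names are hypothetical, and escaped pipes inside cells are not handled.

```python
def table_rows_consistent(markdown: str) -> bool:
    """Return True if every run of consecutive pipe-table lines has a uniform cell count."""
    expected = None
    for line in markdown.splitlines():
        stripped = line.strip()
        if len(stripped) > 1 and stripped.startswith("|") and stripped.endswith("|"):
            # Drop the outer pipes, then count the cells between the inner ones.
            cells = len(stripped[1:-1].split("|"))
            if expected is None:
                expected = cells
            elif cells != expected:
                return False
        else:
            expected = None  # a non-table line ends the current table
    return True


def has_replacement_chars(markdown: str) -> bool:
    """Return True if the text contains U+FFFD, a common sign of decoding damage."""
    return "\ufffd" in markdown
```

A failed check is a prompt to inspect the table or passage by hand; a passed check does not guarantee the content is correct.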
Integration with Research Workflow
For Literature (Stream A)
- Identify papers to convert
- Complex papers → MinerU
- Simple papers → Markdownify
- Store in stream-a-theoretical/papers/
For Data Documents (Stream B)
- Interview transcripts → Markdownify (audio)
- PDF field notes → Markdownify or MinerU
- Store in appropriate stage folder
Fallback Options
If both tools fail or are unavailable:
- Adobe Acrobat - Export to Word
- Google Docs - Open PDF for auto-OCR
- Tesseract OCR - Command-line tool
- Manual transcription - Last resort
Related
- MCPs: MinerU (optional), Markdownify (bundled)
- Skills: interview-ingest for audio, literature-sweep for papers
- Configuration: .mcp.json defines MCP availability
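Since `.mcp.json` determines which MCPs are available, a quick check might look like the sketch below. It assumes the file has a top-level `mcpServers` object keyed by server name, and that MinerU would be registered under a name such as `mineru`; both are assumptions to verify against the actual configuration. The helper names are hypothetical.

```python
import json
from pathlib import Path


def load_mcp_config(path: str = ".mcp.json") -> dict:
    """Load the MCP config file, returning an empty dict if it does not exist."""
    config_file = Path(path)
    if not config_file.is_file():
        return {}
    return json.loads(config_file.read_text(encoding="utf-8"))


def is_server_configured(config: dict, name: str) -> bool:
    """Return True if `name` appears under the assumed "mcpServers" key."""
    return name in config.get("mcpServers", {})


# Example: decide whether MinerU can be offered ("mineru" is an assumed server name).
mineru_available = is_server_configured(load_mcp_config(), "mineru")
```

A configured server entry does not confirm that `MINERU_API_KEY` is set, so a missing key can still cause MinerU calls to fail.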
# Supported AI Coding Agents
This skill is compatible with the SKILL.md standard and works with all major AI coding agents.