eovidiu

elevenlabs-tts

# Install this skill:
npx skills add eovidiu/agents-skills --skill "elevenlabs-tts"

This installs only the elevenlabs-tts skill from the multi-skill repository eovidiu/agents-skills.

# Description

Generate audio tracks from text using the ElevenLabs eleven_v3 model. Presents available voices, allows language selection, and lets you review and modify the text before generating. Requires the ELEVENLABS_API_KEY environment variable.

# SKILL.md


name: elevenlabs-tts
description: Generate audio tracks from text using the ElevenLabs eleven_v3 model. Presents available voices, allows language selection, and lets you review and modify the text before generating. Requires the ELEVENLABS_API_KEY environment variable.


ElevenLabs Text-to-Speech

Overview

This skill generates high-quality audio tracks from text using eleven_v3, ElevenLabs' most advanced speech synthesis model. It offers natural, lifelike speech, a wide emotional range, and support for 70+ languages.

Prerequisites

Before using this skill, ensure:
1. The ELEVENLABS_API_KEY environment variable is set
2. You have an ElevenLabs account with API access
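
The first prerequisite can be checked up front, before any voices are fetched. The following is a minimal sketch, not the skill's own code: the function name `get_api_key` is hypothetical, and only the variable name `ELEVENLABS_API_KEY` and the error wording come from this document.

```python
import os


def get_api_key(env=None):
    """Return the ElevenLabs API key from the environment, or raise.

    `env` defaults to os.environ; passing a dict makes the check easy to test.
    The function name is hypothetical, not part of the skill's script.
    """
    env = os.environ if env is None else env
    key = env.get("ELEVENLABS_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "No API key found. Set ELEVENLABS_API_KEY environment variable."
        )
    return key
```

Failing early here avoids walking the user through voice and language selection only to stop at the final step.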

IMPORTANT: Interactive Workflow

When using this skill, you MUST follow this workflow using AskUserQuestion:

Step 1: Fetch available voices

python scripts/elevenlabs_tts.py voices

Step 2: Ask user to select a voice

Use AskUserQuestion to present voice options. Include 3-4 popular voices plus "Other".

Step 3: Ask user to select a language

Use AskUserQuestion to present language options:
- English (en), Spanish (es), French (fr), German (de), etc.
- Include "Other" for custom ISO 639-1 codes

Step 4: Get the text to convert

Use the text the user has already supplied, or ask them to provide it.

Step 5: Show text for confirmation

Use AskUserQuestion to display the final text and ask the user to confirm or edit it.

Step 6: Generate audio

Only after the user confirms, run:

python scripts/elevenlabs_tts.py generate "TEXT" --voice VOICE_ID --language CODE --output file.mp3 --yes
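
When the Step 6 command is launched from another program, the user's text may contain quotes or other shell metacharacters. A hedged sketch of one way to handle that is to build the command as an argument list rather than a single string. The helper name `build_generate_command` is hypothetical; the subcommand and flags are the ones shown above.

```python
import shlex


def build_generate_command(text, voice_id, language, output):
    """Assemble the argv for the generate command (hypothetical helper)."""
    return [
        "python", "scripts/elevenlabs_tts.py", "generate", text,
        "--voice", voice_id,
        "--language", language,
        "--output", output,
        "--yes",  # the user has already confirmed via AskUserQuestion
    ]


cmd = build_generate_command("Hello, world!", "VOICE_ID", "en", "hello.mp3")
# shlex.join renders a copy-pasteable, safely quoted shell line for logging.
print(shlex.join(cmd))
```

Passing `cmd` to `subprocess.run(cmd, check=True)` hands the text to the script as one argument, with no shell quoting involved.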

Script Commands

List Voices

python scripts/elevenlabs_tts.py voices

Generate Audio (with explicit options)

python scripts/elevenlabs_tts.py generate "Your text" \
  --voice VOICE_ID \
  --language CODE \
  --output file.mp3 \
  --yes

Setup Dependencies

python scripts/elevenlabs_tts.py setup

Core Capabilities

1. Voice Selection

The skill fetches all voices available to your account and displays them with:
- Voice name
- Voice ID
- Labels (gender, age, accent, use case)

Select by entering the number or voice ID.
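
Accepting either a list number or a voice ID could look like the sketch below. It is illustrative only: `resolve_voice` is a hypothetical helper, and the `name` and `voice_id` dictionary keys are assumptions about how the fetched voices are represented.

```python
def resolve_voice(choice, voices):
    """Map a user's entry (list number or voice ID) to a voice ID.

    `voices` is a list of dicts with assumed keys "name" and "voice_id".
    """
    choice = choice.strip()
    if choice.isdigit():
        index = int(choice) - 1  # numbers shown to the user start at 1
        if 0 <= index < len(voices):
            return voices[index]["voice_id"]
        raise ValueError(f"No voice numbered {choice}")
    for voice in voices:
        if voice["voice_id"] == choice:
            return choice
    raise ValueError(f"Unknown voice ID: {choice}")


# Placeholder data; real names and IDs come from the voices command.
sample_voices = [
    {"name": "Voice A", "voice_id": "id-aaa"},
    {"name": "Voice B", "voice_id": "id-bbb"},
]
```

Rejecting unknown entries at this point surfaces a "Voice Not Found" problem before any text is sent for generation.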

2. Language Selection

Eleven v3 supports 70+ languages. Common languages are presented for quick selection:

| Code | Language |
|------|----------|
| en | English |
| es | Spanish |
| fr | French |
| de | German |
| it | Italian |
| pt | Portuguese |
| zh | Chinese |
| ja | Japanese |
| ko | Korean |

Enter 0 for custom ISO 639-1 codes. See references/language_codes.md for the full list.
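
A custom code entered this way can be sanity-checked before it is passed to `--language`. The sketch below only verifies the two-letter ISO 639-1 shape; it does not confirm that eleven_v3 supports the language, which still requires references/language_codes.md. The helper name `normalize_language_code` is hypothetical.

```python
import re


def normalize_language_code(code):
    """Lower-case a user-supplied code and check it has the ISO 639-1 shape.

    This is a shape check only (two ASCII letters), not a support check.
    """
    code = code.strip().lower()
    if not re.fullmatch(r"[a-z]{2}", code):
        raise ValueError(f"Expected a 2-letter ISO 639-1 code, got {code!r}")
    return code
```

For example, `normalize_language_code(" RO ")` returns `"ro"`, while a three-letter entry such as `"eng"` raises `ValueError`.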

3. Text Review

Before generating, use AskUserQuestion to display the text for final review. The user can:
- Confirm the text as-is
- Request modifications

This prevents wasted API calls on typos or last-minute changes.

4. Audio Generation

The script calls the ElevenLabs TTS API with:
- Model: eleven_v3 (latest, most expressive)
- Output format: MP3 (44.1 kHz, 128 kbps)
- Your selected voice and language
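
For orientation, the settings listed above might map onto an HTTP request roughly as follows. This is a sketch under stated assumptions, not the script's actual implementation, which may use an SDK instead. The endpoint path, the `xi-api-key` header, the `output_format=mp3_44100_128` query value, and the `language_code` field are assumptions based on ElevenLabs' public API reference, and `build_tts_request` is a hypothetical name. The request is only constructed here, never sent.

```python
import json
import urllib.request

API_BASE = "https://api.elevenlabs.io/v1"  # assumed public API base URL


def build_tts_request(api_key, voice_id, text, language):
    """Build (but do not send) a text-to-speech request for eleven_v3."""
    # mp3_44100_128 is assumed to correspond to MP3 at 44.1 kHz, 128 kbps.
    url = f"{API_BASE}/text-to-speech/{voice_id}?output_format=mp3_44100_128"
    body = {
        "text": text,
        "model_id": "eleven_v3",
        "language_code": language,  # ISO 639-1 code such as "en"
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Sending it with `urllib.request.urlopen(request)` and writing the response bytes to a file would produce the MP3, provided the assumptions above hold for your account and API version.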

Command Reference

# List voices
python scripts/elevenlabs_tts.py voices

# Generate with explicit options (use after collecting choices via AskUserQuestion)
# --language takes an ISO 639-1 code (e.g., en, fr, de, ro)
# --yes skips the stdin confirmation (already confirmed via AskUserQuestion)
python scripts/elevenlabs_tts.py generate "Text" \
  --voice VOICE_ID \
  --language CODE \
  --output file.mp3 \
  --yes

# Setup dependencies
python scripts/elevenlabs_tts.py setup
python scripts/elevenlabs_tts.py setup --force

Audio Tags (Eleven v3 Feature)

Eleven v3 supports inline audio tags for expressive control:

[slowly] Back then... [chuckles] we had no phones.
[whispers] Just dirt roads and [coughs] big dreams.
[sad] Then it happened...

Include these tags in your text to control delivery.
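
During text review it can help to show the user which tags the text contains, alongside a tag-free preview of the words that will be spoken. The sketch below assumes a tag is any run of characters inside square brackets, as in the examples above; the real tag grammar may be narrower, and both helper names are hypothetical.

```python
import re

# Assumption: a tag is any non-empty, non-nested text in square brackets.
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def list_audio_tags(text):
    """Return the tag names found in the text, in order of appearance."""
    return TAG_PATTERN.findall(text)


def strip_audio_tags(text):
    """Return the text without tags, with leftover double spaces collapsed."""
    without_tags = TAG_PATTERN.sub("", text)
    return re.sub(r"\s{2,}", " ", without_tags).strip()


line = "[slowly] Back then... [chuckles] we had no phones."
print(list_audio_tags(line))   # ['slowly', 'chuckles']
print(strip_audio_tags(line))  # Back then... we had no phones.
```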

Workflow Integration

When using this skill in a workflow:

  1. For articles/blog posts: Generate audio narration
    python scripts/elevenlabs_tts.py generate "$(cat article.txt)" --voice VOICE_ID --language en --output narration.mp3 --yes

  2. For podcasts: Generate intro/outro segments
    python scripts/elevenlabs_tts.py generate "[upbeat] Welcome to Tech Talk, your weekly dose of innovation!" --voice VOICE_ID --language en --output intro.mp3 --yes

  3. For multilingual content: Generate in target language
    python scripts/elevenlabs_tts.py generate "Bienvenue sur notre site" --voice VOICE_ID --language fr --output french_welcome.mp3 --yes

Resources

  • references/language_codes.md: Complete list of 70+ supported language codes

Troubleshooting

API Key Not Found

Error: No API key found. Set ELEVENLABS_API_KEY environment variable.

Set your API key:

export ELEVENLABS_API_KEY="your-key-here"

Voice Not Found

Run the voices command to list the voices available to your account, then copy the exact voice ID.

Rate Limiting

If you encounter rate limits, wait before retrying. Consider upgrading your ElevenLabs plan for higher quotas.
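
"Wait before retrying" is commonly implemented as exponential backoff. The sketch below is one such approach and is not taken from the skill's script: `RateLimitError` and `retry_with_backoff` are hypothetical names, and the caller is assumed to raise `RateLimitError` when the API reports a rate limit.

```python
import time


class RateLimitError(Exception):
    """Hypothetical error the caller raises when the API signals a rate limit."""


def retry_with_backoff(call, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Run `call`, doubling the wait after each rate-limit failure.

    Waits base_delay, 2*base_delay, 4*base_delay, ... between attempts and
    re-raises the error once `attempts` tries have been used up.
    """
    for attempt in range(attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter exists so the waiting can be replaced in tests; in normal use the default `time.sleep` applies.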

Language Not Supported

Eleven v3 supports 70+ languages. If your language isn't working:
- Verify the ISO 639-1 code is correct (2-letter codes like 'en', 'fr', 'de')
- Check references/language_codes.md for supported languages
- Some languages may require specific voice types
