jgarrison929

elevenlabs-voices

0
0
# Install this skill:
npx skills add jgarrison929/openclaw-skills --skill "elevenlabs-voices"

Install specific skill from multi-skill repository

# Description

High-quality voice synthesis with 18 personas, 32 languages, sound effects, batch processing, and voice design using ElevenLabs API.

# SKILL.md


name: elevenlabs-voices
version: 2.1.0
description: High-quality voice synthesis with 18 personas, 32 languages, sound effects, batch processing, and voice design using ElevenLabs API.
tags: [tts, voice, speech, elevenlabs, audio, sound-effects, voice-design, multilingual]


ElevenLabs Voice Personas v2.1

Comprehensive voice synthesis toolkit using ElevenLabs API.

๐Ÿš€ First Run - Setup Wizard

When you first use this skill (no config.json exists), run the interactive setup wizard:

python3 scripts/setup.py

The wizard will guide you through:
1. API Key - Enter your ElevenLabs API key (required)
2. Default Voice - Choose from popular voices (Rachel, Adam, Bella, etc.)
3. Language - Set your preferred language (32 supported)
4. Audio Quality - Standard or high quality output
5. Cost Tracking - Enable usage and cost monitoring
6. Budget Limit - Optional monthly spending cap

๐Ÿ”’ Privacy: Your API key is stored locally in config.json only. It never leaves your machine and is automatically excluded from git via .gitignore.

To reconfigure at any time, simply run the setup wizard again.


โœจ Features

  • 18 Voice Personas - Carefully curated voices for different use cases
  • 32 Languages - Multi-language synthesis with the multilingual v2 model
  • Streaming Mode - Real-time audio output as it generates
  • Sound Effects (SFX) - AI-generated sound effects from text prompts
  • Batch Processing - Process multiple texts in one go
  • Cost Tracking - Monitor character usage and estimated costs
  • Voice Design - Create custom voices from descriptions
  • Pronunciation Dictionary - Custom word pronunciation rules
  • OpenClaw Integration - Works with OpenClaw's built-in TTS

๐ŸŽ™๏ธ Available Voices

Voice Accent Gender Persona Best For
rachel ๐Ÿ‡บ๐Ÿ‡ธ US female warm Conversations, tutorials
adam ๐Ÿ‡บ๐Ÿ‡ธ US male narrator Documentaries, audiobooks
bella ๐Ÿ‡บ๐Ÿ‡ธ US female professional Business, presentations
brian ๐Ÿ‡บ๐Ÿ‡ธ US male comforting Meditation, calm content
george ๐Ÿ‡ฌ๐Ÿ‡ง UK male storyteller Audiobooks, storytelling
alice ๐Ÿ‡ฌ๐Ÿ‡ง UK female educator Tutorials, explanations
callum ๐Ÿ‡บ๐Ÿ‡ธ US male trickster Playful, gaming
charlie ๐Ÿ‡ฆ๐Ÿ‡บ AU male energetic Sports, motivation
jessica ๐Ÿ‡บ๐Ÿ‡ธ US female playful Social media, casual
lily ๐Ÿ‡ฌ๐Ÿ‡ง UK female actress Drama, elegant content
matilda ๐Ÿ‡บ๐Ÿ‡ธ US female professional Corporate, news
river ๐Ÿ‡บ๐Ÿ‡ธ US neutral neutral Inclusive, informative
roger ๐Ÿ‡บ๐Ÿ‡ธ US male casual Podcasts, relaxed
daniel ๐Ÿ‡ฌ๐Ÿ‡ง UK male broadcaster News, announcements
eric ๐Ÿ‡บ๐Ÿ‡ธ US male trustworthy Business, corporate
chris ๐Ÿ‡บ๐Ÿ‡ธ US male friendly Tutorials, approachable
will ๐Ÿ‡บ๐Ÿ‡ธ US male optimist Motivation, uplifting
liam ๐Ÿ‡บ๐Ÿ‡ธ US male social YouTube, social media

๐ŸŽฏ Quick Presets

  • default โ†’ rachel (warm, friendly)
  • narrator โ†’ adam (documentaries)
  • professional โ†’ matilda (corporate)
  • storyteller โ†’ george (audiobooks)
  • educator โ†’ alice (tutorials)
  • calm โ†’ brian (meditation)
  • energetic โ†’ liam (social media)
  • trustworthy โ†’ eric (business)
  • neutral โ†’ river (inclusive)
  • british โ†’ george
  • australian โ†’ charlie
  • broadcaster โ†’ daniel (news)

๐ŸŒ Supported Languages (32)

The multilingual v2 model supports these languages:

Code Language Code Language
en English pl Polish
de German nl Dutch
es Spanish sv Swedish
fr French da Danish
it Italian fi Finnish
pt Portuguese no Norwegian
ru Russian tr Turkish
uk Ukrainian cs Czech
ja Japanese sk Slovak
ko Korean hu Hungarian
zh Chinese ro Romanian
ar Arabic bg Bulgarian
hi Hindi hr Croatian
ta Tamil el Greek
id Indonesian ms Malay
vi Vietnamese th Thai
# Synthesize in German
python3 tts.py --text "Guten Tag!" --voice rachel --lang de

# Synthesize in French
python3 tts.py --text "Bonjour le monde!" --voice adam --lang fr

# List all languages
python3 tts.py --languages

๐Ÿ’ป CLI Usage

Basic Text-to-Speech

# List all voices
python3 scripts/tts.py --list

# Generate speech
python3 scripts/tts.py --text "Hello world" --voice rachel --output hello.mp3

# Use a preset
python3 scripts/tts.py --text "Breaking news..." --voice broadcaster --output news.mp3

# Multi-language
python3 scripts/tts.py --text "Bonjour!" --voice rachel --lang fr --output french.mp3

Streaming Mode

Generate audio with real-time streaming (good for long texts):

# Stream audio as it generates
python3 scripts/tts.py --text "This is a long story..." --voice adam --stream

# Streaming with custom output
python3 scripts/tts.py --text "Chapter one..." --voice george --stream --output chapter1.mp3

Batch Processing

Process multiple texts from a file:

# From newline-separated text file
python3 scripts/tts.py --batch texts.txt --voice rachel --output-dir ./audio

# From JSON file
python3 scripts/tts.py --batch batch.json --output-dir ./output

JSON batch format:

[
  {"text": "First line", "voice": "rachel", "output": "line1.mp3"},
  {"text": "Second line", "voice": "adam", "output": "line2.mp3"},
  {"text": "Third line"}
]

Simple text format (one per line):

Hello, this is the first sentence.
This is the second sentence.
And this is the third.

Usage Statistics

# Show usage stats and cost estimates
python3 scripts/tts.py --stats

# Reset statistics
python3 scripts/tts.py --reset-stats

๐ŸŽต Sound Effects (SFX)

Generate AI-powered sound effects from text descriptions:

# Generate a sound effect
python3 scripts/sfx.py --prompt "Thunder rumbling in the distance"

# With specific duration (0.5-22 seconds)
python3 scripts/sfx.py --prompt "Cat meowing" --duration 3 --output cat.mp3

# Adjust prompt influence (0.0-1.0)
python3 scripts/sfx.py --prompt "Footsteps on gravel" --influence 0.5

# Batch SFX generation
python3 scripts/sfx.py --batch sounds.json --output-dir ./sfx

# Show prompt examples
python3 scripts/sfx.py --examples

Example prompts:
- "Thunder rumbling in the distance"
- "Cat purring contentedly"
- "Typing on a mechanical keyboard"
- "Spaceship engine humming"
- "Coffee shop background chatter"


๐ŸŽจ Voice Design

Create custom voices from text descriptions:

# Basic voice design
python3 scripts/voice-design.py --gender female --age middle_aged --accent american \
  --description "A warm, motherly voice"

# With custom preview text
python3 scripts/voice-design.py --gender male --age young --accent british \
  --text "Welcome to the adventure!" --output preview.mp3

# Save to your ElevenLabs library
python3 scripts/voice-design.py --gender female --age young --accent american \
  --description "Energetic podcast host" --save "MyHost"

# List all design options
python3 scripts/voice-design.py --options

Voice Design Options:

Option Values
Gender male, female, neutral
Age young, middle_aged, old
Accent american, british, african, australian, indian, latin, middle_eastern, scandinavian, eastern_european
Accent Strength 0.3-2.0 (subtle to strong)

๐Ÿ“– Pronunciation Dictionary

Customize how words are pronounced:

Edit pronunciations.json:

{
  "rules": [
    {
      "word": "OpenClaw",
      "replacement": "Open Claw",
      "comment": "Pronounce as two words"
    },
    {
      "word": "API",
      "replacement": "A P I",
      "comment": "Spell out acronym"
    }
  ]
}

Usage:

# Pronunciations are applied automatically
python3 scripts/tts.py --text "The OpenClaw API is great" --voice rachel

# Disable pronunciations
python3 scripts/tts.py --text "The API is great" --voice rachel --no-pronunciations

๐Ÿ’ฐ Cost Tracking

The skill tracks your character usage and estimates costs:

python3 scripts/tts.py --stats

Output:

๐Ÿ“Š ElevenLabs Usage Statistics

  Total Characters: 15,230
  Total Requests:   42
  Since:            2024-01-15

๐Ÿ’ฐ Estimated Costs:
  Starter    $4.57 ($0.30/1k chars)
  Creator    $3.66 ($0.24/1k chars)
  Pro        $2.74 ($0.18/1k chars)
  Scale      $1.68 ($0.11/1k chars)

๐Ÿค– OpenClaw TTS Integration

Using with OpenClaw's Built-in TTS

OpenClaw has built-in TTS support that can use ElevenLabs. Configure in ~/.openclaw/openclaw.json:

{
  "tts": {
    "enabled": true,
    "provider": "elevenlabs",
    "elevenlabs": {
      "apiKey": "your-api-key-here",
      "voice": "rachel",
      "model": "eleven_multilingual_v2"
    }
  }
}

Triggering TTS in Chat

In OpenClaw conversations:
- Use /tts on to enable automatic TTS
- Use the tts tool directly for one-off speech
- Request "read this aloud" or "speak this"

Using Skill Scripts from OpenClaw

# OpenClaw can run these scripts directly
exec python3 /path/to/skills/elevenlabs-voices/scripts/tts.py --text "Hello" --voice rachel

โš™๏ธ Configuration

The scripts look for API key in this order:

  1. ELEVEN_API_KEY or ELEVENLABS_API_KEY environment variable
  2. OpenClaw config (~/.openclaw/openclaw.json โ†’ tts.elevenlabs.apiKey)
  3. Skill-local .env file

Create .env file:

echo 'ELEVEN_API_KEY=your-key-here' > .env

๐ŸŽ›๏ธ Voice Settings

Each voice has tuned settings for optimal output:

Setting Range Description
stability 0.0-1.0 Higher = consistent, lower = expressive
similarity_boost 0.0-1.0 How closely to match original voice
style 0.0-1.0 Exaggeration of speaking style

๐Ÿ“ Triggers

  • "use {voice_name} voice"
  • "speak as {persona}"
  • "list voices"
  • "voice settings"
  • "generate sound effect"
  • "design a voice"

๐Ÿ“ Files

elevenlabs-voices/
โ”œโ”€โ”€ SKILL.md              # This documentation
โ”œโ”€โ”€ README.md             # Quick start guide
โ”œโ”€โ”€ config.json           # Your local config (created by setup, in .gitignore)
โ”œโ”€โ”€ voices.json           # Voice definitions & settings
โ”œโ”€โ”€ pronunciations.json   # Custom pronunciation rules
โ”œโ”€โ”€ examples.md           # Detailed usage examples
โ”œโ”€โ”€ scripts/
โ”‚   โ”œโ”€โ”€ setup.py          # Interactive setup wizard
โ”‚   โ”œโ”€โ”€ tts.py            # Main TTS script
โ”‚   โ”œโ”€โ”€ sfx.py            # Sound effects generator
โ”‚   โ””โ”€โ”€ voice-design.py   # Voice design tool
โ””โ”€โ”€ references/
    โ””โ”€โ”€ voice-guide.md    # Voice selection guide


๐Ÿ“‹ Changelog

v2.1.0

  • Added interactive setup wizard (scripts/setup.py)
  • Onboarding guides through API key, voice, language, quality, and budget settings
  • Config stored locally in config.json (added to .gitignore)
  • Professional, privacy-focused setup experience

v2.0.0

  • Added 32 language support with --lang parameter
  • Added streaming mode with --stream flag
  • Added sound effects generation (sfx.py)
  • Added batch processing with --batch flag
  • Added cost tracking with --stats flag
  • Added voice design tool (voice-design.py)
  • Added pronunciation dictionary support
  • Added OpenClaw TTS integration documentation
  • Improved error handling and progress output

# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents:

Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.