OpenAI's general-purpose speech recognition model. Supports 99 languages, transcription, translation to English, and language identification. Six model sizes from tiny (39M params) to large (1550M...
Global content at the speed of AI. This skill covers the full spectrum of AI-powered localization: translation, cultural adaptation, voice localization, visual adaptation, and the orchestration...
The art and science of creating AI-powered digital presenters, avatars, and synthetic spokespersons. This skill covers HeyGen, Synthesia, D-ID, Tavus, and the emerging landscape of photorealistic...
Handle multilingual translation tasks with quality and cultural sensitivity.
Internationalization and localization practices for multilingual applications.
Framework for state-of-the-art sentence, text, and image embeddings. Provides 5000+ pre-trained models for semantic similarity, clustering, and retrieval. Supports multilingual, domain-specific,...
Language-independent tokenizer treating text as raw Unicode. Supports BPE and Unigram algorithms. Fast (50k sentences/sec), lightweight (6MB memory), deterministic vocabulary. Used by T5, ALBERT,...
Internationalization and localization patterns for multi-language applications. Use when implementing translation systems, locale-specific formatting, RTL layouts, or managing language switching....
Design principles from Bret Victor's "Magic Ink" for building information software. Use when designing dashboards, search results, calendars, finance apps, or any interface where users seek...
Workflow for translating entire books using AI agents. Split markdown books into chapters, batch translate 5 at a time, and verify completeness. Works with any language pair.
Add screenshots to a PR using browser automation. Captures UI, saves to docs/screenshot/, and updates PR body with GitHub CDN URLs.
Local speech-to-text using faster-whisper. 4-6x faster than OpenAI Whisper with identical accuracy; GPU acceleration enables ~20x realtime transcription. Supports standard and distilled models...
Expert in internationalization (i18n), multi-language support, and localization.
Text-to-Speech using Doubao (Volcano Engine) API. Use when converting text to natural-sounding speech, generating audio files from text, listing available TTS voices, or synthesizing speech with...
Select and optimize embedding models for semantic search and RAG applications. Use when choosing embedding models, implementing chunking strategies, or optimizing embedding quality for specific domains.