Search: audio-classification

openai-whisper-api 0.00

moltbot / moltbot-openai-whisper-api exact

Transcribe audio via OpenAI Audio Transcriptions API (Whisper).

★ 65,001 ai

ai assistant clawd own-your-data

openai-whisper-api 0.00

yueweilu / ai-agent-skills-openai-whisper-api exact

Transcribe audio via OpenAI Audio Transcriptions API (Whisper).

★ 0 ai

ASR 0.00

AnswerZhao / agent-skills-glm-skills-asr exact

Implement speech-to-text (ASR/automatic speech recognition) capabilities using the z-ai-web-dev-sdk. Use this skill when the user needs to transcribe audio files, convert speech to text, build...

★ 23 tools

clip-aware-embeddings 0.00

erichowens / some-claude-skills-clip-aware-embeddings exact

Semantic image-text matching with CLIP and alternatives. Use for image search, zero-shot classification, similarity matching. NOT for counting objects, fine-grained classification (celebrities,...

★ 20 ai

Convert PCM to WAV (see scripts/pcm_to_wav.py) 0.00

ngxtm / devkit-convert-pcm-to-wav-see-scripts-pcm-to-wav-py exact

Generate AI-powered podcast-style audio narratives using Azure OpenAI's GPT Realtime Mini model via WebSocket. Use when building text-to-speech features, audio narrative generation, podcast...

★ 0 ai

agent ai automation claude

multimodal-ai 0.00

omer-metin / skills-for-antigravity-multimodal-ai exact

Patterns for building multimodal AI applications that combine text, images, audio, and video. Covers vision APIs, audio transcription, and unified pipelines. Use when "multimodal AI, vision API,...

★ 5 ai

ai-agents antigravity antigravity-ide skills

markitdown 0.00

jackspace / claudeskillz-markitdown exact

Convert various file formats (PDF, Office documents, images, audio, web content, structured data) to Markdown optimized for LLM processing. Use when converting documents to markdown, extracting...

★ 8 ai

agentic-coding ai-skills automation bioinformatics

markitdown 0.00

0xbeedao / agentic-tools-markitdown exact

Convert various file formats (PDF, Office documents, images, audio, web content, structured data) to Markdown optimized for LLM processing. Use when converting documents to markdown, extracting...

★ 0 data

markitdown 0.00

ovachiever / droid-tings-markitdown exact

Convert various file formats (PDF, Office documents, images, audio, web content, structured data) to Markdown optimized for LLM processing. Use when converting documents to markdown, extracting...

★ 19 data

esphome-box3-builder 0.00

nodnarbnitram / claude-code-extensions-esphome-box3-builder exact

This skill should be used when the user asks to "configure esp32-s3-box-3", "set up box-3", "create box-3 voice assistant", "display lambda on box-3", "configure ili9xxx display", "set up gt911...

★ 3 development

ai-multimodal 0.00

samhvw8 / dot-claude-ai-multimodal exact

Multimodal AI processing via Google Gemini API (2M tokens context). Capabilities: audio (transcription, 9.5hr max, summarization, music analysis), images (captioning, OCR, object detection,...

★ 5 data

good-TTvideo2text 0.00

ImGoodBai / goodable-good-ttvideo2text exact

Extract audio from short videos (Douyin/TikTok) and transcribe to text with timestamps. Use when user provides video URL and needs audio transcription.

★ 84 ai

base44 claudecode codeagent lovable

text-to-speech 0.00

martinholovsky / claude-skills-generator-text-to-speech exact

Expert skill for implementing text-to-speech with Kokoro TTS. Covers voice synthesis, audio generation, performance optimization, and secure handling of generated audio for JARVIS voice assistant.

★ 20 tools

elevenlabs 0.00

digitalsamba / claude-code-video-toolkit-elevenlabs exact

Generate AI voiceovers, sound effects, and music using ElevenLabs APIs. Use when creating audio content for videos, podcasts, or games. Triggers include generating voiceovers, narration, dialogue,...

★ 23 ai

ai-video-generator claude-code developer-tools elevenlabs

video-transcript-downloader 0.00

steipete / agent-scripts-video-transcript-downloader exact

Download videos, audio, subtitles, and clean paragraph-style transcripts from YouTube and any other yt-dlp supported site. Use when asked to “download this video”, “save this clip”, “rip audio”,...

★ 1,503 ai

ai-agents

video-transcript-downloader 0.00

devskale / skale-skills-video-transcript-downloader exact

Download videos, audio, subtitles, and clean paragraph-style transcripts from YouTube and any other yt-dlp supported site. Use when asked to “download this video”, “save this clip”, “rip audio”,...

★ 0 development

Video Processor 0.00

iamzhihuix / happy-claude-skills-video-processor exact

Download and process videos from YouTube and other platforms. Supports video download, audio extraction, format conversion (mp4, webm), and Whisper transcription. Use when user mentions YouTube...

★ 241 development

video-transcript-downloader 0.00

lancenunes / codex-skills-video-transcript-downloader exact

Download videos, audio, subtitles, and clean paragraph-style transcripts from YouTube and any other yt-dlp supported site. Use when asked to “download this video”, “save this clip”, “rip audio”,...

★ 1 ai

agent-skills agents ai-development ai-tools

yt-dlp-downloader 0.00

MapleShaw / yt-dlp-downloader-skill exact

Download videos from YouTube, Bilibili, Twitter, and thousands of other sites using yt-dlp. Use when the user provides a video URL and wants to download it, extract audio (MP3), download...

★ 141 development

nlp-engineer 0.00

404kidwiz / claude-supercode-skills-nlp-engineer exact

Expert in Natural Language Processing, designing systems for text classification, NER, translation, and LLM integration using Hugging Face, spaCy, and LangChain. Use when building NLP pipelines,...

★ 6 ai
