ai-avatar-video

by @inference-sh in AI & LLM

# Install this skill:

npx skills add inference-sh/skills --skill "ai-avatar-video"

Install specific skill from multi-skill repository

# Description

# SKILL.md

name: ai-avatar-video
description: |
Create AI avatar and talking head videos with OmniHuman, Fabric, PixVerse via inference.sh CLI.
Models: OmniHuman 1.5, OmniHuman 1.0, Fabric 1.0, PixVerse Lipsync.
Capabilities: audio-driven avatars, lipsync videos, talking head generation, virtual presenters.
Use for: AI presenters, explainer videos, virtual influencers, dubbing, marketing videos.
Triggers: ai avatar, talking head, lipsync, avatar video, virtual presenter,
ai spokesperson, audio driven video, heygen alternative, synthesia alternative,
talking avatar, lip sync, video avatar, ai presenter, digital human
allowed-tools: Bash(infsh *)

AI Avatar & Talking Head Videos

Create AI avatars and talking head videos via inference.sh CLI.

Quick Start

curl -fsSL https://cli.inference.sh | sh && infsh login

# Create avatar video from image + audio
infsh app run bytedance/omnihuman-1-5 --input '{
  "image_url": "https://portrait.jpg",
  "audio_url": "https://speech.mp3"
}'

Available Models

Model	App ID	Best For
OmniHuman 1.5	`bytedance/omnihuman-1-5`	Multi-character, best quality
OmniHuman 1.0	`bytedance/omnihuman-1-0`	Single character
Fabric 1.0	`falai/fabric-1-0`	Image talks with lipsync
PixVerse Lipsync	`falai/pixverse-lipsync`	Highly realistic

Search Avatar Apps

infsh app list --search "omnihuman"
infsh app list --search "lipsync"
infsh app list --search "fabric"

Examples

OmniHuman 1.5 (Multi-Character)

infsh app run bytedance/omnihuman-1-5 --input '{
  "image_url": "https://portrait.jpg",
  "audio_url": "https://speech.mp3"
}'

Supports specifying which character to drive in multi-person images.

Fabric 1.0 (Image Talks)

infsh app run falai/fabric-1-0 --input '{
  "image_url": "https://face.jpg",
  "audio_url": "https://audio.mp3"
}'

PixVerse Lipsync

infsh app run falai/pixverse-lipsync --input '{
  "image_url": "https://portrait.jpg",
  "audio_url": "https://speech.mp3"
}'

Generates highly realistic lipsync from any audio.

Full Workflow: TTS + Avatar

# 1. Generate speech from text
infsh app run infsh/kokoro-tts --input '{
  "text": "Welcome to our product demo. Today I will show you..."
}' > speech.json

# 2. Create avatar video with the speech
infsh app run bytedance/omnihuman-1-5 --input '{
  "image_url": "https://presenter-photo.jpg",
  "audio_url": "<audio-url-from-step-1>"
}'

Full Workflow: Dub Video in Another Language

# 1. Transcribe original video
infsh app run infsh/fast-whisper-large-v3 --input '{"audio_url": "https://video.mp4"}' > transcript.json

# 2. Translate text (manually or with an LLM)

# 3. Generate speech in new language
infsh app run infsh/kokoro-tts --input '{"text": "<translated-text>"}' > new_speech.json

# 4. Lipsync the original video with new audio
infsh app run infsh/latentsync-1-6 --input '{
  "video_url": "https://original-video.mp4",
  "audio_url": "<new-audio-url>"
}'

Use Cases

Marketing: Product demos with AI presenter
Education: Course videos, explainers
Localization: Dub content in multiple languages
Social Media: Consistent virtual influencer
Corporate: Training videos, announcements

Tips

Use high-quality portrait photos (front-facing, good lighting)
Audio should be clear with minimal background noise
OmniHuman 1.5 supports multiple people in one image
LatentSync is best for syncing existing videos to new audio

# Full platform skill (all 150+ apps)
npx skills add inference-sh/skills@inference-sh

# Text-to-speech (generate audio for avatars)
npx skills add inference-sh/skills@text-to-speech

# Speech-to-text (transcribe for dubbing)
npx skills add inference-sh/skills@speech-to-text

# Video generation
npx skills add inference-sh/skills@ai-video-generation

# Image generation (create avatar images)
npx skills add inference-sh/skills@ai-image-generation

Browse all video apps: infsh app list --category video

# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents:

⚡ Amp 🚀 Antigravity 🤖 Claude Code 🦀 Clawdbot 📝 Codex ▶️ Cursor 🤖 Droid 💎 Gemini CLI 🐙 GitHub Copilot 🪿 Goose 📊 Kilo Code 🔧 Kiro CLI 💻 OpenCode 🦘 Roo Code 🌲 Trae 🏄 Windsurf

Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.