P-Video Avatar - The Essentials

Last updated: April 29, 2026

P-Video Avatar by Pruna AI turns any portrait image into a talking avatar video. Upload a photo, provide a script or your own audio, and the model generates a lip-synced video in seconds.

Overview

This model does one thing very well: it makes a still portrait speak. You give it a face and some audio, and it returns a video where the character's mouth moves in sync with the words. It works equally well with a photorealistic person, an illustrated character, or a stylized avatar.

There are two ways to work with it. The fastest path is writing a voice script directly in the interface and letting the model handle the speech synthesis. It has 30 built-in voices and supports 10 languages, so you can go from script to video without any external tools. If you need a specific voice or already have a recorded narration, you can upload your own audio file instead and the model will sync the lip movements to it.

What It Does

Animates any portrait with accurate lip sync matched to the audio
Synthesizes speech from a written script using 30 built-in voices and 10 languages
Accepts a custom audio upload for full control over voice and delivery
Lets you shape tone, pacing, and emotion through a Voice Prompt
Outputs in 720p or 1080p

How to Use It

Portrait Image

The starting point for every generation. Use a clean, well-lit portrait where the face is clearly visible and looking roughly toward the camera. A cropped headshot or bust shot works best. Extreme side angles or heavy shadows will reduce lip sync quality.

Voice Script vs. Custom Audio

These are the two ways to provide audio, and they are mutually exclusive. If you upload an audio file, it takes priority, and both the Voice Script and the Voice Prompt are ignored. Use the built-in Voice Script when you want to iterate quickly, generate in multiple languages, or keep the workflow fully inside Scenario. Use custom audio when you need a specific voice, recorded narration, or a language not in the built-in list.
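The precedence rule can be sketched in a few lines of Python. This is only an illustration of the behavior described here: the function and field names (resolve_audio_source, voice_script, audio_file) are hypothetical and do not come from any documented Scenario or Pruna AI API.

```python
from typing import Optional


def resolve_audio_source(voice_script: Optional[str], audio_file: Optional[str]) -> dict:
    """Decide which input drives the speech.

    Hypothetical helper: it only models the precedence rule from this guide.
    """
    if audio_file:
        # An uploaded file takes priority, so any script is dropped.
        return {"source": "custom_audio", "audio_file": audio_file, "voice_script": None}
    if voice_script and voice_script.strip():
        # No file was uploaded, so the built-in speech synthesis reads the script.
        return {"source": "voice_script", "audio_file": None, "voice_script": voice_script.strip()}
    raise ValueError("Provide either a Voice Script or a custom audio file.")
```

Calling resolve_audio_source("Hello there", "intro.mp3") returns the custom audio branch, which makes it easy to spot a script that would be silently ignored before a generation is started.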

Voice Prompt

This controls how the Voice Script is delivered, not what it says. Use it to set the emotional tone and pacing of the speech. It has no effect when you upload custom audio.

Examples of what to write here:

Warm and conversational. Slow down on key points. Sound like you are talking to a friend.
Confident and authoritative. Clear enunciation. Short pauses between sentences.
Enthusiastic and energetic. Slightly faster pace to build excitement.

Video Prompt

Optional. Controls the visual environment around the avatar: background, lighting, and mood. If you leave it empty, the model picks sensible defaults. Use it when you need a specific setting or look.

Examples:

Professional office background, soft natural light, shallow depth of field, photorealistic
Clean white studio, even lighting, neutral and sharp
Outdoor setting, warm golden hour light, slightly blurred background

Voice and Language

30 voices are available (14 female, 16 male) and 10 languages: English US, English UK, Spanish, French, German, Italian, Portuguese (Brazil), Japanese, Korean, and Hindi. Always set the language to match the language of your script. A mismatch produces broken pronunciation.
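If you prepare scripts in bulk, a small check like the one below can catch an unsupported language before you generate anything. It is a minimal sketch: the language names are copied from the list above, while SUPPORTED_LANGUAGES and check_language are hypothetical names, not identifiers from the product.

```python
# The ten built-in languages listed in this guide.
SUPPORTED_LANGUAGES = (
    "English US", "English UK", "Spanish", "French", "German",
    "Italian", "Portuguese (Brazil)", "Japanese", "Korean", "Hindi",
)


def check_language(language: str) -> str:
    """Return the language unchanged if it is built in, otherwise raise.

    Hypothetical helper; it cannot detect a script written in a different
    language than the one selected, so that still needs a manual check.
    """
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(
            f"{language!r} is not a built-in language; upload custom audio instead."
        )
    return language
```

A language outside this list is a case for custom audio, as described in the Voice Script vs. Custom Audio section.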

Examples

Game character introduction

A fantasy RPG character greeting the player at the start of a quest.

Portrait: Illustrated fantasy warrior, frontal, neutral background
Voice Script: Brave traveler. I have waited long for someone worthy of this quest. The shadow grows in the east, and only you can stop it. Are you ready?
Voice Prompt: Deep, serious tone. Slow and deliberate. Slight gravelly quality. As if speaking to a hero.
Voice: Algenib (Male)
Video Prompt: Dark stone castle interior, flickering torchlight, dramatic shadows

Product walkthrough

A spokesperson explaining a product feature for a marketing video.

Portrait: Professional photo of a person in business attire
Voice Script: With our new dashboard, you can track every project in real time. No more spreadsheets. No more missed deadlines. Just clarity, all in one place.
Voice Prompt: Confident and friendly. Conversational pace. Smile in the voice.
Voice: Despina (Female)
Video Prompt: Modern office background, clean and bright, soft natural light

Multilingual localization

The same character delivering the same message in Spanish for a Latin American audience.

Portrait: Same portrait as the original video
Voice Script: Con nuestro nuevo panel, puedes seguir cada proyecto en tiempo real. Sin hojas de cálculo. Sin plazos perdidos.
Voice Language: Spanish
Voice: Laomedeia (Female)

Custom audio with recorded narration

An animated character avatar for a podcast intro, using a pre-recorded voiceover.

Portrait: Stylized illustrated character, facing forward
Audio: Uploaded MP3 of the recorded podcast intro (clean, voice-only, no background music)
Video Prompt: Podcast studio environment, warm ambient lighting, microphone visible in background

Tips for Better Results

Test with a short script before committing to a long one. Run a 5 to 10 second version first to verify the voice, tone, and lip sync quality. It is much cheaper to adjust before scaling up.
Write your script to be spoken, not read. Use short sentences. Spell out numbers and abbreviations. Use punctuation deliberately to control pacing. If you would stumble reading it aloud, simplify it.
For custom audio, use a clean voice-only recording. Background music or noise in the audio file interferes with lip sync. Record a dry voiceover first, then add music as a separate step after the avatar video is done.
Generate at 720p while iterating, 1080p for finals. 720p is faster and sufficient to evaluate the voice, script, and visual result. Switch to 1080p only when you have confirmed the output is ready.
Use Seed to reproduce a result. Once you have a generation you like, copy the seed and reuse it with small changes to the script or prompt. This keeps iterations focused and predictable.
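The draft-then-final workflow from these tips can be modeled as two settings objects. This is a sketch under stated assumptions: AvatarSettings, finalize, and their fields are hypothetical names used for illustration, and the source does not say whether a seed reproduces the same result at a different resolution.

```python
from dataclasses import dataclass, replace
from typing import Optional

ALLOWED_RESOLUTIONS = ("720p", "1080p")  # the two output sizes named in this guide


@dataclass(frozen=True)
class AvatarSettings:
    """Hypothetical container for the settings discussed in this guide."""
    voice_script: str
    resolution: str = "720p"    # iterate at 720p by default
    seed: Optional[int] = None  # None means no fixed seed

    def __post_init__(self) -> None:
        if self.resolution not in ALLOWED_RESOLUTIONS:
            raise ValueError(f"Unsupported resolution: {self.resolution}")


def finalize(draft: AvatarSettings, seed: int) -> AvatarSettings:
    """Copy a draft you like, pin its seed, and switch to 1080p."""
    return replace(draft, resolution="1080p", seed=seed)
```

Keeping the draft immutable means the 720p settings stay available for further iteration after the final version has been derived from them.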

Known Limitations

Near-frontal faces only. The face needs to be roughly facing the camera. Strong side profiles or extreme head angles produce inconsistent lip sync.
Custom audio overrides the Voice Script and Voice Prompt entirely. When an audio file is uploaded, both are ignored, and there is no way to blend uploaded audio with synthesized speech. Tone and delivery can only be controlled through the Voice Prompt when using built-in voice synthesis.
Maximum resolution is 1080p. For higher-resolution output, generate at 1080p and run the result through a dedicated upscaler as a separate step.