Introduction

The MM Audio 2 suite is a significant step forward for audio generation within Scenario. Released in December 2024, these models let creators turn silent visuals into immersive soundscapes through high-fidelity, synchronized audio production.
MM Audio 2 (Video-to-Audio)
MM Audio 2 is a specialized engine designed to generate synchronized soundtracks for silent footage. By analyzing visual cues and an optional text prompt, it creates immersive, realistic audio that is timed to the action on screen.
Core Function: Brings digital content to life by adding audio that matches the motion and context of an uploaded video.
Visual Analysis: The engine evaluates visual elements so that each sound effect lands at the moment the corresponding action appears on screen.
MM Audio 2 Text-To-Audio (SFX)
For projects starting without video, MM Audio 2 Text-To-Audio (SFX) serves as an advanced generator that transforms descriptive prompts into realistic sound effects.
Versatility: Instantly generates soundscapes for games, animation, and film.
Creative Detail: By simply detailing a specific scene or action, creators can bridge the gap between imagination and a finished audio asset.
Understanding the Parameters
Both MM Audio 2 models utilize a shared set of controls to fine-tune the resulting audio output:
Duration: Sets the length of the generated audio clip.
Steps: Sets the number of iterations used for audio generation; the default is 25.
Guidance: Controls how strictly the model follows your prompt or visual cues; the default is 4.5.
Mask Away Clip: A toggle used to refine the audio focus during generation.
Seed: A numerical identifier for the generation. Reusing the same seed with the same settings allows for reproducible results, while changing it produces slight variations.
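
The controls above can be pictured as a small settings object. The following is a minimal Python sketch, not Scenario's actual API: the class name `MMAudioSettings`, every field name, and the `to_payload` helper are hypothetical. Only the defaults for Steps (25) and Guidance (4.5) come from this article; the duration value, the Mask Away Clip default, and the treatment of an unset seed are assumptions made for illustration.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class MMAudioSettings:
    """Hypothetical container for the shared MM Audio 2 controls."""

    duration_seconds: float = 8.0  # assumed value; the article states no default
    steps: int = 25                # default stated in the article
    guidance: float = 4.5          # default stated in the article
    mask_away_clip: bool = False   # assumed default; not stated in the article
    seed: Optional[int] = None     # None means "no fixed seed" (assumption)

    def __post_init__(self) -> None:
        # Basic sanity checks; the real limits are not documented here.
        if self.duration_seconds <= 0:
            raise ValueError("duration_seconds must be positive")
        if self.steps < 1:
            raise ValueError("steps must be at least 1")
        if self.guidance < 0:
            raise ValueError("guidance must be non-negative")

    def to_payload(self) -> dict:
        """Return the settings as a plain dict, omitting an unset seed."""
        payload = asdict(self)
        if payload["seed"] is None:
            del payload["seed"]
        return payload


# Default settings, and a variant that pins the seed for repeatable output.
defaults = MMAudioSettings()
reproducible = MMAudioSettings(duration_seconds=5.0, seed=1234)

print(defaults.to_payload())
print(reproducible.to_payload())
```

Keeping the seed optional mirrors the two uses described above: leave it unset to explore variations, or fix it once a result is worth reproducing.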