MiniMax Music: The Essentials
Last updated: April 20, 2026
Covers MiniMax Music 2.6 (model_minimax-music-2-6) and MiniMax Music 2.5 (model_minimax-music-2-5)

MiniMax Music generates complete, mixed songs from text: lyrics, instrumentation, vocals, and mastering in a single generation. The 2.6 model is the current recommended version and adds instrumental mode, auto-generated lyrics, and a required style prompt alongside the optional lyrics field. The 2.5 model remains available for workflows that require lyrics to be provided explicitly.
MiniMax Music 2.0 has been deprecated. It has been succeeded by 2.5 and 2.6. If you are using model_minimax-music-2-0, migrate to model_minimax-music-2-6.
Which Model Should I Use?
| Model | ID | Lyrics field | Best for |
| MiniMax Music 2.6 | model_minimax-music-2-6 | Optional. Can be empty, provided, or auto-generated from the prompt. | All use cases. Supports instrumental mode and auto-lyrics. The default choice for new projects. |
| MiniMax Music 2.5 | model_minimax-music-2-5 | Required. Must always be provided. | Workflows where lyrics are always written manually and the explicit requirement is preferred for consistency. |
Use 2.6 for all new work. The key upgrade over 2.5 is flexibility: you can generate a fully instrumental track without providing lyrics, or let the model write lyrics for you from the style prompt. The structure tag system and audio quality settings are identical between versions.
Parameters
MiniMax Music 2.6
The Style Prompt is required and accepts up to 2,000 characters. It defines the sonic character of the track: genre, mood, tempo in BPM, key, vocal style, and instrumentation. For instrumental tracks, this is the only required field.
Lyrics is optional in 2.6. Write the song's text and structure here, using structure tags like [Verse] and [Chorus] alongside the lyric lines. Leave it empty if you are using Auto Lyrics or Instrumental Mode.
Instrumental Mode removes all vocals from the output. When enabled, the Style Prompt drives the entire generation. Do not provide lyrics when this is active.
Auto Lyrics tells the model to write lyrics automatically from the Style Prompt. Enable it and leave the Lyrics field empty. It cannot be combined with Instrumental Mode.
Sample Rate sets the sampling rate of the output audio. Options are 16,000, 24,000, 32,000, and 44,100 Hz. The default is 44,100 Hz (CD quality).
Bitrate sets how much data the encoded file uses per second of audio. Options are 32,000, 64,000, 128,000, and 256,000 bps. The default is 256,000 bps, the highest available option.
MiniMax Music 2.5
Lyrics is required in 2.5 and must always be provided. Write the song's text and structure tags here. The model will not run without it.
Style Prompt is optional and supplements the Lyrics field with genre, mood, tempo, and instrumentation context.
Sample Rate and Bitrate work the same as in 2.6.
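The parameter rules above can be checked before a request is sent. The following is a minimal sketch, not an official client: the function name validate_request and the "2.6" / "2.5" shorthand are hypothetical, and only the limits stated in this section are enforced.

```python
# Allowed values as listed in the parameter descriptions above.
SAMPLE_RATES = {16000, 24000, 32000, 44100}   # Hz
BITRATES = {32000, 64000, 128000, 256000}     # bps
MAX_PROMPT_CHARS = 2000                       # limit stated for the 2.6 Style Prompt


def validate_request(model, prompt="", lyrics="", sample_rate=44100, bitrate=256000):
    """Return a list of problems; an empty list means the inputs look valid.

    `model` is "2.6" or "2.5" (hypothetical shorthand, not an API value).
    """
    problems = []
    if model == "2.6":
        # 2.6: the Style Prompt is required, Lyrics is optional.
        if not prompt.strip():
            problems.append("2.6 requires a style prompt")
        if len(prompt) > MAX_PROMPT_CHARS:
            problems.append("style prompt exceeds 2,000 characters")
    elif model == "2.5":
        # 2.5: Lyrics is required, the Style Prompt is optional.
        if not lyrics.strip():
            problems.append("2.5 requires lyrics")
    else:
        problems.append(f"unknown model: {model}")
    if sample_rate not in SAMPLE_RATES:
        problems.append(f"unsupported sample rate: {sample_rate}")
    if bitrate not in BITRATES:
        problems.append(f"unsupported bitrate: {bitrate}")
    return problems
```

Returning a list rather than raising on the first problem lets a form or script report every issue in one pass.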
Structure Tags
Both models use structure tags embedded in the lyrics field to define the song's arrangement. Tags tell the model where each section begins and what type of section it is. The model uses this blueprint to vary energy, instrumentation, and vocal style accordingly.
| Tag | Section type |
| | Opening instrumental or mood-setting section |
| [Verse] | Narrative section with moderate energy |
| | Build-up to the hook, rising energy |
| [Chorus] | High-energy emotional peak, the main hook |
| | Short, repeated melodic phrase (often used in pop and hip hop) |
| | Contrasting section before the final chorus |
| [Solo] | Instrumental break, typically a featured instrument |
| | Rising intensity leading into a drop or chorus |
| | High-energy release after a build, common in EDM |
| [Inst] | Instrumental passage, no vocals |
| | Brief transition or atmospheric pause between sections |
| | Rhythmic or dynamic break, stripped-back moment |
| | Short connector between two sections |
| [Outro] | Closing section, resolves or fades the track |
Use line breaks (\n) between lyric lines within a section. Use double line breaks (\n\n) for longer pauses between sections.
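The line-break convention can be applied mechanically when lyrics are assembled from structured data. This is a small sketch under the assumption that each tag sits on its own line above its lyric lines, as in the examples in this guide; build_lyrics is a hypothetical helper name.

```python
def build_lyrics(sections):
    """Join (tag, lines) pairs into a single lyrics string.

    Lines within a section are separated by one line break,
    and sections are separated by a double line break.
    """
    blocks = []
    for tag, lines in sections:
        # Put the structure tag on its own line, then the lyric lines.
        blocks.append("\n".join([f"[{tag}]", *lines]))
    return "\n\n".join(blocks)


lyrics = build_lyrics([
    ("Verse", ["City lights are fading", "I'm still on the road"]),
    ("Chorus", ["We're burning bright tonight"]),
])
```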
Instrumental Mode (2.6 only)
Set isInstrumental: true in MiniMax Music 2.6 to generate a full track with no vocals. The prompt field drives the entire generation. Do not provide a lyrics value when using this mode.
Instrumental example (2.6):
prompt: "Cinematic orchestral, slow build, 60 BPM, strings and brass, epic and emotional"
isInstrumental: true
sampleRate: 44100
bitrate: 256000
This mode replaces the older workaround of placing [Inst] tags throughout the lyrics field to suppress vocals. The isInstrumental flag is more reliable and produces cleaner results.
Auto Lyrics (2.6 only)
Set lyricsOptimizer: true with an empty lyrics field in MiniMax Music 2.6 to have the model generate lyrics directly from the style prompt. The generated lyrics will follow the genre and mood described in prompt and will include appropriate structure tags.
Auto-lyrics example (2.6):
prompt: "Upbeat pop, 120 BPM, major key, summer anthem about chasing dreams"
lyrics: ""
lyricsOptimizer: true
Do not set both lyricsOptimizer: true and isInstrumental: true at the same time. These modes are mutually exclusive. Auto-lyrics generates vocals; instrumental mode suppresses them.
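One way to avoid sending both flags is to model the three vocal modes as a single choice, so the conflicting combination cannot be expressed. The sketch below is illustrative: VocalMode and mode_fields are hypothetical names, while the field names lyrics, lyricsOptimizer, and isInstrumental are the ones used in this guide.

```python
from enum import Enum


class VocalMode(Enum):
    MANUAL_LYRICS = "manual"        # lyrics written by hand
    AUTO_LYRICS = "auto"            # model writes lyrics from the prompt
    INSTRUMENTAL = "instrumental"   # no vocals at all


def mode_fields(mode, lyrics=""):
    """Return the lyrics-related request fields for one vocal mode (2.6)."""
    if mode is VocalMode.INSTRUMENTAL:
        if lyrics:
            raise ValueError("do not provide lyrics in instrumental mode")
        return {"isInstrumental": True}
    if mode is VocalMode.AUTO_LYRICS:
        if lyrics:
            raise ValueError("auto lyrics expects an empty lyrics field")
        return {"lyrics": "", "lyricsOptimizer": True}
    if not lyrics.strip():
        raise ValueError("manual mode needs lyrics")
    return {"lyrics": lyrics}


# Merge with the style prompt to form the full set of fields.
request = {"prompt": "Upbeat pop, 120 BPM, major key", **mode_fields(VocalMode.AUTO_LYRICS)}
```

Because a caller picks exactly one mode, lyricsOptimizer and isInstrumental never appear in the same request.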
The Parentheses Rule
Any instruction you write inside the lyrics field that is not enclosed in parentheses will be interpreted as text to be sung. This is the most common source of unexpected output.
| What you write | What the model does |
| guitar solo here | Attempts to sing "guitar solo here" as a lyric line |
| (guitar solo here) | Interprets it as a performance instruction and plays a guitar solo |
| [Solo] | Signals a solo section via structure tag (preferred for section-level instructions) |
Use parentheses for inline performance notes within a section. Use structure tags for section-level arrangement cues.
Correct usage:
[Chorus]
We're burning bright tonight
The sky is ours to own
(key change, more intense)
Nothing can bring us down
[Solo]
(epic guitar solo, 8 bars)
[Outro]
We're burning bright (echoing vocals, fade out)
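A quick pre-flight check can catch instructions that were left outside parentheses. The following is a rough heuristic sketch, not part of the product: the keyword list is a made-up starting point, and a genuine lyric that happens to contain one of these words will also be flagged.

```python
import re

# Hypothetical hint words; extend or trim them for your own material.
INSTRUCTION_HINTS = ("solo", "fade out", "key change", "instrumental")


def unparenthesized_instructions(lyrics):
    """Return lyric lines that look like instructions but would be sung."""
    flagged = []
    for line in lyrics.splitlines():
        stripped = line.strip()
        # Skip blank lines and lines that are only a structure tag, e.g. [Solo].
        if not stripped or re.fullmatch(r"\[[^\]]+\]", stripped):
            continue
        # Remove parenthesized notes; whatever remains is treated as sung text.
        outside = re.sub(r"\([^)]*\)", "", stripped).lower()
        if any(hint in outside for hint in INSTRUCTION_HINTS):
            flagged.append(stripped)
    return flagged
```

Running it on a draft that contains the bare line "guitar solo here" returns that line, while "(guitar solo here)" and "[Solo]" pass.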
Use Cases
Game soundtracks: Use isInstrumental: true with a genre and mood prompt for background music. Generate multiple variations with different structure tags to cover different game states (exploration, combat, menu, victory).
Content creation and social media: Use lyricsOptimizer for quick vocal tracks. Provide a mood and topic in the prompt and let the model handle lyrics and arrangement for short-form video and campaign content.
Music production prototyping: Generate a full arrangement to test whether a genre and mood concept works before investing in a full production. The 2.5 to 4.5 minute output gives enough material to evaluate the full arc of the song.
Film and video scoring: Instrumental mode combined with a detailed style prompt can produce cues for scenes, trailers, and title sequences. Use structure tags to control where the energy peaks.
Educational and explainer content: Generate branded background music with no vocal interference. Instrumental mode at lower bitrate keeps file sizes small for web and e-learning delivery.
Advertising: Short, genre-specific instrumental tracks for product videos and ads. Use the prompt field to match brand tone and set tempo to align with the video's cut rhythm.
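For delivery contexts where file size matters, the effect of the bitrate setting can be estimated with simple arithmetic. This sketch assumes a constant bitrate and ignores container and metadata overhead, so real files will differ somewhat; estimated_size_mb is a hypothetical helper.

```python
def estimated_size_mb(bitrate_bps, duration_seconds):
    """Approximate file size in megabytes (1 MB = 1,000,000 bytes).

    bits = bitrate * duration, bytes = bits / 8.
    """
    return bitrate_bps * duration_seconds / 8 / 1_000_000


three_minutes = 180
high = estimated_size_mb(256000, three_minutes)  # about 5.76 MB
low = estimated_size_mb(64000, three_minutes)    # about 1.44 MB
```

Under these assumptions, dropping from 256,000 bps to 64,000 bps cuts a three-minute track to roughly a quarter of the size.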
Tips for Better Results
Always provide a detailed style prompt. In 2.6, the prompt field is required and drives the entire sonic character. Include genre, mood, BPM, key, vocal style, and key instruments for the most accurate output.
Use specific sub-genres over broad labels. "Synthwave with analog arpeggios and gated reverb drums" produces more targeted results than "electronic." The model responds well to production-specific vocabulary.
Keep lyric sections to 2 to 4 lines. Longer sections give the model too much material to fit into a single musical phrase and can produce rushed or melodically awkward results.
Specify BPM for rhythm-dependent genres. Hip hop, EDM, and dance tracks benefit greatly from explicit tempo. "Hip hop, 90 BPM" aligns the beat pattern to a specific groove in a way that "hip hop" alone does not.
Use isInstrumental rather than [Inst] tags for fully vocal-free tracks. Scattering [Inst] tags throughout a lyrics field is an older workaround. The isInstrumental: true flag in 2.6 is cleaner and more reliable.
Change one variable at a time when iterating. If the output is close but not right, adjust a single element (tempo, mood word, or one structure tag) per run rather than rewriting the whole prompt. This makes it easier to identify what caused the change.
Plan for track length variation. Typical output runs between 2:30 and 4:45. The model does not accept an explicit duration parameter. Use the density of the lyrics field to guide length: more sections and more lines generally produce longer tracks.
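The guidance on section size (2 to 4 lines) and on lyric density can be checked with a small parser. This is a sketch under simple assumptions: a section starts at a line that is only a bracketed tag, and a line wrapped entirely in parentheses is a performance note rather than sung text. The function names are hypothetical.

```python
import re

TAG_LINE = re.compile(r"^\[[^\]]+\]$")


def section_line_counts(lyrics):
    """Return (tag, sung_line_count) pairs in order of appearance."""
    counts = []
    for raw in lyrics.splitlines():
        line = raw.strip()
        if not line:
            continue
        if TAG_LINE.match(line):
            counts.append([line, 0])  # start a new section
        elif counts and not (line.startswith("(") and line.endswith(")")):
            counts[-1][1] += 1        # count a sung line in the current section
    return [(tag, n) for tag, n in counts]


def long_sections(lyrics, max_lines=4):
    """Return the tags of sections with more sung lines than max_lines."""
    return [tag for tag, n in section_line_counts(lyrics) if n > max_lines]
```

The total number of sections and lines from section_line_counts also gives a rough handle on relative track length when comparing drafts.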
Known Limitations
Single mixed output only, no stems. The model delivers one MP3 file containing the full mix: vocals, instruments, and mastering. Individual instrument or vocal stems are not available.
Duration cannot be specified directly. Output length is influenced by the number of sections and lines in the lyrics field but cannot be set to an exact value. Expect variation of 30 to 60 seconds around your target length.
Instrumentation varies between runs. The same prompt and lyrics may produce different instrument choices across runs. Use a seed if you need to control variation, but note that minor differences are inherent to the model.
Best performance in English and Mandarin. Other languages are supported but may produce less natural vocal phrasing and less precise lyric pronunciation.
Lyrics field is required in 2.5. MiniMax Music 2.5 will return an error if the lyrics field is empty or missing. For instrumental output on 2.5, use [Inst] tags throughout the lyrics field as a workaround, or migrate to 2.6 for the cleaner isInstrumental flag.
MiniMax Music 2.0 is deprecated. The model is still visible on the platform but is no longer recommended. Migrate to 2.6 for all new work. The parameter interface is different, so prompts will require updating.