Multi-language VOD Dubbing Guide
This guide walks you through MK.IO's AI dubbing pipeline to automatically generate Spanish, German, and French audio tracks from English-language video content. At the end, your source asset streams with all four audio tracks (original English plus three dubbed languages) available for viewer selection.
Requirements and limitations: VOD dubbing supports MP4 content only. Source assets must be encoded using the mp4-v4 format, which produces a .mpd DASH manifest. Assets encoded with older presets that produce a .ismc manifest are not supported. For a complete list of supported languages and specifications, see the AI workflows documentation.
Prerequisites
- Active MK.IO project access
- API token: available in Organisation Settings → API Tokens
- English-language MP4 video (2–10 minutes recommended for testing)
- Azure Storage account connected to MK.IO
Starting point: If you already have an mp4-v4 encoded asset with a .mpd manifest, skip to Step 2.
Step 1: Upload and encode source video
1.1 Create an asset
- In the MK.IO dashboard, navigate to Assets → + Add Asset.
- Select your storage location.
- Enter the asset details:
  - Asset name: english-source-video
  - Container: videos
  - Storage account: select your Azure Storage account
- Upload your MP4 file. This guide uses a file named english-video-demo.mp4.
- Select Upload and wait for it to complete.
1.2 Create an encoding transform and job
- Navigate to Video Processing → Transforms → + Create Transform.
- Configure the transform:
  - Name: encode-streaming
  - Type: Encoding
  - Preset: H.264 Multiple Bitrate 1080p
- Select Create.
- Navigate to Video Processing → Jobs → + Create Job.
- Configure the job:
  - Name: encode-english-source
  - Transform: encode-streaming
  - Input asset name: english-source-video (select english-video-demo.mp4)
  - Output asset name: english-encoded
- Select Create and monitor the job status.
- Wait for the job status to show Finished.
Why encode first? Encoding generates the .mpd manifest file that track insertion operations require.
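The manifest requirement can be checked before starting a dubbing job. The following is a minimal sketch, assuming you already have the list of filenames in the asset's container (for example, from your storage tooling); the function names are hypothetical and are not part of the MK.IO API.

```python
from typing import Iterable


def manifest_kind(filenames: Iterable[str]) -> str:
    """Classify an asset by the manifest file found in its container."""
    names = [name.lower() for name in filenames]
    if any(name.endswith(".mpd") for name in names):
        return "dash"    # mp4-v4 encode: usable for dubbing and track insertion
    if any(name.endswith(".ismc") for name in names):
        return "legacy"  # older preset: not supported, re-encode first
    return "none"        # no manifest found: the asset is not encoded yet


def is_dubbing_ready(filenames: Iterable[str]) -> bool:
    """True only when the asset contains a .mpd DASH manifest."""
    return manifest_kind(filenames) == "dash"
```

An asset that reports "legacy" or "none" needs to go through Step 1 before you continue.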
Step 2: Create multi-language dubs
2.1 Create a dubbing transform
A dubbing transform defines the source language and the target languages for the AI dubbing pipeline.
| Parameter | Description |
|---|---|
| @odata.type | Must be set to #MediaKind.AIPipelinePreset |
| pipeline name | Predefined_ACSVodSpeechToSpeech |
| language | Source language code (for example, en-US) |
| targetLanguages | Array of target language codes |
| speakerCount | Number of speakers in the source audio (auto for automatic detection) |
| personalVoice | true to preserve the original speaker's voice characteristics |
- Navigate to Video Processing → Transforms → + Create Transform.
- Enter a Transform name, for example dubbing-transform.
- Select AI workflow as the transform type.
- Select Predefined_ACSVodSpeechToSpeech from the AI pipeline dropdown.
- Configure the pipeline settings:
  - Language: en-US
  - Translate to: select es-ES, de-DE, and fr-FR
  - Speaker count: auto
  - Personal voice: leave unchecked to use synthetic voices
- Select Create.
Notes on the configuration:
- Target languages: A single dubbing job generates all three language dubs simultaneously.
- Personal voice: Setting personalVoice to false uses a synthetic voice. Set it to true to attempt to preserve each speaker's voice characteristics across languages.
2.2 Create a dubbing job
- Navigate to Video Processing → Jobs → + Create Job.
- Enter a Job name, for example dub-english-source.
- Under Select a transform, choose dubbing-transform.
- Under Select input, set:
  - Input asset name: english-encoded
  - Filename: english-video-demo_320x180_400k.mp4
- Under Configure output, set:
  - Asset storage account: select your Azure Storage account
  - Output asset name: enter a new name, for example dubbed-audio (MK.IO creates this asset automatically)
- Select Create.
Input file: Specify any single encoded bitrate variant from the source asset (for example, english-video-demo_320x180_400k.mp4). All variants contain the audio track required for dubbing — the lowest bitrate file is fine.
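Picking the lowest bitrate variant can be automated. The sketch below assumes that encoded variants follow the `<name>_<width>x<height>_<kbps>k.mp4` pattern seen in the example filename; that pattern is inferred from this guide's single example rather than from a documented naming rule, and the function name is hypothetical.

```python
import re
from typing import Iterable, Optional

# Assumed variant naming: <name>_<width>x<height>_<kbps>k.mp4
VARIANT_RE = re.compile(r"_(\d+)x(\d+)_(\d+)k\.mp4$")


def lowest_bitrate_variant(filenames: Iterable[str]) -> Optional[str]:
    """Return the variant with the smallest bitrate, or None if none match."""
    best_name, best_kbps = None, None
    for name in filenames:
        match = VARIANT_RE.search(name)
        if match is None:
            continue  # skip manifests and other non-variant files
        kbps = int(match.group(3))
        if best_kbps is None or kbps < best_kbps:
            best_name, best_kbps = name, kbps
    return best_name
```

If your encoded files use a different naming scheme, adjust the regular expression or choose the input file manually.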
Monitor progress: Navigate to Video Processing → Jobs and wait for the job status to show Finished.
Output files: When the job completes, the dubbed-audio asset contains three files:
- english-video-demo_320x180_400k.mp4_es-ES.mp4: Spanish
- english-video-demo_320x180_400k.mp4_de-DE.mp4: German
- english-video-demo_320x180_400k.mp4_fr-FR.mp4: French
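The output names above appear to follow the pattern `<input filename>_<language code>.mp4`. As a minimal sketch, assuming that pattern holds for every target language, the expected filenames can be derived ahead of time; the helper names are hypothetical.

```python
from typing import Dict, Iterable


def dubbed_filename(input_filename: str, language_code: str) -> str:
    """Build the dubbed output name: <input filename>_<language code>.mp4."""
    return f"{input_filename}_{language_code}.mp4"


def expected_dub_outputs(input_filename: str,
                         target_languages: Iterable[str]) -> Dict[str, str]:
    """Map each target language code to its expected output filename."""
    return {code: dubbed_filename(input_filename, code)
            for code in target_languages}


outputs = expected_dub_outputs(
    "english-video-demo_320x180_400k.mp4", ["es-ES", "de-DE", "fr-FR"]
)
```

Comparing this mapping with the actual contents of the dubbed-audio asset is a quick way to confirm that every language was generated.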
Step 3: Insert audio tracks
This step adds the dubbed audio tracks to the encoded video asset, making all languages available to viewers.
3.1 Create track insertion transforms
Track insertion transforms must be created via the API. Track insertion jobs can be created using either the UI or API — see step 3.2.
Create three transforms — one per language. Each transform defines the track name, display name, and language code for the inserted audio.
Spanish insert transform:
https://api.mk.io/api/v1/projects/<YOUR_PROJECT_NAME>/media/transforms/<TRANSFORM_NAME>
- <YOUR_PROJECT_NAME>: Your unique project identifier
- <TRANSFORM_NAME>: Transform name (for example, spanish-insert)
{
"properties": {
"description": "Insert Spanish audio",
"outputs": [
{
"preset": {
"tracks": [
{
"@odata.type": "#MediaKind.AudioTrack",
"trackName": "audio-spanish",
"displayName": "Español (AI Dubbed)",
"languageCode": "es-ES"
}
],
"@odata.type": "#MediaKind.TrackInserterPreset"
},
"relativePriority": "Normal"
}
]
}
}
Repeat for German and French:
- German transform: set trackName to audio-german, displayName to Deutsch (AI Dubbed), and languageCode to de-DE
- French transform: set trackName to audio-french, displayName to Français (AI Dubbed), and languageCode to fr-FR
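Because the three request bodies differ only in three fields, they can be generated from one template. The sketch below builds the URL and JSON body for each transform using the values given above; it deliberately does not send the requests, since the HTTP method and authentication header are not covered in this section. The function names are hypothetical, and the German and French descriptions are assumed by analogy with the Spanish one.

```python
import json
from urllib.parse import quote

API_BASE = "https://api.mk.io/api/v1"

# transform name -> (language label, trackName, displayName, languageCode)
TRACKS = {
    "spanish-insert": ("Spanish", "audio-spanish", "Español (AI Dubbed)", "es-ES"),
    "german-insert": ("German", "audio-german", "Deutsch (AI Dubbed)", "de-DE"),
    "french-insert": ("French", "audio-french", "Français (AI Dubbed)", "fr-FR"),
}


def transform_url(project_name: str, transform_name: str) -> str:
    """Build the transform URL, percent-encoding both path segments."""
    return (f"{API_BASE}/projects/{quote(project_name, safe='')}"
            f"/media/transforms/{quote(transform_name, safe='')}")


def insert_transform_body(transform_name: str) -> dict:
    """Build the track insertion request body for one language."""
    label, track_name, display_name, language_code = TRACKS[transform_name]
    return {
        "properties": {
            "description": f"Insert {label} audio",
            "outputs": [{
                "preset": {
                    "tracks": [{
                        "@odata.type": "#MediaKind.AudioTrack",
                        "trackName": track_name,
                        "displayName": display_name,
                        "languageCode": language_code,
                    }],
                    "@odata.type": "#MediaKind.TrackInserterPreset",
                },
                "relativePriority": "Normal",
            }],
        }
    }


if __name__ == "__main__":
    for name in TRACKS:
        print(transform_url("<YOUR_PROJECT_NAME>", name))
        print(json.dumps(insert_transform_body(name), ensure_ascii=False, indent=2))
```

Each printed body can be pasted into the API client of your choice.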
3.2 Create track insertion jobs
Create one job per language. Each job inserts the corresponding dubbed audio file into the english-encoded asset.
Repeat the following for each language (Spanish, German, French):
- Navigate to Video Processing → Jobs → + Create Job.
- Enter a Job name, for example job-insert-spanish.
- Under Select a transform, choose the corresponding insertion transform (for example, spanish-insert).
- Under Select input, set:
  - Input asset name: dubbed-audio
  - Filename: the dubbed file for this language (for example, english-video-demo_320x180_400k.mp4_es-ES.mp4)
- Under Configure output, set:
  - Output asset name: english-encoded
- Select Create.
| Language | Transform | Input filename | Output asset |
|---|---|---|---|
| Spanish | spanish-insert | ..._es-ES.mp4 | english-encoded |
| German | german-insert | ..._de-DE.mp4 | english-encoded |
| French | french-insert | ..._fr-FR.mp4 | english-encoded |
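The table above can also be expressed as data, which helps when you script or double-check the three jobs. This is a minimal sketch: the InsertionJob class and plan_insertion_jobs function are hypothetical, and the German and French job names are assumed to follow the job-insert-spanish pattern.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class InsertionJob:
    job_name: str
    transform: str
    input_asset: str
    input_filename: str
    output_asset: str


# (language name used in job and transform names, language code)
LANGUAGES = (("spanish", "es-ES"), ("german", "de-DE"), ("french", "fr-FR"))


def plan_insertion_jobs(dub_input_filename: str,
                        input_asset: str = "dubbed-audio",
                        output_asset: str = "english-encoded") -> List[InsertionJob]:
    """Return one job description per language, mirroring the table rows."""
    return [
        InsertionJob(
            job_name=f"job-insert-{name}",
            transform=f"{name}-insert",
            input_asset=input_asset,
            input_filename=f"{dub_input_filename}_{code}.mp4",
            output_asset=output_asset,
        )
        for name, code in LANGUAGES
    ]


jobs = plan_insertion_jobs("english-video-demo_320x180_400k.mp4")
```

All three jobs write to the same output asset, so each one adds a track to english-encoded rather than creating a new asset.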
Monitor progress: Wait for all three jobs to show Finished.
Verify: Navigate to Assets → english-encoded and open the Tracks section. You should see three audio tracks for Spanish, German, and French.
Step 4: Configure streaming
4.1 Create a streaming endpoint
- Navigate to Streaming Endpoints → + Create Streaming Endpoint.
- Configure the endpoint:
  - Name: production
  - Base URL: content
  - Type: Dedicated
- Select Create, then Start.
4.2 Create a streaming locator
- Navigate to Assets and select english-encoded.
- Select the production endpoint created in step 4.1.
- Add a streaming locator:
  - Name: live
  - Policy: Predefined_DownloadAndClearStreaming
- Copy the playback URLs provided.
4.3 Test multi-language playback
- Select the embedded player in the asset details.
- Use the audio track selector to switch between languages.
- Confirm that Spanish, German, and French audio tracks are selectable alongside the original English track.
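The same check can be made without a player by inspecting the DASH manifest. The sketch below parses manifest text with the Python standard library and lists the languages of its audio adaptation sets. It assumes you have already downloaded the .mpd content from the DASH playback URL, and that the manifest carries the same language codes used in this guide; a real manifest may use a shorter form such as es. The sample manifest is a hand-written illustration, not MK.IO output.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, List

DASH_NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}


def audio_languages(mpd_xml: str) -> List[str]:
    """Return the lang values of all audio AdaptationSets, in document order."""
    root = ET.fromstring(mpd_xml)
    languages: List[str] = []
    for aset in root.iterfind(".//mpd:AdaptationSet", DASH_NS):
        is_audio = (aset.get("contentType") == "audio"
                    or aset.get("mimeType", "").startswith("audio/"))
        lang = aset.get("lang")
        if is_audio and lang and lang not in languages:
            languages.append(lang)
    return languages


def missing_languages(mpd_xml: str, expected: Iterable[str]) -> List[str]:
    """Return the expected language codes that the manifest does not list."""
    found = set(audio_languages(mpd_xml))
    return [code for code in expected if code not in found]


# Hand-written sample manifest for illustration only.
SAMPLE_MPD = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period>
    <AdaptationSet contentType="video" mimeType="video/mp4"/>
    <AdaptationSet contentType="audio" mimeType="audio/mp4" lang="en-US"/>
    <AdaptationSet contentType="audio" mimeType="audio/mp4" lang="es-ES"/>
    <AdaptationSet mimeType="audio/mp4" lang="de-DE"/>
  </Period>
</MPD>"""
```

For the sample manifest, missing_languages(SAMPLE_MPD, ["es-ES", "de-DE", "fr-FR"]) reports that the French track is absent, which would indicate that the French insertion job has not completed.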