Multi-language VOD Dubbing Guide

This guide walks you through MK.IO's AI dubbing pipeline to automatically generate Spanish, German, and French audio tracks from English-language video content. At the end, your asset streams with all four audio tracks (the original English plus three dubbed languages), each available for viewer selection.

Requirements and limitations: VOD dubbing supports MP4 content only. Source assets must be encoded using the mp4-v4 format, which produces a .mpd DASH manifest. Assets encoded with older presets that produce a .ismc manifest are not supported. For a complete list of supported languages and specifications, see the AI workflows documentation.
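The manifest requirement above can be checked before you start. The following is a minimal sketch, assuming you already have the list of filenames in the asset's container (how you obtain that list is not covered here). The function name `manifest_kind` and the sample filenames are hypothetical.

```python
from pathlib import PurePosixPath


def manifest_kind(asset_files):
    """Classify an asset by the manifest extension found in its file list.

    Returns "mpd" for a DASH manifest (supported for dubbing), "ismc" for
    the older manifest type (not supported), or None if neither is present.
    """
    suffixes = {PurePosixPath(name).suffix.lower() for name in asset_files}
    if ".mpd" in suffixes:
        return "mpd"
    if ".ismc" in suffixes:
        return "ismc"
    return None


if __name__ == "__main__":
    # Hypothetical file listing for an encoded asset.
    files = ["english-video-demo.mpd", "english-video-demo_320x180_400k.mp4"]
    print("ready for dubbing:", manifest_kind(files) == "mpd")
```

If the result is "ismc" or None, re-encode the source as described in Step 1 before creating the dubbing job.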

Prerequisites

  • Active MK.IO project access
  • API token: available in Organisation Settings > API Tokens
  • English-language MP4 video (2–10 minutes recommended for testing)
  • Azure Storage account connected to MK.IO

Starting point: If you already have an mp4-v4 encoded asset with a .mpd manifest, skip to Step 2.


Step 1: Upload and encode source video

1.1 Create an asset

  1. In the MK.IO dashboard, navigate to Assets > + Add Asset.
  2. Select your storage location.
  3. Enter the asset details:
    • Asset name: english-source-video
    • Container: videos
    • Storage account: select your Azure Storage account
  4. Upload your MP4 file. This guide uses a file named english-video-demo.mp4.
  5. Select Upload and wait for it to complete.

1.2 Create an encoding transform and job

  1. Navigate to Video Processing > Transforms > + Create Transform.
  2. Configure the transform:
    • Name: encode-streaming
    • Type: Encoding
    • Preset: H.264 Multiple Bitrate 1080p
  3. Select Create.
  4. Navigate to Video Processing > Jobs > + Create Job.
  5. Configure the job:
    • Name: encode-english-source
    • Transform: encode-streaming
    • Input asset name: english-source-video — select english-video-demo.mp4
    • Output asset name: english-encoded
  6. Select Create and monitor the job status.
     [Figure: encoding job status view in the MK.IO dashboard]
  7. Wait for the job status to show Finished.

Why encode first? Encoding generates the .mpd manifest file that track insertion operations require.


Step 2: Create multi-language dubs

2.1 Create a dubbing transform

A dubbing transform defines the source language and the target languages for the AI dubbing pipeline.

Parameter        Description
@odata.type      Must be set to #MediaKind.AIPipelinePreset
pipeline name    Predefined_ACSVodSpeechToSpeech
language         Source language code (for example, en-US)
targetLanguages  Array of target language codes
speakerCount     Number of speakers in the source audio (auto for automatic detection)
personalVoice    true to preserve the original speaker's voice characteristics

  1. Navigate to Video Processing > Transforms > + Create Transform.
  2. Enter a Transform name, for example dubbing-transform.
  3. Select AI workflow as the transform type.
  4. Select Predefined_ACSVodSpeechToSpeech from the AI pipeline dropdown.
  5. Configure the pipeline settings:
    • Language: en-US
    • Translate to: select es-ES, de-DE, and fr-FR
    • Speaker count: auto
    • Personal voice: leave unchecked to use synthetic voices
  6. Select Create.

Notes on the configuration:

  • Target languages: A single dubbing job generates all three language dubs simultaneously.
  • Personal voice: Setting personalVoice to false uses a synthetic voice. Set to true to attempt to preserve each speaker's voice characteristics across languages.
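If you keep transform settings in a script, it can help to sanity-check them before entering them in the dashboard. The sketch below collects the parameters from the table above into a flat record. It is not the API request body: this guide does not show how the AI pipeline preset is nested in a request, so no nesting is reproduced here. The function name `dubbing_settings`, the "pipeline name" key, and the validation rules are assumptions for illustration.

```python
def dubbing_settings(language, target_languages, speaker_count="auto",
                     personal_voice=False):
    """Return a flat record of dubbing parameters after basic sanity checks.

    The checks are illustrative assumptions, not rules enforced by MK.IO.
    """
    targets = list(target_languages)
    if not targets:
        raise ValueError("at least one target language is required")
    if len(set(targets)) != len(targets):
        raise ValueError("duplicate target language")
    if language in targets:
        raise ValueError("source language must not also be a target")
    is_count = (isinstance(speaker_count, int)
                and not isinstance(speaker_count, bool)
                and speaker_count > 0)
    if speaker_count != "auto" and not is_count:
        raise ValueError("speaker_count must be 'auto' or a positive integer")
    return {
        "@odata.type": "#MediaKind.AIPipelinePreset",
        "pipeline name": "Predefined_ACSVodSpeechToSpeech",
        "language": language,
        "targetLanguages": targets,
        "speakerCount": speaker_count,
        "personalVoice": bool(personal_voice),
    }


if __name__ == "__main__":
    # The configuration used in this guide.
    print(dubbing_settings("en-US", ["es-ES", "de-DE", "fr-FR"]))
```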

2.2 Create a dubbing job

  1. Navigate to Video Processing > Jobs > + Create Job.
  2. Enter a Job name, for example dub-english-source.
  3. Under Select a transform, choose dubbing-transform.
  4. Under Select input, set:
    • Input asset name: english-encoded
    • Filename: english-video-demo_320x180_400k.mp4
  5. Under Configure output, set:
    • Asset storage account: select your Azure Storage account
    • Output asset name: enter a new name, for example dubbed-audio — MK.IO creates this asset automatically
  6. Select Create.

Input file: Specify any single encoded bitrate variant from the source asset (for example, english-video-demo_320x180_400k.mp4). All variants contain the audio track required for dubbing — the lowest bitrate file is fine.
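Picking the lowest bitrate variant can be automated when an asset holds many renditions. This is a minimal sketch, assuming encoded files follow the `<name>_<width>x<height>_<kbps>k.mp4` pattern seen in the example filename above. That pattern is inferred from a single example, and every filename in the sample list other than the 400k one is hypothetical.

```python
import re

# Matches the trailing "_<width>x<height>_<kbps>k.mp4" part of a filename.
_VARIANT = re.compile(r"_(\d+)x(\d+)_(\d+)k\.mp4$")


def lowest_bitrate_variant(filenames):
    """Return the filename with the smallest bitrate suffix.

    Files that do not match the assumed naming pattern (manifests,
    dubbed outputs, and so on) are ignored.
    """
    candidates = []
    for name in filenames:
        match = _VARIANT.search(name)
        if match:
            candidates.append((int(match.group(3)), name))
    if not candidates:
        raise ValueError("no file matches <name>_<W>x<H>_<kbps>k.mp4")
    return min(candidates)[1]


if __name__ == "__main__":
    files = [
        "english-video-demo.mpd",
        "english-video-demo_1280x720_3400k.mp4",
        "english-video-demo_320x180_400k.mp4",
    ]
    print(lowest_bitrate_variant(files))
```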

Monitor progress: Navigate to Video Processing > Jobs and wait for the job status to show Finished.

Output files: When the job completes, the dubbed-audio asset contains three files:

  • english-video-demo_320x180_400k.mp4_es-ES.mp4 — Spanish
  • english-video-demo_320x180_400k.mp4_de-DE.mp4 — German
  • english-video-demo_320x180_400k.mp4_fr-FR.mp4 — French
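The output names listed above follow a visible pattern: the input filename, an underscore, the target language code, then `.mp4`. A small sketch of that pattern, useful when you script the insertion jobs in Step 3, is shown below. The function name `dubbed_filenames` is hypothetical, and the pattern is inferred from the three filenames in this guide rather than from a published naming rule.

```python
def dubbed_filenames(input_filename, target_languages):
    """Map each target language code to its expected dubbed output filename."""
    return {lang: f"{input_filename}_{lang}.mp4" for lang in target_languages}


if __name__ == "__main__":
    names = dubbed_filenames(
        "english-video-demo_320x180_400k.mp4", ["es-ES", "de-DE", "fr-FR"]
    )
    for lang, name in names.items():
        print(lang, "->", name)
```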

Step 3: Insert audio tracks

This step adds the dubbed audio tracks to the encoded video asset, making all languages available to viewers.

3.1 Create track insertion transforms

Track insertion transforms must be created via the API. Track insertion jobs can be created using either the UI or API — see step 3.2.

Create three transforms — one per language. Each transform defines the track name, display name, and language code for the inserted audio.

Spanish insert transform:

PUT https://api.mk.io/api/v1/projects/<YOUR_PROJECT_NAME>/media/transforms/<TRANSFORM_NAME>

Path Parameters
<YOUR_PROJECT_NAME>  Your unique project identifier
<TRANSFORM_NAME>     Transform name (for example, spanish-insert)

Request Body
{
  "properties": {
    "description": "Insert Spanish audio",
    "outputs": [
      {
        "preset": {
          "tracks": [
            {
              "@odata.type": "#MediaKind.AudioTrack",
              "trackName": "audio-spanish",
              "displayName": "Español (AI Dubbed)",
              "languageCode": "es-ES"
            }
          ],
          "@odata.type": "#MediaKind.TrackInserterPreset"
        },
        "relativePriority": "Normal"
      }
    ]
  }
}

Repeat for German and French:

  • German transform: set trackName to audio-german, displayName to Deutsch (AI Dubbed), and languageCode to de-DE
  • French transform: set trackName to audio-french, displayName to Français (AI Dubbed), and languageCode to fr-FR
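Because the three request bodies differ only in a few fields, you may prefer to generate them. The sketch below builds the URL and JSON body for each transform from the values given above. It only constructs and prints the requests and does not send them, since the authentication header format is not covered in this guide. The project name `my-project`, the helper names, and the German and French description strings (written by analogy with "Insert Spanish audio") are assumptions.

```python
import json
from urllib.parse import quote

API_ROOT = "https://api.mk.io/api/v1"

# transform name -> (trackName, displayName, languageCode, language label)
INSERT_TRANSFORMS = {
    "spanish-insert": ("audio-spanish", "Español (AI Dubbed)", "es-ES", "Spanish"),
    "german-insert": ("audio-german", "Deutsch (AI Dubbed)", "de-DE", "German"),
    "french-insert": ("audio-french", "Français (AI Dubbed)", "fr-FR", "French"),
}


def transform_url(project_name, transform_name):
    """Build the PUT URL for a transform, percent-encoding both path parts."""
    project = quote(project_name, safe="")
    transform = quote(transform_name, safe="")
    return f"{API_ROOT}/projects/{project}/media/transforms/{transform}"


def insert_transform_body(track_name, display_name, language_code, label):
    """Build a track insertion transform body shaped like the Spanish example."""
    return {
        "properties": {
            "description": f"Insert {label} audio",
            "outputs": [
                {
                    "preset": {
                        "@odata.type": "#MediaKind.TrackInserterPreset",
                        "tracks": [
                            {
                                "@odata.type": "#MediaKind.AudioTrack",
                                "trackName": track_name,
                                "displayName": display_name,
                                "languageCode": language_code,
                            }
                        ],
                    },
                    "relativePriority": "Normal",
                }
            ],
        }
    }


if __name__ == "__main__":
    for name, fields in INSERT_TRANSFORMS.items():
        print("PUT", transform_url("my-project", name))
        print(json.dumps(insert_transform_body(*fields), ensure_ascii=False, indent=2))
```

Keeping the per-language values in one table makes it harder for a track name and its language code to drift apart across the three requests.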

3.2 Create track insertion jobs

Create one job per language. Each job inserts the corresponding dubbed audio file into the english-encoded asset.

Repeat the following for each language (Spanish, German, French):

  1. Navigate to Video Processing > Jobs > + Create Job.
  2. Enter a Job name, for example job-insert-spanish.
  3. Under Select a transform, choose the corresponding insertion transform (for example, spanish-insert).
  4. Under Select input, set:
    • Input asset name: dubbed-audio
    • Filename: the dubbed file for this language (for example, english-video-demo_320x180_400k.mp4_es-ES.mp4)
  5. Under Configure output, set:
    • Output asset name: english-encoded
  6. Select Create.

Language  Transform       Input filename  Output asset
Spanish   spanish-insert  ..._es-ES.mp4   english-encoded
German    german-insert   ..._de-DE.mp4   english-encoded
French    french-insert   ..._fr-FR.mp4   english-encoded

Monitor progress: Wait for all three jobs to show Finished.
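If you create the jobs through the API, waiting for all three can be scripted as a polling loop. The following is a minimal sketch under stated assumptions: `get_state` is a hypothetical callable that you supply and that returns the current status string for a job name (the call that fetches job status is not shown in this guide), and the failure states "Error" and "Canceled" are assumed values. Only "Finished" comes from this guide.

```python
import time


def wait_for_jobs(job_names, get_state, interval_s=10.0, timeout_s=3600.0,
                  failed_states=("Error", "Canceled"),
                  sleep=time.sleep, clock=time.monotonic):
    """Poll until every job reports "Finished".

    Raises RuntimeError if a job reports one of failed_states and
    TimeoutError if jobs are still pending after timeout_s seconds.
    """
    pending = set(job_names)
    deadline = clock() + timeout_s
    while pending:
        for name in sorted(pending):
            state = get_state(name)
            if state == "Finished":
                pending.discard(name)
            elif state in failed_states:
                raise RuntimeError(f"job {name} ended in state {state}")
        if pending:
            if clock() >= deadline:
                raise TimeoutError(f"still pending: {sorted(pending)}")
            sleep(interval_s)


if __name__ == "__main__":
    # Stand-in status source: each job reports Processing once, then Finished.
    fake = {name: iter(["Processing", "Finished"])
            for name in ("job-insert-spanish", "job-insert-german", "job-insert-french")}
    wait_for_jobs(fake, lambda name: next(fake[name]), interval_s=0.1)
    print("all insertion jobs finished")
```

Passing `sleep` and `clock` as parameters keeps the loop easy to exercise without real waiting.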

Verify: Navigate to Assets > english-encoded and open the Tracks section. You should see three audio tracks for Spanish, German, and French.

[Figure: asset tracks view showing Spanish, German, and French dubbed audio tracks]

Step 4: Configure streaming

4.1 Create a streaming endpoint

  1. Navigate to Streaming Endpoints > + Create Streaming Endpoint.
  2. Configure the endpoint:
    • Name: production
    • Base URL: content
    • Type: Dedicated
  3. Select Create, then Start.

4.2 Create a streaming locator

  1. Navigate to Assets and select english-encoded.
  2. Select the production endpoint created in step 4.1.
  3. Add a streaming locator:
    • Name: live
    • Policy: Predefined_DownloadAndClearStreaming
  4. Copy the playback URLs provided.

4.3 Test multi-language playback

  1. Select the embedded player in the asset details.
  2. Use the audio track selector to switch between languages.
  3. Confirm that Spanish, German, and French audio tracks are selectable alongside the original English track.