Audio Dialog Isolation

Isolate vocals and accompaniment using Spleeter or MDX models

Runs external separation engines (Spleeter or Demucs/MDX) and packages stems as a zip file.

Maximum file size: 200 MB. Supported formats: audio/*

Key Facts

  • Category: Media
  • Input Types: file, select
  • Output Type: file
  • Sample Coverage: 4
  • API Ready: Yes

Overview

Audio Dialog Isolation separates vocals and accompaniment from audio files using Spleeter or MDX models, delivering isolated stems as a zip package for easy use.

When to Use

  • When extracting vocals from music for karaoke or remixing projects.
  • When removing background music to enhance dialog clarity in podcasts or videos.
  • When isolating audio elements for analysis, restoration, or transcription tasks.

How It Works

  • Upload an audio file in a supported format like MP3 or WAV.
  • Select the separation engine: Spleeter for speed or MDX for higher quality.
  • Choose the output format for the stems, such as WAV, FLAC, or MP3.
  • The tool processes the audio and provides a zip file with the separated vocal and accompaniment tracks.
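The last step above hands back a zip archive. A minimal sketch of unpacking it locally follows, using only the Python standard library. The file names inside the archive are not documented here, so the sketch keeps every member whose extension matches one of the selectable output formats; the extension set and the `extract_stems` helper are illustrative assumptions, not part of the tool.

```python
import io
import zipfile
from pathlib import Path

# Assumed extensions for the selectable output formats (not confirmed by the tool).
AUDIO_SUFFIXES = {".wav", ".flac", ".mp3", ".m4a", ".ogg", ".opus"}


def extract_stems(zip_bytes: bytes, dest: Path) -> list[Path]:
    """Write the audio members of a result archive into dest and return their paths."""
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # Keep only the base name so archive paths cannot escape dest.
            name = Path(info.filename).name
            if Path(name).suffix.lower() not in AUDIO_SUFFIXES:
                continue
            target = dest / name
            target.write_bytes(archive.read(info))
            extracted.append(target)
    return sorted(extracted)
```

With a two-stem result, the returned list would hold one path for the vocal track and one for the accompaniment, whatever the archive calls them.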

Use Cases

  • Creating karaoke tracks by separating the vocals from the instrumental in music recordings.
  • Improving dialog clarity in podcast or film production by removing background music.
  • Analyzing vocal patterns in audio for research or transcription purposes.

Examples

1. Extract Vocals for Karaoke

Background
A music enthusiast wants to create a sing-along version of a favorite song.
Problem
The original audio has vocals mixed with instruments, making it unsuitable for karaoke.
How to Use
Upload the song file, select the Spleeter engine for quick processing, and choose MP3 output format.
Outcome
The tool outputs a zip file with separate vocal and instrumental tracks, ready for karaoke use.

2. Isolate Dialog in Podcast

Background
A podcast editor needs to clean up an episode where background music obscures the speaker's voice.
Problem
Background music reduces dialog clarity and professional quality.
How to Use
Upload the podcast audio, use the MDX engine for better quality separation, and output in WAV format for editing.
Outcome
The separated dialog track is clear and easy to edit, improving overall audio quality.

FAQ

What audio file formats are supported?

The upload field accepts any file with an audio/* type, which covers common formats such as MP3, WAV, and FLAC.

What is the difference between Spleeter and MDX engines?

Spleeter offers faster processing for basic separation, while MDX (Demucs) provides higher quality results but may take longer.

Can I choose the output format for the stems?

Yes, you can select from WAV, FLAC, MP3, M4A, OGG Vorbis, or Opus formats.

How many stems are separated?

Both engines separate audio into 2 stems: vocals and accompaniment.

Is there a file size limit?

Yes, the maximum file size is 200 MB.
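For scripted uploads, the size limit and the audio/* restriction can be checked locally before sending anything. The sketch below is one way to do that with the standard library. It assumes the limit means 200,000,000 bytes (the stricter reading of "200 MB") and guesses the type from the file extension only; `check_upload` is a hypothetical helper, not something the tool provides.

```python
import mimetypes
from pathlib import Path

# Assumption: "200 MB" is read as decimal megabytes, the stricter interpretation.
MAX_BYTES = 200 * 1000 * 1000


def check_upload(path: Path, max_bytes: int = MAX_BYTES) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    # Type is guessed from the extension only; the file content is not inspected.
    mime, _ = mimetypes.guess_type(path.name)
    if mime is None or not mime.startswith("audio/"):
        problems.append(f"{path.name}: type {mime!r} does not match audio/*")
    size = path.stat().st_size
    if size > max_bytes:
        problems.append(f"{path.name}: {size} bytes exceeds the {max_bytes}-byte limit")
    return problems
```

A file that passes this check can still be rejected by the server, since the server decides what counts as a valid audio file.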

API Documentation

Request Endpoint

POST /en/api/tools/audio-dialog-isolation

Request Parameters

Parameter Name   Type                     Required   Description
audioFile        file (upload required)   Yes        -
engine           select                   No         -
outputFormat     select                   No         -

File parameters must be uploaded first: send the file to POST /upload/audio-dialog-isolation to obtain a filePath, then pass that filePath as the value of the corresponding file field (here, audioFile).
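The two-step flow (upload, then call the tool with the returned filePath) can be sketched as two request builders. Several details are assumptions rather than documented behavior: the host `https://elysiatools.com` is borrowed from the MCP configuration, the multipart field name `file` is a guess, the tool call is assumed to take a JSON body, and the accepted values for `engine` and `outputFormat` are not listed on this page. The functions only construct `urllib.request.Request` objects; nothing is sent.

```python
import json
import urllib.request
import uuid

BASE_URL = "https://elysiatools.com"  # assumption: same host as the MCP endpoint
TOOL_ID = "audio-dialog-isolation"


def build_upload_request(filename: str, data: bytes, field: str = "file") -> urllib.request.Request:
    """Step 1: multipart upload. The form field name is a hypothetical default."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return urllib.request.Request(
        f"{BASE_URL}/upload/{TOOL_ID}",
        data=head + data + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def build_isolation_request(file_path, engine=None, output_format=None) -> urllib.request.Request:
    """Step 2: call the tool with the filePath returned by the upload (JSON body assumed)."""
    payload = {"audioFile": file_path}
    # engine and outputFormat are optional; their allowed values are not documented here.
    if engine is not None:
        payload["engine"] = engine
    if output_format is not None:
        payload["outputFormat"] = output_format
    return urllib.request.Request(
        f"{BASE_URL}/en/api/tools/{TOOL_ID}",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Each request could then be passed to `urllib.request.urlopen`, with the `filePath` from the first response fed into the second call.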

Response Format

{
  "filePath": "/public/processing/randomid.ext",
  "fileName": "output.ext",
  "contentType": "application/octet-stream",
  "size": 1024,
  "metadata": {
    "key": "value"
  },
  "error": "Error message (optional)",
  "message": "Notification message (optional)"
}
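A client will usually want to turn this response into a typed value and fail early on errors. The sketch below does that with a dataclass. It assumes that a non-empty `error` field signals failure and that `metadata` and `message` may be absent; `ToolResult` and `parse_tool_response` are illustrative names, not part of the API.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ToolResult:
    file_path: str
    file_name: str
    content_type: str
    size: int
    metadata: dict = field(default_factory=dict)
    message: Optional[str] = None


def parse_tool_response(text: str) -> ToolResult:
    """Parse the JSON response body; raise if the optional error field is set."""
    data: dict[str, Any] = json.loads(text)
    # Assumption: a non-empty "error" means the request failed.
    if data.get("error"):
        raise RuntimeError(data["error"])
    return ToolResult(
        file_path=data["filePath"],
        file_name=data["fileName"],
        content_type=data["contentType"],
        size=int(data["size"]),
        metadata=data.get("metadata") or {},
        message=data.get("message"),
    )
```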

AI MCP Documentation

Add this tool to your MCP server configuration:

{
  "mcpServers": {
    "elysiatools-audio-dialog-isolation": {
      "name": "audio-dialog-isolation",
      "description": "Isolate vocals and accompaniment using Spleeter or MDX models",
      "baseUrl": "https://elysiatools.com/mcp/sse?toolId=audio-dialog-isolation",
      "command": "",
      "args": [],
      "env": {},
      "isActive": true,
      "type": "sse"
    }
  }
}

You can chain multiple tools, e.g.: `https://elysiatools.com/mcp/sse?toolId=png-to-webp,jpg-to-webp,gif-to-webp`, max 20 tools.
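A chained URL of this form can be assembled programmatically. The sketch below joins tool IDs with commas and enforces the stated limit of 20; `build_mcp_url` is a hypothetical helper, and it assumes tool IDs need no URL escaping, as in the example above.

```python
MCP_BASE = "https://elysiatools.com/mcp/sse"
MAX_TOOLS = 20  # limit stated in the documentation


def build_mcp_url(tool_ids: list[str]) -> str:
    """Build an MCP SSE URL for one or more tool IDs (at most MAX_TOOLS)."""
    if not tool_ids:
        raise ValueError("at least one tool ID is required")
    if len(tool_ids) > MAX_TOOLS:
        raise ValueError(f"at most {MAX_TOOLS} tools can be chained, got {len(tool_ids)}")
    # IDs are joined with plain commas, matching the documented example.
    return f"{MCP_BASE}?toolId={','.join(tool_ids)}"
```

Calling `build_mcp_url(["audio-dialog-isolation"])` yields the `baseUrl` used in the configuration above.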

Supports URL file links or Base64 encoding for file parameters.

If you encounter any issues, please contact us at [email protected]