Key Facts
- Category: Media
- Input Types: file, select
- Output Type: file
- Sample Coverage: 4
- API Ready: Yes
Overview
Audio Dialog Isolation separates an audio file into vocals and accompaniment using the Spleeter or MDX engine and returns the isolated stems as a zip archive.
When to Use
- When extracting vocals from music for karaoke or remixing projects.
- When removing background music to enhance dialog clarity in podcasts or videos.
- When isolating audio elements for analysis, restoration, or transcription tasks.
How It Works
- Upload an audio file in a supported format like MP3 or WAV.
- Select the separation engine: Spleeter for speed or MDX for higher quality.
- Choose the output format for the stems, such as WAV, FLAC, or MP3.
- The tool processes the audio and provides a zip file with the separated vocal and accompaniment tracks.
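The selections in the steps above can be sketched as a small validated settings object. This is a minimal illustration only: the class name `SeparationJob`, its field names, and the lowercase identifiers for engines and formats are hypothetical and are not taken from the tool's API.

```python
from dataclasses import dataclass

# Hypothetical identifiers; the tool's real option values may differ.
ENGINES = ("spleeter", "mdx")
OUTPUT_FORMATS = ("wav", "flac", "mp3", "m4a", "ogg", "opus")


@dataclass(frozen=True)
class SeparationJob:
    """Settings for one separation run (illustrative, not the tool's schema)."""
    audio_path: str
    engine: str
    output_format: str

    def __post_init__(self):
        # Reject values outside the options described on this page.
        if self.engine not in ENGINES:
            raise ValueError(f"unknown engine: {self.engine!r}")
        if self.output_format not in OUTPUT_FORMATS:
            raise ValueError(f"unknown output format: {self.output_format!r}")


# Example: the karaoke scenario (fast engine, MP3 stems).
job = SeparationJob("song.mp3", engine="spleeter", output_format="mp3")
```

Validating the settings before upload catches a mistyped engine or format early, instead of after a large file has been transferred.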
Use Cases
Examples
1. Extract Vocals for Karaoke
- Background
- A music enthusiast wants to create a sing-along version of a favorite song.
- Problem
- The original audio has vocals mixed with instruments, making it unsuitable for karaoke.
- How to Use
- Upload the song file, select the Spleeter engine for quick processing, and choose MP3 output format.
- Outcome
- The tool outputs a zip file with separate vocal and instrumental tracks, ready for karaoke use.
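Once the zip file is downloaded, the stems can be located and unpacked with Python's standard `zipfile` module. The sketch below assumes the archive contains one audio file per stem; the member names (for example `vocals.wav`) and the extension list are assumptions, since the page does not document how files inside the archive are named.

```python
import zipfile
from pathlib import PurePosixPath

# Assumed extensions for the output formats listed on this page.
AUDIO_EXTENSIONS = {".wav", ".flac", ".mp3", ".m4a", ".ogg", ".opus"}


def list_stems(zip_path):
    """Map each stem name (file name without extension) to its archive member."""
    stems = {}
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            path = PurePosixPath(member)
            # Skip anything that is not an audio file, such as a readme.
            if path.suffix.lower() in AUDIO_EXTENSIONS:
                stems[path.stem] = member
    return stems


def extract_stems(zip_path, out_dir):
    """Unpack every member of the archive into out_dir."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out_dir)
```

For a karaoke track, the instrumental stem returned by `list_stems` is the file to load into the player; the vocal stem can be kept for reference.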
2. Isolate Dialog in Podcast
- Background
- A podcast editor needs to clean up an episode where background music obscures the speaker's voice.
- Problem
- Background music reduces dialog clarity and professional quality.
- How to Use
- Upload the podcast audio, use the MDX engine for better quality separation, and output in WAV format for editing.
- Outcome
- The separated dialog track is clear and easy to edit, improving overall audio quality.
FAQ
What audio file formats are supported?
The tool accepts any file with an audio/* type, which includes common formats such as MP3, WAV, and FLAC.
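A quick local check before uploading can be sketched with Python's standard `mimetypes` module. The function name `looks_like_audio` is hypothetical, and the check only guesses from the file extension; the tool itself may inspect the file differently.

```python
import mimetypes


def looks_like_audio(filename):
    """Return True if the file extension maps to an audio/* MIME type."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    # guess_type returns None for unknown or missing extensions.
    return mime_type is not None and mime_type.startswith("audio/")
```

Because the guess depends on the extension table available on the local system, less common formats may be reported as unknown even if the tool would accept them.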
What is the difference between Spleeter and MDX engines?
Spleeter offers faster processing for basic separation, while MDX (Demucs) provides higher quality results but may take longer.
Can I choose the output format for the stems?
Yes, you can select from WAV, FLAC, MP3, M4A, OGG Vorbis, or Opus formats.
How many stems are separated?
Both engines separate audio into 2 stems: vocals and accompaniment.
Is there a file size limit?
Yes, the audio file must be under 200 MB.
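The limit can be checked locally before uploading. In this sketch the helper names are hypothetical, and 1 MB is assumed to mean 1024 * 1024 bytes; the page does not say which definition of MB the tool uses.

```python
import os

# Assumption: 1 MB = 1024 * 1024 bytes.
MAX_BYTES = 200 * 1024 * 1024


def within_size_limit(size_bytes, max_bytes=MAX_BYTES):
    """Return True for a non-empty file strictly under the limit."""
    return 0 < size_bytes < max_bytes


def file_within_limit(path):
    """Check the size of a file on disk against the limit."""
    return within_size_limit(os.path.getsize(path))
```

A file that fails this check could be trimmed or re-encoded at a lower bitrate before upload.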