Mastering AI-Driven Stem Extraction: Audimee, Kits, and the Quest for Perfect Quality
This project started some time ago thanks to a video James Hype uploaded nearly 10 months ago (https://www.youtube.com/watch?v=3D3Awpe8k8A&ab_channel=JamesHype).
Since I’m essentially an independent artist with no singing voice or vocal ability, I’ve mostly created instrumental works. I suppose this is one of the primary reasons why I haven’t had many listeners over the years, but let’s face it: I’m no Beyoncé, so there are probably a few other qualities I’m missing too. On top of that, the tracks haven’t been particularly engaging. However, this clip by James Hype changed my perspective on how to work without hiring professional artists, who often cost far more than a private individual without a budget can afford.
In the midst of this maze of AI services, I came across Suno at some point, a service said to work wonders with music. There was significant hype surrounding it, but in practice the results were subpar. Then I discovered Udio, a similar service with slightly better sound quality. With Udio’s output you can separate the instruments from the vocals, feed the vocals into Audimee, and end up with higher quality than the original. Many of the generated recordings still carried a lot of artifacts, in this case distortions severe enough that some words came out altered, badly degraded, or even unintelligible. Even so, this workflow increased the chances of producing finished songs artificially.
However, even after using these services, quality loss in the final result persisted. It became crucial to reprocess everything in a DAW. Instruments needed replacement, and the genre was adjusted to House/Drum & Bass. Essentially, it meant rebuilding the entire song produced by Udio.
Now, 10 months later, more services offering similar features have emerged. Audimee, which appears to be developed by a Swedish company, remains my go-to tool for this process. Its quality has always been nearly perfect, but because some artifacts remained, I kept searching for something even better. That’s when I discovered Kits.AI, which comes close to Audimee’s quality but hasn’t quite reached the same level. After extensive testing, I realized that audio quality improves when I skip the manual vocal extraction from Udio/Suno output and let the services handle the separation themselves.
Previously, I always used VirtualDJ to separate instruments, beats, and vocals before uploading these to Audimee. The results varied in quality. Recently, I started testing Kits again to determine if their service could deliver better results. It turned out that Kits behaves the same way: uploading pre-separated audio files led to slightly worse quality than letting the service perform the separation.
How to get the best stems from AI-generated music
- Download the music and perform an initial stem separation
  Use a tool like VirtualDJ or any software that does a solid job, ensuring no bleed from other channels into the stems.
- Upload the full song to Audimee and/or Kits without prior separation
  Select the vocal model and export the results.
- Compare stems
  You’ll now have both the original stems and new stems, and the new ones may be of better quality than the originals.
- Mix the stems in a DAW
  Use the original stems as a base to compensate for any losses from Audimee and Kits. You can often address overly harsh compression here: higher-pitched sounds such as "s" sounds and hi-hats are usually the most affected, often in the range of 5 kHz to 10 kHz. Adjust hi/mid/low frequencies to smooth out issues.
- Balance the audio levels
  Keep the best-sounding stems dominant while reducing the volume of weaker ones. Use the weaker stems as support for any missing elements, such as words or notes.
- Finalize the track
  Fine-tune all aspects to create a polished song ready for distribution.
And voilà – you’ve got a track ready for release!
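For anyone who prefers to see the balancing step as numbers rather than faders, here is a minimal Python sketch of the idea. It is not how a DAW, Audimee, or Kits works internally. The function names, the -12 dB support level, and the tiny sample lists are all assumptions made up for illustration, and real stems would first have to be decoded from audio files into float samples.

```python
import math


def db_to_gain(db):
    """Convert a level change in decibels to a linear amplitude factor."""
    return 10 ** (db / 20)


def rms(samples):
    """Root-mean-square level of a list of float samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_stems(stems, gains_db, ceiling=1.0):
    """Sum equal-length stems after applying one gain (in dB) per stem.

    If the summed peak exceeds `ceiling`, the whole mix is scaled down
    so the loudest sample sits at the ceiling instead of clipping.
    """
    if len(stems) != len(gains_db):
        raise ValueError("need one gain value per stem")
    length = len(stems[0])
    if any(len(s) != length for s in stems):
        raise ValueError("stems must have the same length")
    gains = [db_to_gain(g) for g in gains_db]
    mixed = [sum(g * s[i] for g, s in zip(gains, stems)) for i in range(length)]
    peak = max((abs(x) for x in mixed), default=0.0)
    if peak > ceiling:
        mixed = [x * ceiling / peak for x in mixed]
    return mixed


# Hypothetical sample data standing in for two decoded vocal stems.
dominant_vocal = [0.0, 0.4, -0.6, 0.2]   # the stem that sounds best
support_vocal = [0.0, 0.5, -0.5, 0.25]   # the weaker stem, used as support

# Keep the dominant stem at full level and tuck the support 12 dB under it.
mix = mix_stems([dominant_vocal, support_vocal], [0.0, -12.0])
print("mix RMS:", round(rms(mix), 4))
```

The peak check at the end is only a crude safety net. In an actual session you would balance by ear and leave final level control to a limiter.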
What about licensing?
Yeah. To achieve all this, you need to ensure that you have the correct licenses to use the material. This is relatively straightforward – as of now, you retain the rights to all your material as long as you are a subscriber to the respective services. However, Udio has an exception: they require you to credit their service in the track title if you use the platform’s output, unless you have a paid subscription.
When reprocessing original stems and integrating AI-generated components, additional legal uncertainties may arise. These gray areas become particularly relevant if the final work is difficult to trace back to its sources. While current copyright laws often favor creators who make substantial, creative contributions, the role of AI complicates traditional interpretations of ownership. According to current discussions on AI and copyright, entirely AI-generated music cannot be copyrighted, but the circumstances are different for AI-assisted creations (https://xchange.avixa.org/posts/music-that-is-entirely-ai-generated-cannot-be-copyrighted-but-who-owns-an-ai-assisted-song). For this reason, it’s essential to carefully evaluate the extent to which AI tools are used and how their output is incorporated into the final work.
On a philosophical level, using AI in the creative process raises questions about authorship and originality. By blending human creativity with AI capabilities, the boundaries of what constitutes a “unique” creation are constantly shifting. In my case, I approach this by ensuring that AI-generated elements serve as tools rather than the foundation of the final work. Transparency about these tools not only respects the creative process but also fosters trust and accountability in a rapidly evolving creative landscape.
But the moral implications of this can undoubtedly spark a whole other debate.