Creating your own voice model on Kits.ai is easy. For the best possible model, build a high-quality dataset using the tips below. If you need additional support, join the Kits.ai Discord or get in touch with us.

Here’s what you need.

At least 15 minutes in total of dry (no effects), monophonic (one note at a time) vocals. The more audio, the better.

Bad Vocals

Stereo, reverb, delay

SG_acapella_wet.wav

Good Vocals

Mono, clean tone, low noise

SG_acapella_example.wav

For best results, create separate models for distinct vocal styles (singing vs. rapping, etc.).
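If you want a quick sanity check before uploading, the requirements above can be roughly verified with a short script. This is a minimal sketch using only Python's standard `wave` module. The function names `describe_wav` and `check_dataset` are hypothetical and not part of Kits.ai. The script only checks total length and channel count (mono vs. stereo); it cannot tell whether a recording is dry or whether only one note is sung at a time, so you still need to listen to your files.

```python
import wave

MIN_TOTAL_SECONDS = 15 * 60  # the 15-minute minimum from the requirements above


def describe_wav(path):
    """Return (duration_seconds, channels, bits_per_sample) for a PCM .wav file."""
    with wave.open(path, "rb") as wav:
        duration = wav.getnframes() / wav.getframerate()
        return duration, wav.getnchannels(), wav.getsampwidth() * 8


def check_dataset(paths):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    total = 0.0
    for path in paths:
        duration, channels, _bits = describe_wav(path)
        total += duration
        if channels != 1:
            problems.append(f"{path}: {channels} channels, expected mono")
    if total < MIN_TOTAL_SECONDS:
        problems.append(f"total length is {total / 60:.1f} min, expected at least 15")
    return problems
```

Call `check_dataset(["take1.wav", "take2.wav"])` with your own file names and fix anything it reports before moving on.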

Getting your file(s) ready.

Export your files with no silence and consistent volume as a 16-bit lossless audio file (.wav preferred).

Before: silence, inconsistent volume levels

before.png

After: truncated silence, consistent volume

after.png
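One simple way to bring several files to a similar level is peak normalization: scale each file so its loudest sample lands at the same point. The sketch below assumes 16-bit samples already loaded into a list or `array`. The name `normalize_peak` and the 0.9 target are illustrative choices, not Kits.ai settings. Peak normalization evens out levels between files, not within a single take, so a recording that swings between quiet and loud passages may still need leveling in your audio editor.

```python
from array import array


def normalize_peak(samples, target_peak=0.9):
    """Scale 16-bit samples so the loudest one sits at target_peak of full scale."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return array("h", samples)  # all silence, nothing to scale
    gain = target_peak * 32767 / peak
    # Clamp to the 16-bit range in case rounding pushes a value over the edge.
    return array("h", (max(-32768, min(32767, round(s * gain))) for s in samples))
```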

Once you’ve compiled your vocals, the next step is to prepare your files for training:

How to convert to mono and remove silence with Audacity

example.mov
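If you would rather script this step than use Audacity, the same idea can be sketched in Python with the standard library. This is a rough approximation under stated assumptions, not what Audacity does internally: it handles 16-bit PCM .wav files only, averages the channels to get mono, and drops any short window whose peak falls below a threshold. The function names, the threshold of 500, and the 50 ms window are all hypothetical starting points. Cutting windows this bluntly can leave audible clicks, so listen to the result.

```python
import sys
import wave
from array import array


def to_mono(samples, channels):
    """Average interleaved 16-bit channels into a single channel."""
    if channels == 1:
        return array("h", samples)
    return array("h", (int(sum(samples[i:i + channels]) / channels)
                       for i in range(0, len(samples) - channels + 1, channels)))


def remove_silence(samples, rate, threshold=500, window_seconds=0.05):
    """Keep only the windows whose loudest sample reaches the threshold."""
    window = max(1, int(rate * window_seconds))
    kept = array("h")
    for start in range(0, len(samples), window):
        chunk = samples[start:start + window]
        if max(abs(s) for s in chunk) >= threshold:
            kept.extend(chunk)
    return kept


def clean_wav(src, dst, threshold=500):
    """Read a 16-bit .wav, convert to mono, strip quiet windows, write 16-bit mono."""
    with wave.open(src, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        channels, rate = wav.getnchannels(), wav.getframerate()
        samples = array("h", wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # .wav data is little-endian
    mono = remove_silence(to_mono(samples, channels), rate, threshold)
    if sys.byteorder == "big":
        mono.byteswap()
    with wave.open(dst, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)
        out.setframerate(rate)
        out.writeframes(mono.tobytes())
```

For example, `clean_wav("raw_take.wav", "clean_take.wav")` writes a mono, silence-trimmed copy and leaves the original untouched. Raise or lower `threshold` if too much or too little is being removed.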

Advanced Pre-Processing Tips