How to Create a High-Quality Dataset

A dataset is a collection of audio files of a single speaker that Applio uses to train a voice model. Dataset quality is the single most important factor in achieving good training results: the audio should be clear, consistent, and free of noise.

This guide will walk you through the process of creating a great dataset.

First, you need to collect audio of the person or character you want to create a voice model for.

If your audio contains background music or other sounds, you’ll need to isolate the vocals. For a detailed guide on how to do this, please see our audio isolation guide.

Step 3: Clean and Process Your Audio with Audacity

Once you have your vocal recordings, it’s time to clean and process them using a free audio editor like Audacity.

Noise reduction helps to remove unwanted background noise from your recordings.

  1. In Audacity, select a small portion of your audio that contains only background noise.
  2. Go to Effect > Noise Removal and Repair > Noise Reduction.
  3. Click Get Noise Profile.
  4. Now, select the entire audio track.
  5. Go back to the Noise Reduction effect and click OK to apply it.
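
To check whether noise reduction actually lowered the noise floor, you can measure the level of the noise-only region before and after applying the effect. The sketch below is only an illustration and is not how Audacity's Noise Reduction works internally. It assumes you already have the region as a list of signed 16-bit integer samples; the function name `rms_dbfs` is hypothetical.

```python
import math


def rms_dbfs(samples, full_scale=32768):
    """Return the RMS level of signed 16-bit samples in dB relative to full scale."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence has no finite level
    return 10 * math.log10(mean_square / (full_scale * full_scale))
```

A lower (more negative) value after processing means a quieter noise floor. The same measurement can also help you pick a sensible threshold for a noise gate.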

A noise gate is used to silence parts of the audio that are below a certain volume threshold. This is great for removing low-level noise between words and sentences.

  1. Select your entire audio track.
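  2. Go to Effect > Noise Removal and Repair > Noise Gate. The submenu name can differ depending on your Audacity version and how effects are grouped in your preferences.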
  3. Apply the recommended settings as shown in the image below. These settings are a good starting point, but you may need to adjust them based on your audio.

A screenshot of the Noise Gate settings in Audacity. The recommended settings are shown.
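
As a rough illustration of what a noise gate does, the sketch below zeroes every short window of samples whose peak stays below a threshold. It is a simplified model under stated assumptions, not Audacity's implementation: real gates also smooth the transitions with attack, hold, and decay times. The function name `noise_gate` and the `window` parameter are hypothetical, and samples are assumed to be signed integers.

```python
def noise_gate(samples, threshold, window=4):
    """Silence each window of samples whose peak amplitude is below threshold."""
    gated = []
    for start in range(0, len(samples), window):
        chunk = samples[start:start + window]
        peak = max(abs(s) for s in chunk)
        if peak < threshold:
            gated.extend([0] * len(chunk))  # window is quiet: mute it
        else:
            gated.extend(chunk)  # window contains signal: keep it untouched
    return gated
```

For example, with a threshold of 100, a window of 4, and the input `[10, -12, 8, 5, 4000, -3500, 20, 10, 3, 2]`, the first and last windows are muted while the middle window passes through unchanged, including its quiet tail.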

This effect removes long periods of silence from your audio, which helps to create a more concise dataset.

  1. Select your entire audio track.
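  2. Go to Effect > Special > Truncate Silence. The submenu name can differ depending on your Audacity version and how effects are grouped in your preferences.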
  3. Apply the recommended settings as shown in the image below.

A screenshot of the Truncate Silence settings in Audacity. The recommended settings are shown.
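
The idea behind truncating silence can be sketched in a few lines: any run of quiet samples longer than a chosen maximum is cut down to that maximum. This is a minimal sketch, not Audacity's algorithm. The function name `truncate_silence` and its parameters are hypothetical, and lengths are counted in samples rather than seconds.

```python
def truncate_silence(samples, threshold, max_silence):
    """Shorten each run of quiet samples to at most max_silence samples."""
    result = []
    quiet_run = 0
    for s in samples:
        if abs(s) < threshold:
            quiet_run += 1
            if quiet_run > max_silence:
                continue  # drop silence beyond the allowed length
        else:
            quiet_run = 0  # a loud sample ends the current quiet run
        result.append(s)
    return result
```

Because short quiet runs are kept as they are, natural pauses between words survive as long as `max_silence` is set to a reasonable length.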

Once you’re happy with your audio, it’s time to export it.

  1. Go to File > Export > Export as WAV (or FLAC).
  2. Choose a location to save your file.
  3. Ensure the format is set to WAV (Microsoft) signed 16-bit PCM or FLAC.
  4. Click Export.
WAV Export Settings: A screenshot of the WAV export settings in Audacity.
FLAC Export Settings: A screenshot of the FLAC export settings in Audacity.
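
If you want to confirm that an exported WAV file really is 16-bit PCM, Python's standard `wave` module can read the file header. This is a small sketch under stated assumptions: the function name `describe_wav` is hypothetical, and the `wave` module handles PCM WAV files only, so it cannot be used to inspect FLAC exports.

```python
import wave


def describe_wav(path):
    """Return basic format information read from a PCM WAV file header."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "bit_depth": wav.getsampwidth() * 8,  # sample width is in bytes
            "sample_rate": wav.getframerate(),
            "duration_s": wav.getnframes() / wav.getframerate(),
        }
```

A file exported with the settings above should report a `bit_depth` of 16.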

Your dataset is now ready for training!


While Audacity is a powerful tool on its own, you can extend its functionality with plugins. These plugins work with Audacity, FL Studio, and other DAWs.

  • T-De-Esser: A de-esser is a tool that reduces sibilance, which is the harsh “s” sound in speech. This is a must-have for creating clean vocal recordings.
  • ReaPlugs: A suite of powerful plugins from the creators of the Reaper DAW. It includes a more advanced noise gate, EQ, and compressor.
  • Auburn Sounds Renegate: A sophisticated noise gate plugin that gives you more control than Audacity’s built-in noise gate.