Skip to content

Monitoring Training with TensorBoard

TensorBoard is a powerful tool that allows you to visualize and monitor the training process of your voice models. By keeping an eye on the graphs in TensorBoard, you can get a better understanding of how your model is learning and identify potential issues like overtraining.

  1. Launch TensorBoard: Run the run-tensorboard.bat file in your Applio directory, or open a terminal and run the command tensorboard --logdir=logs.
  2. Open in Browser: Open your web browser and navigate to http://localhost:6006 (or the address shown in your terminal).
  3. Navigate to Scalars: In the TensorBoard interface, click on the Scalars tab.
  4. Find the “g/total” Graph: This graph shows the total loss for the generator, which is the most important metric for tracking the overall progress of your training.

A screenshot of the TensorBoard interface, showing the 'g/total' graph in the 'Scalars' tab.

Overtraining happens when a model learns the training data too well, including its noise and imperfections. An overtrained model will perform poorly on new data, so it’s important to stop training at the right time.

The “lowest point” on the g/total graph is a good indicator of when to stop training. This is the point where the loss value is at its lowest before it starts to rise or plateau.

A screenshot of the 'g/total' graph in TensorBoard, with the lowest point highlighted.

To find the best version of your model, look for the lowest point on the g/total graph and note the corresponding step number. You can then find the saved model file (.pth) with the closest step number in your logs folder.

For more advanced users, you can also monitor these graphs:

  • loss/g/mel: The mel spectrogram loss. This should generally decrease throughout training. If it starts to rise, it could be a sign of overtraining.
  • loss/g/kl: The KL divergence loss. This should also decrease over time.
  • loss/d/total: The total discriminator loss. This will likely go up at the beginning of training and then start to decrease.

A healthy training process will show a general downward trend in all these loss values over time. By keeping an eye on these graphs, you can gain a deeper understanding of your model’s training process and make more informed decisions.