
DiffSinger colab notebook that uses wav and lab as input (htk lab) for ease of use


Coda-SVS/DiffSinger_colab_notebook

 
 


DiffSinger_colab_notebook_MLo7

MLo7's DiffSinger training Colab notebook, an edited copy of Kei's DiffSinger Colab notebook.

Currently supported data formats:

  • lab + wav (NNSVS format)
  • csv + wav (DiffSinger format)
  • ds + wav (DiffSinger format)

(TextGrid/OpenCpop format is not supported at the moment)

Access the notebook here: Open In Colab


IMPORTANT NOTE:

  • Make sure your audio data is mono; otherwise nnsvs-db-converter will not work.
  • The name of your_speaker_folder is used as spk_name, so choose your folder names carefully.
  • The Colab notebook primarily uses Python, so spaces in file names or folder paths may be invalid; avoid them.
  • For an in-depth guide to SVS training and/or labeling, see SVS Singing Voice Database - Tutorial.
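The mono and file-naming notes above can be checked locally before you zip and upload your data. The following is only a minimal sketch and is not part of the notebook: the function name `find_problem_files` is hypothetical, and it assumes your recordings are plain PCM WAV files that Python's standard-library `wave` module can open (other WAV encodings, such as 32-bit float, may raise `wave.Error`).

```python
import wave
from pathlib import Path


def find_problem_files(speaker_dir):
    """Return (non_mono, has_space): lists of .wav paths under speaker_dir.

    Hypothetical helper for illustration; not part of the notebook.
    """
    root = Path(speaker_dir).resolve()
    non_mono, has_space = [], []
    for wav_path in sorted(root.rglob("*.wav")):
        # Only inspect the speaker folder name and everything below it,
        # so spaces elsewhere on your disk are not reported.
        relative = wav_path.relative_to(root.parent)
        if " " in str(relative):
            has_space.append(wav_path)
        # Read the channel count from the WAV header (1 means mono).
        with wave.open(str(wav_path), "rb") as wav_file:
            if wav_file.getnchannels() != 1:
                non_mono.append(wav_path)
    return non_mono, has_space
```

Calling `find_problem_files("your_speaker_folder")` returns two lists; if both are empty, the folder passes these two checks. Files in the first list would need to be downmixed to mono in an audio editor of your choice, and files in the second list renamed.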

This notebook converts your data (lab + wav) to a compatible format via nnsvs-db-converter.

It is advised to edit your data with SlurCutter to get more refined data for your pitch model.

Zip file format example:

#single speaker (lab + wav | ds + wav)
your_zip.zip:
    |
    |
    your_speaker_folder:
        |
        |
        data_1.wav
        data_1.lab (or .ds)
        .
        data_2.wav
        data_2.lab (or .ds)
        .
        data_3.wav
        data_3.lab (or .ds)
        .
        ...
#single speaker (csv + wav)
your_zip.zip:
    |
    |
    your_speaker_folder:
        |
        |
        wavs (folder named "wavs" containing all the wavs)
        .
        transcriptions.csv
#multi speaker (lab + wav | ds + wav)
your_zip.zip:
    |
    |
    your_speaker_folder_1:
        |
        |
        data_1.wav
        data_1.lab (or .ds)
        .
        data_2.wav
        data_2.lab (or .ds)
        .
        data_3.wav
        data_3.lab (or .ds)
        .
        ...
    your_speaker_folder_2:
        |
        |
        data_1.wav
        data_1.lab (or .ds)
        .
        data_2.wav
        data_2.lab (or .ds)
        .
        data_3.wav
        data_3.lab (or .ds)
        .
        ...
#multi speaker (csv + wav)
your_zip.zip:
    |
    |
    your_speaker_folder_1:
        |
        |
        wavs (folder named "wavs" containing all the wavs)
        .
        transcriptions.csv
    your_speaker_folder_2:
        |
        |
        wavs (folder named "wavs" containing all the wavs)
        .
        transcriptions.csv
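
The layouts above can be verified on a finished zip before uploading it. This is only a rough sketch under stated assumptions, not something the notebook provides: the function name `classify_speakers` and the labels `"csv"`, `"lab"`, `"ds"`, and `"unknown"` are made up for illustration, and the check assumes speaker folders sit at the top level of the zip exactly as drawn in the examples.

```python
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath


def classify_speakers(zip_path):
    """Map each top-level speaker folder to 'csv', 'lab', 'ds', or 'unknown'.

    Hypothetical helper for illustration; not part of the notebook.
    """
    files = defaultdict(list)
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            parts = PurePosixPath(name).parts
            # Skip directory entries and files that are not inside a speaker folder.
            if name.endswith("/") or len(parts) < 2:
                continue
            files[parts[0]].append(PurePosixPath(*parts[1:]))

    layouts = {}
    for speaker, paths in files.items():
        # Extensions of files placed directly inside the speaker folder.
        suffixes = {p.suffix.lower() for p in paths if len(p.parts) == 1}
        # True if at least one .wav lives under a folder named "wavs".
        has_wavs_dir = any(
            p.parts[0] == "wavs" and p.suffix.lower() == ".wav" for p in paths
        )
        if PurePosixPath("transcriptions.csv") in paths and has_wavs_dir:
            layouts[speaker] = "csv"
        elif {".wav", ".lab"} <= suffixes:
            layouts[speaker] = "lab"
        elif {".wav", ".ds"} <= suffixes:
            layouts[speaker] = "ds"
        else:
            layouts[speaker] = "unknown"
    return layouts
```

For example, `classify_speakers("your_zip.zip")` returns a dictionary keyed by speaker folder name. Any folder reported as `"unknown"` does not match one of the layouts shown above and is worth inspecting by hand.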


Plans (updates might not come in this order):

  • [script] make a GUI script for easy navigation for users who plan to train locally
  • [jupyter] add an option to use a pretrained model
  • [jupyter] add enable/disable checks for breathiness and energy training [THESE TWO OPTIONS ARE OFF BY DEFAULT]
  • [jupyter] make an NSF-HiFiGAN vocoder training notebook via fish-diffusion

Credits:

  • openvpi for DiffSinger fork and more

  • UtaUtaUtau for nnsvs-db-converter

  • Kei for the original notebook

  • MLo7 for the notebook edit

  • PixPrucer for an in-depth SVS guide


Extra Note:

Wow you made it to the very bottom.... Why though lmao hahahahhshahhasdksajidhasjl

Feel free to make suggestions or ask questions via Discord. My display name is MLo7 and my username is ghin_mlo7.
