I looked at the MERT example and noticed that it's actually preprocessing the input. The code looks really convoluted but in the case of batch size 1, the net effect is that it's normalizing things to have zero mean and unit variance instead of passing it in directly.
Note that you can use the processor provided in the example if you want, but I decided not to for my case because my wav_input is already in CUDA and transformers forces everything into numpy and therefore CPU 😞 , resulting in expensive copies as you move data back and forth. Wasn't sure which one you've got here :))
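For reference, the net effect described above can be reproduced directly on a GPU tensor without going through the processor. This is a minimal sketch, not the processor's actual code path; the name `normalize_wav` is made up here, and it mirrors the zero-mean/unit-variance normalization (population variance plus a small epsilon, as the HF feature extractors do) while keeping the tensor on whatever device it already lives on:

```python
import torch

def normalize_wav(wav: torch.Tensor, eps: float = 1e-7) -> torch.Tensor:
    """Zero-mean, unit-variance normalization along the time axis.

    Stays on the tensor's current device (e.g. CUDA), avoiding the
    numpy round-trip the transformers processor would force.
    """
    mean = wav.mean(dim=-1, keepdim=True)
    # unbiased=False matches numpy's default (population) variance
    var = wav.var(dim=-1, keepdim=True, unbiased=False)
    return (wav - mean) / torch.sqrt(var + eps)
```

A normalized batch can then be fed to the model in place of the processor's output, e.g. `model(normalize_wav(wav_input))`, assuming `wav_input` is shaped `(batch, samples)`.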
hey @LWprogramming, thanks for taking a look at the code! The input is normalized to zero mean and unit variance when loading the data here. The normalize argument is set when initializing the datasets in the trainer, so the data is normalized before it is cropped.
Alternatively we could normalize the cropped input right before passing it into MERT. Not sure which would be better, but normalizing it in the beginning made more sense to me.
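To make the trade-off concrete, the two orders aren't equivalent: normalizing the full clip at load time gives each crop the clip-level statistics, while normalizing after cropping makes each crop exactly zero-mean/unit-variance. This hypothetical sketch (the `crop` helper and the statistics are illustrative, not the trainer's actual code) shows the difference:

```python
import torch

def crop(wav: torch.Tensor, start: int, length: int) -> torch.Tensor:
    """Take a fixed-length segment along the time axis."""
    return wav[..., start:start + length]

# Illustrative non-normalized audio (random stand-in for a real clip).
wav = torch.randn(1, 48000) * 3.0 + 0.5

# Option A (normalize at load, then crop): the crop inherits the
# whole clip's mean/variance, so it is only approximately normalized.
full_norm = (wav - wav.mean(dim=-1, keepdim=True)) / wav.std(dim=-1, keepdim=True)
crop_a = crop(full_norm, 8000, 16000)

# Option B (crop first, then normalize): the crop itself is exactly
# zero-mean and unit-variance.
crop_raw = crop(wav, 8000, 16000)
crop_b = (crop_raw - crop_raw.mean(dim=-1, keepdim=True)) / crop_raw.std(dim=-1, keepdim=True)
```

In practice the two should be close for long crops of stationary audio, which is presumably why normalizing once at load time is the simpler choice.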
also your comment made me realize there was an issue in the infer coarse script where I forgot to normalize the audio before passing it in. fixed in c9a167e!