Hubert args normalization #7

Closed
LWprogramming opened this issue Mar 15, 2023 · 2 comments

Comments

@LWprogramming

I looked at the MERT example and noticed that it preprocesses the input before passing it to the model. The code looks convoluted, but for batch size 1 the net effect is that the waveform is normalized to zero mean and unit variance rather than passed in directly.

Note that you can use the processor provided in the example if you want, but I decided not to in my case because my wav_input is already on CUDA, and transformers forces everything into numpy and therefore onto the CPU 😞, which means expensive copies as data moves back and forth. I wasn't sure which approach you're using here :))
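For reference, a minimal sketch of doing the same normalization directly in PyTorch so the tensor never leaves its device. The function name `zero_mean_unit_var`, the `eps` value, and the use of population variance are assumptions made for illustration, not something taken from the MERT example or this repo, so check them against the processor you are replacing.

```python
import torch


def zero_mean_unit_var(wav: torch.Tensor, eps: float = 1e-7) -> torch.Tensor:
    """Normalize each waveform along its last (time) axis.

    Hypothetical helper: `eps` and the population variance
    (unbiased=False) are assumptions, not values from the repo.
    """
    mean = wav.mean(dim=-1, keepdim=True)
    var = wav.var(dim=-1, unbiased=False, keepdim=True)
    return (wav - mean) / torch.sqrt(var + eps)


# Falls back to CPU when no GPU is available, so the sketch runs anywhere.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Synthetic stand-in for a (batch, samples) waveform with an offset and small gain.
wav_input = torch.randn(1, 16000, device=device) * 0.1 + 0.5

normalized = zero_mean_unit_var(wav_input)  # stays on the same device as wav_input
```

Since every op here is a plain tensor op, there is no round trip through numpy and no host/device copy.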

@zhvng
Owner

zhvng commented Mar 15, 2023

hey @LWprogramming, thanks for taking a look at the code! The input is normalized to zero mean and unit variance when loading the data here. The normalize argument is set when initializing the datasets in trainer, so the data is normalized before it is cropped.

Alternatively we could normalize the cropped input right before passing it into MERT. Not sure which would be better, but normalizing it in the beginning made more sense to me.
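To make the difference between the two orderings concrete, here is a small sketch on a synthetic ramp signal. The `normalize` helper, the clip length, and the crop bounds are all made up for illustration and are not the repo's dataset code. It shows that a crop taken from an already-normalized clip generally does not itself have zero mean and unit variance, whereas normalizing after cropping does.

```python
import torch


def normalize(wav: torch.Tensor, eps: float = 1e-7) -> torch.Tensor:
    # Hypothetical helper: zero mean, unit (population) variance over time.
    mean = wav.mean(dim=-1, keepdim=True)
    var = wav.var(dim=-1, unbiased=False, keepdim=True)
    return (wav - mean) / torch.sqrt(var + eps)


# Synthetic clip: a linear ramp, chosen so the crop's statistics clearly
# differ from those of the full clip.
clip = torch.linspace(0.0, 1.0, steps=16000)
crop = slice(0, 4000)  # first quarter of the clip

early = normalize(clip)[crop]  # normalize the whole clip, then crop
late = normalize(clip[crop])   # crop first, then normalize the crop

early_mean = early.mean().item()
early_std = early.std(unbiased=False).item()
late_mean = late.mean().item()
late_std = late.std(unbiased=False).item()

print(f"normalize then crop: mean={early_mean:.3f}, std={early_std:.3f}")
print(f"crop then normalize: mean={late_mean:.3f}, std={late_std:.3f}")
```

A ramp is a deliberately extreme case; for real audio the gap is likely smaller, and which statistics the model prefers is still an open question here.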

@zhvng
Owner

zhvng commented Mar 15, 2023

also, your comment made me realize there was an issue in the infer coarse script: I forgot to normalize the audio before passing it in. fixed in c9a167e!
