Sanity-check: good result for 2M steps? #89

branlsnyder · 2022-06-17T09:59:16Z

Hi all,

I've been training a RAVE model (via Google Colab Pro) for about 2M steps* and wanted to share my reconstruction results and ask whether this is appropriate quality for 2M steps (or whether I'm doing something suboptimally).

My dataset is circa 3 hours of Haegeum playing (a solo string instrument with sparse percussion accompaniment), preprocessed to remove silence and resampled at 48 kHz.
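
For readers who want to reproduce the silence-removal step, here is a minimal sketch of one common approach: a frame-wise RMS gate. It is not the preprocessing the poster actually ran. The function name, the 100 ms frame size and the -50 dB threshold are all assumptions chosen for illustration, and resampling to 48 kHz is left to a dedicated audio tool.

```python
import math
from typing import List, Sequence


def drop_silent_frames(
    samples: Sequence[float],
    frame_len: int = 4800,        # assumed: 100 ms frames at 48 kHz
    threshold_db: float = -50.0,  # assumed silence threshold, relative to full scale
) -> List[float]:
    """Keep only the frames whose RMS level is above threshold_db."""
    threshold = 10.0 ** (threshold_db / 20.0)  # convert dB to linear amplitude
    kept: List[float] = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        if rms > threshold:
            kept.extend(frame)
    return kept
```

A threshold that is too aggressive can also cut quiet note tails, so it is worth listening to the gated audio before training on it.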

To test reconstruction, I ask it to reconstruct these audio files, which are more Haegeum music (not used in the dataset).

My reconstructed sounds. You can recognize the Haegeum under a fair amount of noise. Of course, I'm hoping this noise goes away with training. Those who have trained good models, would you say this result sounds appropriate to you?

*maybe a little less. TensorFlow says 2M, but that might not be where the last checkpoint is at?
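
One way to settle the step-count question is to read the counter stored in the checkpoint itself. The sketch below assumes the checkpoint was written by PyTorch Lightning, which stores `global_step` and `epoch` as top-level keys. The helper names are made up for illustration, and the path in the usage note is a placeholder.

```python
from typing import Any, Mapping


def load_checkpoint(path: str) -> Mapping[str, Any]:
    """Load a checkpoint onto the CPU. Requires PyTorch to be installed."""
    import torch  # imported lazily so describe_checkpoint works without it

    # Newer PyTorch releases may need weights_only=False for full Lightning checkpoints.
    return torch.load(path, map_location="cpu")


def describe_checkpoint(ckpt: Mapping[str, Any]) -> str:
    """Summarise the training position stored in a Lightning-style checkpoint."""
    step = ckpt.get("global_step")  # None if the key is missing
    epoch = ckpt.get("epoch")
    return f"global_step={step}, epoch={epoch}"
```

Calling `describe_checkpoint(load_checkpoint("path/to/last.ckpt"))` on the checkpoint that will actually be exported shows whether it is at 2M steps or somewhat earlier.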

jreus · 2022-06-17T10:10:19Z

To me this does not sound like a good reconstruction after 2M steps, even with just the first stage of training! It sounds like the model is somehow missing the transients and the pitch information of the instrument. I'm not sure why you're not getting better results... It could be that there's not enough audio in your dataset, or that there's too much noise or variation in it.

1 reply

branlsnyder Jun 17, 2022
Author

Hmm, I believe my dataset has enough audio, as it is circa 3 hours of fairly similar music. The only other thing I can think of with the dataset is that it sounds like the different performances (all by the same performer) were recorded with different microphones in different ambient spaces. However, I would dare to say those differences are quite minimal.

I noticed that, when training, Google Colab only ever writes output to the /runs folder; nothing ever appears in the temporary folder. I'm not sure whether I should see something in the temp folder or not...
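
To see exactly what training has written, a small directory walk can help. This is a sketch using only the Python standard library. The `.ckpt` suffix and the `runs` folder name in the usage part are assumptions based on the post, so adjust both to your setup.

```python
import os
import time
from typing import List, Tuple


def list_checkpoints(root: str, suffix: str = ".ckpt") -> List[Tuple[str, int, float]]:
    """Return (path, size_in_bytes, mtime) for files under root ending in suffix, newest first."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                path = os.path.join(dirpath, name)
                info = os.stat(path)
                found.append((path, info.st_size, info.st_mtime))
    return sorted(found, key=lambda item: item[2], reverse=True)


if __name__ == "__main__":
    # "runs" is the folder name mentioned in the post; prints nothing if it does not exist.
    for path, size, mtime in list_checkpoints("runs"):
        stamp = time.strftime("%Y-%m-%d %H:%M", time.localtime(mtime))
        print(f"{stamp}  {size / 1e6:8.1f} MB  {path}")
```

The modification times show which checkpoint is the most recent one, and the sizes give a first hint about whether the saved model is as large as expected.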

accelotron · 2022-06-19T20:48:52Z

I'm having a similar issue. I trained a small RAVE model for 1.1M iterations on the NSynth dataset (63k velocities with most notes), but the model fails to reconstruct high-pitched sounds. I tried different sets of training data and removing noise, but this didn't help.

Are there any updates on your side?

2 replies

accelotron Jul 13, 2022

I trained a bigger model and the problem was resolved :)
Sometimes you just need more parameters so the model can make sense of a dataset as diverse as NSynth.

branlsnyder Jul 17, 2022
Author

Glad to hear yours worked out. I still haven't figured out mine. I exported the RAVE model and noticed it was only 1 MB in size, which tells me something is definitely off with the model. I might try to make a more detailed post with everything going on. I'm not sure what I may be missing.
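
As a rough sanity check on that file size, the arithmetic can be sketched like this. It assumes weights are stored as 32-bit floats (4 bytes each) and ignores any file-format overhead, so the result is only an upper bound. The function name is made up for illustration.

```python
def approx_param_count(file_size_bytes: int, bytes_per_param: int = 4) -> int:
    """Upper bound on the number of parameters that fit in a file of the given size."""
    return file_size_bytes // bytes_per_param


# A 1 MB export can hold at most about 250,000 float32 values.
print(approx_param_count(1_000_000))  # 250000
```

Comparing this bound with the parameter count in the model summary printed at the start of training (if your setup prints one) indicates whether the export contains the whole model.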
