Sanity-check: good result for 2M steps? #89
Replies: 2 comments 3 replies
-
To me this does not sound like a good reconstruction after 2M steps, even just with the first stage of training! It sounds like the model is somehow missing the transients and the pitch information of the instrument. I'm not sure why you're not getting better results... it could be that there's not enough audio in your dataset, or that there's too much noise / variation in it.
-
I'm having a similar issue. I trained a small RAVE model for 1.1M iterations on the NSynth dataset (63k velocities with most notes), but the model fails to reconstruct high-pitched sounds. I tried different sets of training data and removing noise, but this didn't help. Are there any updates on your side?
-
Hi all,
How does my reconstruction sound?
I've been training a RAVE model (via Google Colab Pro) for about 2M steps* and wanted to share my reconstruction results and ask whether this is appropriate quality for 2M steps (or whether I'm doing something suboptimally).
My dataset is about 3 hours of Haegeum playing (a solo string instrument with sparse percussion accompaniment), preprocessed to remove silence and resample to 48 kHz.
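For anyone wanting to reproduce the silence-removal step, here is a minimal sketch of one way it could be done. It is not necessarily what was used for this dataset, and it is not RAVE's own preprocessing: the function name `remove_silence`, the 20 ms frame size, and the -50 dB threshold are all assumptions chosen for illustration. It drops fixed-size frames whose RMS level falls below a threshold.

```python
import numpy as np


def remove_silence(audio, sample_rate, frame_ms=20.0, threshold_db=-50.0):
    """Drop frames whose RMS level (dB relative to full scale) is below threshold_db.

    Hypothetical helper for illustration; frame size and threshold are assumptions.
    `audio` is a 1-D float array with samples in [-1, 1].
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000.0))
    n_frames = len(audio) // frame_len
    if n_frames == 0:
        # Input shorter than one frame: nothing to analyse, return it unchanged.
        return audio.copy()

    # Split into non-overlapping frames; a trailing partial frame is discarded.
    frames = audio[: n_frames * frame_len].reshape(n_frames, frame_len)

    # Per-frame RMS converted to dB, with a floor to avoid log10(0).
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    level_db = 20.0 * np.log10(np.maximum(rms, 1e-12))

    # Keep only frames louder than the threshold and flatten back to 1-D.
    keep = level_db > threshold_db
    return frames[keep].reshape(-1)


if __name__ == "__main__":
    sr = 48000
    t = np.arange(sr) / sr
    tone = 0.5 * np.sin(2 * np.pi * 440.0 * t)
    clip = np.concatenate([tone, np.zeros(sr), tone])  # tone, 1 s silence, tone
    trimmed = remove_silence(clip, sr)
    print(len(clip) / sr, "s ->", len(trimmed) / sr, "s")
```

A hard per-frame cut like this can clip quiet note tails, so the threshold is worth checking by ear on a few files.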
To test reconstruction, I ask it to reconstruct these audio files, which are more Haegeum music (not used in the dataset).
My reconstructed sounds. You can recognize the Haegeum under a fair amount of noise. Of course, I'm hoping this noise goes away with training. Those who have trained good models, would you say this result sounds appropriate to you?
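Listening is the real test, but a rough number can help when comparing checkpoints. The sketch below computes a log-spectral distance between an original clip and its reconstruction using only NumPy. This is an assumption about how one might measure it, not a metric RAVE reports; the helper names `stft_magnitude` and `log_spectral_distance` and the FFT settings are hypothetical. It compares magnitude spectra only, so it does not penalize phase differences.

```python
import numpy as np


def stft_magnitude(audio, n_fft=2048, hop=512):
    """Magnitude spectrogram from Hann-windowed frames (no padding)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(audio) - n_fft) // hop
    frames = np.stack(
        [audio[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))


def log_spectral_distance(reference, estimate, n_fft=2048, hop=512, eps=1e-7):
    """Mean per-frame log-spectral distance in dB; 0.0 means identical spectra.

    Hypothetical helper for illustration. Both inputs are 1-D float arrays
    at the same sample rate; the longer one is truncated to match.
    """
    n = min(len(reference), len(estimate))
    if n < n_fft:
        raise ValueError("signals must be at least n_fft samples long")
    ref = stft_magnitude(reference[:n], n_fft, hop)
    est = stft_magnitude(estimate[:n], n_fft, hop)

    # Difference of magnitudes in dB; eps keeps log10 finite for empty bins.
    diff_db = 20.0 * (np.log10(ref + eps) - np.log10(est + eps))

    # RMS over frequency per frame, then average over frames.
    return float(np.mean(np.sqrt(np.mean(diff_db ** 2, axis=1))))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sr = 48000
    t = np.arange(sr) / sr
    clean = 0.5 * np.sin(2 * np.pi * 440.0 * t)
    noisy = clean + 0.05 * rng.standard_normal(sr)
    print("identical:", log_spectral_distance(clean, clean))
    print("noisy:    ", log_spectral_distance(clean, noisy))
```

The absolute value depends on the FFT settings and `eps`, so it is only meaningful for comparing reconstructions of the same files under the same settings.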
*Maybe a little less. TensorBoard says 2M, but that might not be where the last checkpoint is.