Regression in accuracy #419

meakbiyik · 2023-01-17T13:39:06Z

I wanted to create a separate issue for the problems I described in #354. Since 385236d, I have seen a severe regression in WER for noisy audio, of around 10-20%. I am attaching a noisy German audio clip with which I can reproduce this.

willy.mp4

The command I use for both the master branch and the above tag is:

./bin/stream -l de -m ./ggml-small.bin -kc -ac 512 -t 4 --step 1500 --length 10000

The expected transcription is:

Wir wollen mehr Demokratie wagen. Wir werden unsere Arbeitsweise öffnen und dem kritischen Bedürfnis nach Information Genüge tun.
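For anyone who wants to quantify the regression against the expected transcription above, here is a minimal sketch of a WER calculation in Python. It is not part of whisper.cpp. The function name `wer` and the normalization (lowercasing, stripping a few punctuation marks) are my own assumptions, and other WER tools may normalize differently and report slightly different numbers.

```python
# Expected transcription, copied from the issue text.
REFERENCE = (
    "Wir wollen mehr Demokratie wagen. Wir werden unsere Arbeitsweise "
    "öffnen und dem kritischen Bedürfnis nach Information Genüge tun."
)


def normalize(text: str) -> list[str]:
    """Lowercase, split on whitespace, and strip surrounding punctuation (an assumed normalization)."""
    words = (w.strip(".,!?;:") for w in text.lower().split())
    return [w for w in words if w]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by the reference length."""
    ref = normalize(reference)
    hyp = normalize(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Running `wer(REFERENCE, output)` on the text produced by each build, with the same audio and command, gives two numbers that can be compared directly.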

As noted in the previous issue, I suspect that the main problem is not the temperature or keep-context. My bet would be on either the loss of precision from the 32-bit to 16-bit conversions or some bug related to them, since this could directly hurt noise robustness (and possibly the overall quality of tiny models) without causing problems for high-SNR data and bigger models.
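To give a rough sense of how much precision a 16-bit float keeps, here is a small Python sketch that rounds a value to IEEE 754 half precision and back using the standard `struct` module. This only illustrates the general rounding behavior of half precision. It does not reproduce whisper.cpp's actual conversion code, and the helper names are hypothetical.

```python
import struct


def fp16_round_trip(x: float) -> float:
    """Round x to IEEE 754 half precision and convert it back to a Python float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def relative_error(x: float) -> float:
    """Relative error introduced by the half-precision round trip (x must be nonzero)."""
    return abs(fp16_round_trip(x) - x) / abs(x)


if __name__ == "__main__":
    for value in (1.0, 0.1, 2049.0):
        print(value, fp16_round_trip(value), relative_error(value))
```

Half precision stores an 11-bit significand, so for values in its normal range the relative rounding error is at most 2^-11 (about 0.05%). Whether errors of that size are enough to explain the regression is exactly the open question here.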


This bug has been present since v1.1.0. Effectively, the past transcribed text wasn't being used for following transcriptions, which likely significantly reduces the transcription quality. Likely related to #419

