Regression in accuracy #419
ggerganov added a commit that referenced this issue on Jan 22, 2023:
This bug has been present since v1.1.0. Effectively, the past transcribed text wasn't being used for following transcriptions, which likely significantly reduces the transcription quality. Likely related to #419
I wanted to create a separate issue for the problems I described in #354. Since 385236d, I have seen a severe regression in WER on noisy audio, of around 10-20%. I am attaching a noisy German audio clip with which I can reproduce this.
willy.mp4
The command I use on both the master branch and the tag above is:
./bin/stream -l de -m ./gglm-small.bin -kc -ac 512 -t 4 --step 1500 --length 10000
The expected transcription is:
Wir wollen mehr Demokratie wagen. Wir werden unsere Arbeitsweise öffnen und dem kritischen Bedürfnis nach Information Genüge tun.
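For anyone who wants to put a number on the regression, here is a minimal sketch of a word error rate (WER) calculation against the expected transcription above. It is not part of whisper.cpp; the function names and the normalization (lowercasing, dropping punctuation) are my own assumptions, and the hypothesis string is made up for illustration, not real output from either build.

```python
REFERENCE = (
    "Wir wollen mehr Demokratie wagen. Wir werden unsere Arbeitsweise öffnen "
    "und dem kritischen Bedürfnis nach Information Genüge tun."
)

def normalize(text: str) -> list[str]:
    """Lowercase, drop punctuation, split on whitespace (assumed normalization)."""
    kept = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return kept.split()

def wer(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = normalize(reference), normalize(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = cur[j - 1] + 1  # extra word in hypothesis
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)

# Hypothetical hypothesis with two wrong words, for illustration only.
hypothesis = REFERENCE.replace("Demokratie", "Demokratien").replace("Genüge", "genug")
print(f"WER: {wer(REFERENCE, hypothesis):.3f}")
```

Running the same audio through both builds and comparing the two WER values against this reference would make the size of the regression easier to discuss.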
As noted in the previous issue, I suspect that the main problem is not the temperature or keep-context settings. My bet would be on either the loss of precision from 32-bit to 16-bit conversions or some bug related to them, since this can directly hurt noise robustness (and possibly the overall quality of tiny models) without causing problems for high-SNR data and bigger models.
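To illustrate the kind of precision loss I mean, here is a small sketch that rounds values to IEEE 754 half precision and back using Python's `struct` module. It does not use or reproduce whisper.cpp's conversion code; the helper names are hypothetical and the sample values are arbitrary, chosen only to show how the error grows for small magnitudes.

```python
import struct

def fp16_round_trip(x: float) -> float:
    """Round x to IEEE 754 half precision ('e' format) and convert back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def relative_error(x: float) -> float:
    """Relative error introduced by the half-precision round trip (x != 0)."""
    return abs(fp16_round_trip(x) - x) / abs(x)

# Arbitrary sample magnitudes: normal range, subnormal range, and below it.
for x in (0.1, 1.0 / 3.0, 1e-3, 1e-5, 1e-8):
    print(f"{x:.3e} -> {fp16_round_trip(x):.3e}  (relative error {relative_error(x):.2e})")
```

Values in the normal half-precision range keep a relative error below 2^-11, but very small values lose most of their precision or flush to zero. Whether low-level activations for noisy audio actually fall in that range is exactly what I have not verified.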