[Bug] ValueError: No default align-model for language: sv #57

dazWiLLiE · 2024-08-07T14:55:06Z

Steps to reproduce

Windows.

Downloaded the latest release, already have ffmpeg installed.

Transcription Language: Swedish
Audio source: file (file.mkv)
Transcription method: Whisper X
Output filetype: srt

Clicked on "Generate transcription"

Took around an hour, then I got:

Traceback (most recent call last):
  File "handlers\whisperx_handler.py", line 53, in transcribe_file
  File "whisperx\alignment.py", line 71, in load_align_model
    raise ValueError(f"No default align-model for language: {language_code}")
ValueError: No default align-model for language: sv

An .srt file was created, and looking at the result (here are the first 11 lines):

1
00:00:23,660 --> 00:00:52,381
–Trodde du att jag hade glömt bort dig? –Risto, vad är det du gör? –Varför betalar du inte för? –Jag har inte sett nåt! Jesper! Jag betalar för att du får dubbelt så mycket jag lovar! –Jag vill inte! –Risto, gör inte det! –Titta på mig! –Titta mig i ögonen!

2
00:00:59,838 --> 00:01:01,510
För en väckbara.

3
00:03:30,452 --> 00:03:59,684
–Vad är det som har hänt? –Jag kan tyvärr inte berätta. –Jag ska besöka en vän som bor här. –Vad heter den personen? –Jakob Fivel. –Jag ska kalla på nån. Vad sa du nyligen?

It seems to do a decent job, but it can't split the dialogue correctly.

Perhaps it's because there is no alignment model?
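
To put a number on how long the unsplit cues are, here is a minimal sketch that reads the timestamps of an .srt file and reports each cue's duration. The helper name `cue_durations_ms` is made up for illustration, and the sample uses only the three timestamp lines quoted above, with the subtitle text omitted.

```python
import re

# Matches an SRT timing line such as "00:00:23,660 --> 00:00:52,381".
TIMESTAMP = re.compile(
    r"(\d+):(\d+):(\d+),(\d+) --> (\d+):(\d+):(\d+),(\d+)"
)


def to_ms(hours, minutes, seconds, millis):
    """Convert one SRT timestamp to integer milliseconds."""
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def cue_durations_ms(srt_text):
    """Return the duration in milliseconds of every cue in srt_text."""
    durations = []
    for match in TIMESTAMP.finditer(srt_text):
        h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, match.groups())
        durations.append(to_ms(h2, m2, s2, ms2) - to_ms(h1, m1, s1, ms1))
    return durations


# Timing lines copied from the excerpt above; cue text left out.
sample = """1
00:00:23,660 --> 00:00:52,381
(text omitted)

2
00:00:59,838 --> 00:01:01,510
(text omitted)

3
00:03:30,452 --> 00:03:59,684
(text omitted)
"""

for index, duration in enumerate(cue_durations_ms(sample), start=1):
    print(f"cue {index}: {duration / 1000:.3f} s")
```

Cues 1 and 3 each span close to half a minute and contain several speaker turns, which is consistent with segments that were never split by an alignment step.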


HenestrosaDev · 2024-08-07T15:06:58Z

Does the program transcribe the entire file?
As for the splitting part, it's indeed due to the lack of an alignment model for the language.

dazWiLLiE · 2024-08-07T15:12:32Z

Yes, the saved file includes timestamps up to 1h26m, so that should be correct.
Is there an alignment model for Swedish?

HenestrosaDev · 2024-08-07T15:22:47Z

WhisperX doesn't have a built-in alignment model for Swedish. I'd have to look into adding alignment models for the languages that WhisperX doesn't support, which may take a while.

HenestrosaDev · 2024-08-07T15:33:34Z

As a temporary fix, try to do the following:

1. Open the audiotext-v2.3.0 folder.
2. Open this file: _internal > whisperx > alignment.py
3. Add the following line below the line "ro": "anton-l/wav2vec2-large-xlsr-53-romanian":
```
    "sv": "KBLab/wav2vec2-large-voxrex-swedish"
```

Don't forget to add a comma at the end of the "ro": ... line, so that the new "sv" entry is a separate item in the dictionary.
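
The idea behind this workaround can be sketched as a plain dictionary lookup with an optional override. This is a simplified stand-in, not WhisperX's actual code: the function name `resolve_align_model`, its parameters, and the two-entry table are assumptions made for illustration. Only the model names and the error text come from this thread.

```python
# Simplified stand-in for a language -> alignment-model lookup.
# Illustrative only; WhisperX's real implementation may differ.
DEFAULT_ALIGN_MODELS = {
    "ro": "anton-l/wav2vec2-large-xlsr-53-romanian",
    "sv": "KBLab/wav2vec2-large-voxrex-swedish",  # entry added by the workaround
}


def resolve_align_model(language_code, model_name=None, table=None):
    """Return an explicitly requested model, else the default for the language."""
    if table is None:
        table = DEFAULT_ALIGN_MODELS
    if model_name is not None:
        # An explicit model name bypasses the default table entirely.
        return model_name
    if language_code in table:
        return table[language_code]
    raise ValueError(f"No default align-model for language: {language_code}")


print(resolve_align_model("sv"))

try:
    resolve_align_model("yo")
except ValueError as error:
    print(error)
```

With no entry and no explicit model name, the lookup has nothing to return, which is the situation the reported `ValueError` describes.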

dazWiLLiE · 2024-08-07T15:40:20Z

Thank you. I'll try it right away.

dazWiLLiE · 2024-08-07T15:58:12Z

Now I got:

Traceback (most recent call last):
  File "handlers\whisperx_handler.py", line 53, in transcribe_file
  File "whisperx\alignment.py", line 71, in load_align_model
    Please find a wav2vec2.0 model finetuned on this language in https://huggingface.co/models, then pass the model name in --align_model [MODEL_NAME]")
ValueError: No default align-model for language: sv

Edit:

alignment.py

DEFAULT_ALIGN_MODELS_HF = {
    "ja": "jonatasgrosman/wav2vec2-large-xlsr-53-japanese",
    "zh": "jonatasgrosman/wav2vec2-large-xlsr-53-chinese-zh-cn",
    "nl": "jonatasgrosman/wav2vec2-large-xlsr-53-dutch",
    "uk": "Yehor/wav2vec2-xls-r-300m-uk-with-small-lm",
    "pt": "jonatasgrosman/wav2vec2-large-xlsr-53-portuguese",
    "ar": "jonatasgrosman/wav2vec2-large-xlsr-53-arabic",
    "cs": "comodoro/wav2vec2-xls-r-300m-cs-250",
    "ru": "jonatasgrosman/wav2vec2-large-xlsr-53-russian",
    "pl": "jonatasgrosman/wav2vec2-large-xlsr-53-polish",
    "hu": "jonatasgrosman/wav2vec2-large-xlsr-53-hungarian",
    "fi": "jonatasgrosman/wav2vec2-large-xlsr-53-finnish",
    "fa": "jonatasgrosman/wav2vec2-large-xlsr-53-persian",
    "el": "jonatasgrosman/wav2vec2-large-xlsr-53-greek",
    "tr": "mpoyraz/wav2vec2-xls-r-300m-cv7-turkish",
    "da": "saattrupdan/wav2vec2-xls-r-300m-ftspeech",
    "he": "imvladikon/wav2vec2-xls-r-300m-hebrew",
    "vi": 'nguyenvulebinh/wav2vec2-base-vi',
    "ko": "kresnik/wav2vec2-large-xlsr-korean",
    "ur": "kingabzpro/wav2vec2-large-xls-r-300m-Urdu",
    "te": "anuragshas/wav2vec2-large-xlsr-53-telugu",
    "hi": "theainerd/Wav2Vec2-large-xlsr-hindi",
    "ca": "softcatala/wav2vec2-large-xlsr-catala",
    "ml": "gvs/wav2vec2-large-xlsr-malayalam",
    "uz": "rifkat/wav2vec2-large-xls-r-300m-uz",
    "ro": "anton-l/wav2vec2-large-xlsr-53-romanian",
    "sv": "KBLab/wav2vec2-large-voxrex-swedish"
}
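
One way to double-check an edit like this is to parse the file's text and list the keys of the dictionary. The sketch below uses only the standard library. The helper name `dict_keys_in_source` is made up, and the sample is a shortened copy of the dictionary above. It confirms only that the text is valid Python and contains the key; it says nothing about whether the packaged application actually loads that file.

```python
import ast


def dict_keys_in_source(source, variable_name):
    """Return the constant keys of a top-level dict literal bound to variable_name."""
    tree = ast.parse(source)  # raises SyntaxError if, e.g., a comma is missing
    for node in tree.body:
        if not isinstance(node, ast.Assign):
            continue
        names = [t.id for t in node.targets if isinstance(t, ast.Name)]
        if variable_name in names and isinstance(node.value, ast.Dict):
            return [k.value for k in node.value.keys if isinstance(k, ast.Constant)]
    raise KeyError(variable_name)


# Shortened copy of the edited dictionary from this comment.
edited = '''
DEFAULT_ALIGN_MODELS_HF = {
    "ro": "anton-l/wav2vec2-large-xlsr-53-romanian",
    "sv": "KBLab/wav2vec2-large-voxrex-swedish"
}
'''

keys = dict_keys_in_source(edited, "DEFAULT_ALIGN_MODELS_HF")
print("sv" in keys)
```

If the comma after the "ro" line were missing, `ast.parse` would raise a `SyntaxError` instead of returning the keys.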

HenestrosaDev · 2024-08-07T16:28:04Z

Okay, it seems I'll have to take a deeper look into this. I'll keep the issue open until I find a way to solve it.

dazWiLLiE · 2024-08-07T16:29:42Z

Great, thanks!

olawalejuwonm · 2024-09-19T07:38:26Z

Hi. I also got the same error for the Yoruba language:

No default align-model for language: yo

What's the temporary fix for that?

HenestrosaDev changed the title ~~[Bug] Windows, a few errors~~ [Bug] ValueError: No default align-model for language: sv Sep 4, 2024
