Fix vague language codes that cause wrong recognition results #136

Open
wants to merge 5 commits into
base: master

BingLingGroup

autosub uses the same set of language codes for both src_language and dst_language, but those codes are not specific enough for the API to determine the language. According to the speech-to-text docs and the translate docs, the Speech-to-Text API language codes differ from the Translation API language codes. Even the Simplified Chinese version of the docs differs from the English version, which makes this quite troublesome.

Simplified Chinese version docs screenshot

English version docs screenshot

You can see that the Chinese language codes differ between these two docs, and this really matters in some cases.

By the way, although autosub still uses the old version of the Google API for its requests, Google has replaced the old docs with new ones. In my tests, which I describe later in this post, at least some of the new codes worked better than the previous ones.

In this case, Google does not report that the language code is vague, nor does it refuse to recognize the speech. Instead, it recognizes the speech using the localized variant of the language. For example, spoken Chinese includes Cantonese, which is used in Hong Kong, and Mandarin, which is the official language of mainland China. When someone uses the arguments -S zh-CN -D zh-CN or -S zh -D zh (I modified constant.py to test this), like the codes in the English docs, to recognize Mandarin Chinese from a Hong Kong IP, the audio is mistakenly recognized as Cantonese. This was also mentioned in #112 (although in Chinese).
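As a rough illustration of the idea (not the code in this PR), vague codes could be mapped to more specific speech codes before the request is sent. The table name `SPECIFIC_SPEECH_CODES` and the helper `to_speech_code` are hypothetical, and the specific codes are taken from my reading of the current Speech-to-Text docs, so treat them as assumptions.

```python
# Hypothetical lookup table, not taken from autosub's constant.py.
# The right-hand codes are assumed from the current Speech-to-Text docs.
SPECIFIC_SPEECH_CODES = {
    "zh": "cmn-Hans-CN",     # Mandarin, Simplified, mainland China
    "zh-CN": "cmn-Hans-CN",  # Mandarin, Simplified, mainland China
    "zh-TW": "cmn-Hant-TW",  # Mandarin, Traditional, Taiwan
}


def to_speech_code(code):
    """Return a more specific speech code, or the input unchanged."""
    return SPECIFIC_SPEECH_CODES.get(code, code)
```

With a table like this, `to_speech_code("zh-CN")` asks for Mandarin explicitly instead of leaving the choice of variant to the API, while codes that are not in the table pass through untouched.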

So I modified constant.py and __init__.py to use the new lang codes. I didn't test the translation API, but I think it is usable since the docs above describe this usage. I also fixed the logic bug that occurs when -S is given and -D is not. I hope you can review it, and thank you very much for your work on autosub.
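A minimal sketch of how the two arguments could be checked against separate code sets is shown here. It is only one plausible shape, not the actual change in constant.py or __init__.py: the names `SPEECH_CODES`, `TRANSLATION_CODES`, and `resolve_languages` are hypothetical, the code sets are tiny illustrative samples, and treating a missing -D as "no translation" is an assumption.

```python
# Hypothetical, abbreviated code sets; the real lists are much longer.
SPEECH_CODES = {"cmn-Hans-CN", "cmn-Hant-TW", "yue-Hant-HK", "en-US"}
TRANSLATION_CODES = {"zh-CN", "zh-TW", "en"}


def resolve_languages(src, dst=None):
    """Validate -S against speech codes and -D against translation codes.

    Assumption: when dst is None (only -S given), no translation is wanted,
    so the function returns (src, None) instead of failing.
    """
    if src not in SPEECH_CODES:
        raise ValueError("unsupported speech language code: %s" % src)
    if dst is None:
        return src, None
    if dst not in TRANSLATION_CODES:
        raise ValueError("unsupported translation language code: %s" % dst)
    return src, dst
```

For example, `resolve_languages("cmn-Hans-CN")` would return the source code with no destination, while `resolve_languages("cmn-Hans-CN", "en")` would return both codes.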

Below is the test:

No offense intended, but I took screenshots of the bug mentioned in #87.

Confirming the Hong Kong IP

Recognizing the Mandarin Chinese clip

The result is totally wrong

Using the zh-TW lang code

zh-TW is the Taiwan variant of Mandarin. At least when spoken, the two are almost the same.

The same wrong result as with zh-CN

What about the en recognition in Hong Kong?

I changed the audio to an English clip to rule out the concern that Hong Kong might simply be a bad location for Google's speech-to-text recognition.

It works just fine. At least the result matched the lang code.

Now I switched back to my modified code, which can use the new lang codes.

This is test_v3. At least the API accepts it.

Finally, it recognized the speech and gave a probably correct result.
