Japanese kanji search reads how it's read, not how it's written #146

Closed
Nishizuki opened this issue Sep 26, 2021 · 2 comments

Comments

@Nishizuki

As the title says. For example, searching for 君 matches every character read as "jun", while searching for 春 matches every character read as "chun".

@rdoeffinger
Owner

Thanks for pointing that out.
The basic issue here is that the dictionaries are based on a single sorted index (that's the order you see when scrolling through), and that index is based on a transliteration of whatever was entered.
There is then supposed to be a second step that searches through all matches with the same transliteration to find any that are an exact match to the input, which should handle this case.
However, I've only ever tested it on Spanish, and it's obviously not working right for Japanese. I don't know when I will have time, but I hope it is not that hard to fix, and fixing it should resolve the worst usability issue.
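A minimal sketch of the two-step lookup described above, not the app's actual code. The names `TOY_READINGS`, `transliterate`, `build_index`, and `lookup` are hypothetical, and the reading table is a toy stand-in for whatever transliteration the app really uses.

```python
from bisect import bisect_left

# Hypothetical toy table; the app's real transliteration rules are not shown in this thread.
TOY_READINGS = {"君": "jun", "軍": "jun", "均": "jun", "春": "chun", "純": "chun"}


def transliterate(text):
    """Map each character to its toy reading; unknown characters are lowercased."""
    return "".join(TOY_READINGS.get(ch, ch.lower()) for ch in text)


def build_index(words):
    """Single sorted index of (transliterated key, written form) pairs."""
    return sorted((transliterate(w), w) for w in words)


def lookup(index, query):
    """Return the position the search should jump to for `query`."""
    key = transliterate(query)
    keys = [k for k, _ in index]
    # Step 1: first entry whose transliteration is >= the query's transliteration.
    start = bisect_left(keys, key)
    # Step 2: among entries sharing that transliteration, prefer an exact written match.
    i = start
    while i < len(index) and keys[i] == key:
        if index[i][1] == query:
            return i
        i += 1
    # No exact match: fall back to the first entry with that transliteration.
    return start
```

With this sketch, `lookup(build_index(["君", "軍", "均", "春", "純"]), "軍")` lands on 軍 itself rather than on the first entry read as "jun". If step 2 is skipped or broken, every query with the same reading lands on the same entry, which is the behaviour reported in this issue.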
A harder problem to fix is the complete reliance on a transliterated index. Ideally, languages like Japanese would have, for example, a secondary index that works the way Japanese dictionaries do and that you could choose to use. Between lack of time and lack of Japanese knowledge on my side, that probably will not happen any time soon, or at all.
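One way to picture a secondary index is to sort the same entries a second time under a different key. The sketch below is only an illustration under stated assumptions: the sample entries, the choice of a kana reading as the second key, and all function names are hypothetical, and plain code point order only approximates real Japanese dictionary ordering.

```python
from bisect import bisect_left

# Hypothetical sample data: (written form, toy transliteration, kana reading).
ENTRIES = [
    ("君", "jun", "きみ"),
    ("春", "chun", "はる"),
    ("軍", "jun", "ぐん"),
]


def translit_key(entry):
    return entry[1]


def kana_key(entry):
    return entry[2]


def build_index(entries, key_fn):
    """Return the entries sorted by key_fn: one sorted index per key function."""
    return sorted(entries, key=key_fn)


def prefix_matches(index, key_fn, prefix):
    """Written forms of all entries whose key starts with `prefix`."""
    keys = [key_fn(e) for e in index]
    i = bisect_left(keys, prefix)
    found = []
    while i < len(index) and keys[i].startswith(prefix):
        found.append(index[i][0])
        i += 1
    return found


by_translit = build_index(ENTRIES, translit_key)  # the existing kind of index
by_kana = build_index(ENTRIES, kana_key)          # the assumed secondary index
```

Here `prefix_matches(by_translit, translit_key, "jun")` returns both 君 and 軍, while `prefix_matches(by_kana, kana_key, "き")` returns only 君. The cost of this design is a second sorted copy of the keys per dictionary.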

rdoeffinger added a commit that referenced this issue Dec 4, 2021
We need to wind back to first potential match or
we might miss exact matches.
Also improve some comments to better explain what
the different scanning back and forward cases do.
Fixes issue #146, though it is really a bugfix for the
fix to issue #131.
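The commit itself is not shown here, so the following is only a sketch of why winding back matters, with hypothetical names throughout. A plain binary search can land in the middle of a run of entries that share a transliteration; scanning forward from there would skip any exact match that sits earlier in the run.

```python
def find_any(keys, key):
    """Plain binary search: index of some entry equal to `key`, or -1."""
    lo, hi = 0, len(keys) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if keys[mid] == key:
            return mid
        if keys[mid] < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1


def wind_back(keys, pos):
    """Step back to the first entry of the run of equal keys containing `pos`."""
    while pos > 0 and keys[pos - 1] == keys[pos]:
        pos -= 1
    return pos


def find_exact(index, key, written):
    """Index of the entry with this key and written form, or -1 if absent."""
    keys = [k for k, _ in index]
    pos = find_any(keys, key)
    if pos < 0:
        return -1
    pos = wind_back(keys, pos)  # without this, earlier entries in the run are missed
    while pos < len(index) and keys[pos] == key:
        if index[pos][1] == written:
            return pos
        pos += 1
    return -1
```

For the toy index `[("chun", "春"), ("jun", "君"), ("jun", "均"), ("jun", "軍"), ("zhong", "中")]`, the binary search for "jun" lands on 均 at position 2. Only after winding back to position 1 can the forward scan find 君.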
@rdoeffinger
Owner

I released a fixed version, so I will close this one, even though it does not resolve the issue of over-reliance on transliteration.
