I18N objections to reducing accept-language #10
Comments
Thanks for the information. We are not reducing the accept-language header to just the first entry. What we do is reduce the accept-language header to the most preferred representation. Only the browser knows the user's full preference list, so the browser does the language negotiation and sends the most preferred language to the site.
Negotiation requires backend support, so servers all over the world would need to change. That is a million times harder than changing a single browser. And websites that rely only on client-side I18N (where the server doesn't support this) can't negotiate at all.
The majority of multilingual sites don't care about what is actually set in the accept-language header. Most sites rely on geolocation or a UI-driven choice to redirect users to the correct language page.
It is useful to provide language selection in the UI. But this proposal mainly affects the default display language of a site; having to manually adjust the language every time you visit a new website is too troublesome. By the way, not many websites change the language based on the user's geographical location.
Statically generated sites, e.g. those created by Jekyll or Hugo, are just static files. They can be deployed on any web server. They usually use a pure client-side approach to multi-language support, without relying on the server.
I mean the change only impacts sites that use the accept-language header to send the right language page to users; it does not force sites to implement their multilingual feature through the UI.
Do those tools use the accept-language header in requests, or a JS interface, to get users' language preferences?
My understanding of "ui-driven to redirect users to the correct language page" is a language selection list in the site's UI.
Doesn't rely on the accept-language header. Use |
We are first considering reducing the accept-language header. The JS interface probably needs to be handled differently, as most feedback and concerns relate to it. We will double-check and review the impact once we actually roll out the change.
I think there are many websites that care about the accept-language header. As I said before, it may be a very long process for them to support the negotiation.
Some sites, including some quite large sites, depend on A-L as at least a hint to providing the right user experience for new users, even with geo-location or other UI. That's not the same as solely relying on the header's contents. Language negotiation and management for users is not entirely simple.
This is helpful. How will the browser know how to reduce the requested set?
+1 ... but change has to start somewhere. Could sites support legacy headers while also implementing a negotiation scheme? |
Yeah, I don't think we will land any change in the short term. We need to collect enough feedback and roll out slowly.
Sites can support both the reduced and the full accept-language header if they care about the language preference in the header. See the demo sites: https://developer.chrome.com/blog/origin-trial-for-accept-language-reduction/#demo
@aphillips I'm trying to bring back the discussion with the i18n group. I have some questions:
@aphillips any comments on how Safari is handled? I haven't been able to discover a lot of i18n bugs reported against WebKit given their current behavior.
@hanguokai if we exposed a |
Two things I want to point out here. First, web sites definitely do use Accept-Language information, including fallback languages. It is quite easy to see how this is done. For example, in Chrome:
I hit several sites with my local IP, which is currently somewhere in California, USA. Among the sites that obeyed my Accept-Language:
It is satisfying as an i18n engineer to see web sites do the right thing here. Second, fallback language can be useful for inferring other locale extensions. For example, a common Accept-Language may be something like
I can't speak to Safari's decisions regarding Accept-Language, except to point out that it's well established that users don't often report i18n-related issues; they assume it's their fault that something isn't working. If a user's true Accept-Language is something like I have pointed @Constellation to this discussion to weigh in from WebKit's point of view. |
I think the statement "safari since it only contains one language" is false.
From all I can see, Safari always includes one or more fallbacks in the Accept-Language header. Some headers have a second entry with a q value as a fallback; others have a third entry with a q value as an additional fallback. For a user who sets Gujarati as their primary language and Hindi as their secondary language in System Settings, Safari will even send a fourth entry with a q value as a third fallback ("gu-IN,gu;q=0.9,hi-IN;q=0.8,hi;q=0.7"). Below are results from Safari "Version 17.4.1 (19618.1.15.11.14)" on my Mac Air, macOS 14.4.1 (23E224). Here is what I see in the HTTP header if you change the UI to Hindi, English, and Traditional Chinese:
As you can see from the headers above, every Accept-Language value sent by Safari version 17.4.1 includes two (not one) languages. The first line includes hi-IN as the first language (Hindi as used in India), with a second language hi (Hindi not specific to any region of the world) as a fallback with a weight. Per the definition in HTTP 1.1, these lines all have two languages, not one! Accept-Language = 1#( language-range [ weight ] )
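To make the counting concrete, here is a minimal sketch of splitting an Accept-Language value into its language-range and weight pairs, following the grammar quoted above. The function name `parse_accept_language` is hypothetical, and this is a simplified reading of the grammar (a missing weight is treated as 1.0 and a malformed weight as 0.0), not any browser's or server's actual parser. The input is the Gujarati/Hindi header quoted earlier in this comment.

```python
def parse_accept_language(header: str) -> list[tuple[str, float]]:
    """Split an Accept-Language value into (language-range, weight) pairs.

    Simplified sketch: a missing weight counts as 1.0, a malformed weight
    as 0.0, and entries are returned highest weight first, keeping the
    original order for ties.
    """
    entries = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip()
        if not tag:
            continue  # skip empty list elements
        weight = 1.0
        params = params.strip()
        if params.lower().startswith("q="):
            try:
                weight = float(params[2:])
            except ValueError:
                weight = 0.0
        entries.append((tag, weight, position))
    entries.sort(key=lambda e: (-e[1], e[2]))
    return [(tag, weight) for tag, weight, _ in entries]


ranges = parse_accept_language("gu-IN,gu;q=0.9,hi-IN;q=0.8,hi;q=0.7")
for tag, weight in ranges:
    print(tag, weight)
```

Each comma-separated element is one language-range, so this header carries four of them, covering two distinct languages (Gujarati and Hindi).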
We don't mean that web sites are not using Accept-Language; what we saw is the number of sites using Accept-Language is low compared with the overall number of sites on the web.
Our proposal takes fallback languages into consideration. For example, if the primary language is
To be clear about the Safari case: "one language" means only one of user's preferred languages takes effect. For example, if a user sets two or more preferred languages as
Could you publish how many sites, and which ones, you collected data from in your research, and how you "saw" that? What experimental method did you use to reach that finding?
AND 2) not return Zhuang content if
AND 3) will NOT return French content if the Accept-Language: is set to
If all three of the conditions above are true, that means the site does not listen to the fallback, right? Otherwise, the site MAY listen to the fallback. (In that case, the website supports French but not Zhuang, yet will not return French when French is only in the fallback list and Zhuang is not supported.) Since you already "saw is the number of sites using Accept-Language is low", could you tell us the percentage? How low? 50%, 45%, or 40%?
If the user has
and the site has Canadian French (fr-CA) content, but not Algerian French (fr-DZ), how would your proposal make it return Canadian French instead of France French (fr-FR)?
First, if that is truly what you mean, then since in your proposal you also write
Could you please change that example to show
to make sure people understand that in your proposal Accept-Language will still output 2 language-ranges (since that also totally fits your interpretation of "contain only one", right?) as long as they come from one item.
as a better example, since then people won't mistakenly assume the second one can only be a substring of the first one plus ";q=0.9". Second, how did you conclude that? Do you have access to the Safari source code to verify it? What is the logic Safari currently uses to output those Accept-Language headers? Could you show me the algorithm by which it outputs
when I select Chinese (Traditional) and the second case Cantonese (Traditional)
Also, when a Gujarati user sets their System Settings to use Gujarati as the first language and Hindi as the second language, Safari will output the Accept-Language header as
and this is a clear counterexample to the statement "only one of user's preferred languages takes effect", since both Gujarati and Hindi are output in the Accept-Language header. They are listed as two different languages in the Eighth Schedule of The Constitution of India (language 5 and language 6 on p. 325 of The Constitution of India).
@FrankYFTang for |
"zh-HK" in 17.4.1(19618.1.15.11.14)on my MacOS 14.4.1(23E224) |
I gave several examples of high traffic web sites doing the right thing with Accept-Language. For the ones using a different inference mechanism such as GeoIP, there might be specific business needs like commerce, or they might just have not invested fully in proper internationalization, which is a problem I see too often.
You may be misunderstanding my second point. Note that I am using eu (Basque, less common) falling back to English. Also, your proposed Available-Languages response header doesn't seem practical to me. The best web sites are translated into at least 70 languages, sometimes more than 150. That's a lot of data to include. Accept-Language is comparatively small! |
Here is a study with statistics. Works Cited: Gration, Elizabeth. “Bilingualism Statistics in 2024: US, UK & Global.” Language
Also, certain countries in the world require multilingual support, since people in those countries usually use more than one language in their daily life:
Tier 1: High Importance
Tier 2: Moderate Importance
Tier 3: Lower Importance
Here are some technologies and frameworks that support Accept-Language fallback and that many websites are built on, according to my new BFF Gemini: Many popular web frameworks provide built-in support or mechanisms for handling the HTTP Accept-Language header and implementing fallback behavior. Here are a few examples:
Is that a true statement? I really doubt your claim. Please list 10 sites for which you think that is the case. I have a hard time finding any site on the web that does not use Accept-Language or Accept-Language fallback for language content negotiation.
Just to add to the ecosystem support research:
Node.js: https://www.npmjs.com/package/i18next-http-middleware is probably the most popular Node.js i18n plugin. It considers Accept-Language for language detection, with fallbacks to other language detection modes. I looked at the code and verified that it handles the fallback list, including
Python Django: https://docs.djangoproject.com/en/5.0/topics/i18n/ explains how it uses Accept-Language. The code appears to correctly handle fallback values.
WordPress: https://translatepress.com/docs/addons/automatic-user-language-detection/ is a plugin to use either Accept-Language or GeoIP for language detection. I could not find the source code so wasn't able to verify if it handles fallback.
Here is a real-life example of how a multilingual website uses Accept-Language in the Bay Area. Many Malaysia-born Chinese know both Malay and Chinese. If you configure their Chrome language setting to
The Fremont city government website supports English, Traditional Chinese, Simplified Chinese, Korean, Vietnamese, Spanish, Hindi, and Punjabi, but not Malay. If their website does not support Accept-Language fallback, the user will get English as the default. If Chrome implements your proposal, it will send out only
and get English as well. A site you can try: the Fremont city government website, https://www.fremont.gov/
I really have a hard time understanding how the goal of this proposal can be achieved.
Problem: ... As part of the Chrome team’s anti-covert tracking efforts, we would like to improve privacy protections by minimizing passive fingerprinting surfaces. ...
For users who never touch their language preferences, your proposal will not reduce any entropy, so it will have no impact on their privacy, right? Now, X% of users do bother to add additional items to their language preferences. Could you share with us what X is, based on your study? If X is very small, then your proposal will have a very tiny impact, right? If X is big, then your proposal will have a big impact, and that impact could be either positive or negative. But since you are proposing this change, it should be reasonable for me to ask you to share your estimate of X and the research method you used to arrive at it, right?
From Frank's comment above, we know that 43% of Web users are multilingual, and of those, some percentage will use a multilingual fallback list that deviates from what we could assume is the most likely fallback list for a particular language. This proposal most directly impacts the Web Platform experience for those users. |
One other comment about the proposal overall is that it moves Passive Fingerprinting to Active Fingerprinting, but it still allows fingerprinting; the app just needs an extra request to get the additional info. With my work on the Locale Extensions proposal, what I heard from other browser vendors is that they don't consider the shift to Active Fingerprinting to be a significant win for Web Platform privacy. |
Honestly, it's a bit hard to follow and/or respond to this wall of comments (just from today...) - it's coming off as very passionate and a little unfocused. I will assume the passion is coming from a good place, but I would appreciate if we can keep the feedback constructive. Thanks. |
My overall view on this proposal
Users care about their privacy, but they also care about the web browsing experience, so privacy and convenience need to be balanced. HTTP-Header and JS API are the cornerstones of Web i18n. The current problem is that the new proposal subverts it, and the cost of this change is so high (for users and developers) that it is doubtful whether it is really effective and worth pursuing. And the current i18n technology is already relatively complicated in practice.
Some details
I can't imagine all the server-side processing logic yet for this proposal. In practice I think it's going to be complicated to support this language negotiation process. |
Thanks @hanguokai - I appreciate your feedback, and it's noted. |
The W3C Internationalization Working Group discussed this proposal in our 2023-03-09 teleconference and I was actioned with creating this response.
The I18N WG is concerned that reducing the accept-language header to just the first entry, while perhaps helpful in reducing fingerprinting, will have a potentially negative impact on multilingual users. We are especially concerned with the potential impact on speakers of minority languages. A speaker of a minority language may find that many sites do not support their language, and thus they may want to specify a second language (generally a more common one) to ensure the best match for their preferences.
For example, a speaker who prefers Breton (br) [Breton is a regional language found mainly in France] is likely to also speak French (fr, or perhaps fr-FR). They would thus like to have an A-L header something like:

If the A-L header is reduced to a single entry, they would have to choose either French or Breton for their requests. If they chose Breton, they might find some sites defaulting them to another language (such as English), even when French is available.
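The effect described here can be sketched with a simplified server-side negotiation. Everything in this example is an assumption made for illustration: the header value `br, fr;q=0.9`, the site's language list (English and French only), and the helper names `parse_ranges` and `negotiate` are hypothetical. The matching is a rough approximation of prefix-truncating lookup and is not a complete implementation of any standard or of any particular site's logic.

```python
def parse_ranges(header: str) -> list[tuple[str, float]]:
    """Return (language-range, weight) pairs, highest weight first."""
    ranges = []
    for item in header.split(","):
        tag, _, param = item.strip().partition(";")
        tag, param = tag.strip(), param.strip()
        if not tag:
            continue
        q = 1.0  # a missing weight is treated as 1.0
        if param.lower().startswith("q="):
            try:
                q = float(param[2:])
            except ValueError:
                q = 0.0
        ranges.append((tag, q))
    # sorted() is stable, so equal weights keep their header order
    return sorted(ranges, key=lambda r: -r[1])


def negotiate(header: str, supported: list[str], default: str) -> str:
    """Pick the first supported language, trying each range in preference
    order and progressively dropping trailing subtags (fr-FR -> fr)."""
    by_lower = {lang.lower(): lang for lang in supported}
    for tag, q in parse_ranges(header):
        if q <= 0:
            continue  # weight 0 means "not acceptable"
        candidate = tag.lower()
        while candidate:
            if candidate in by_lower:
                return by_lower[candidate]
            candidate = candidate.rpartition("-")[0]
    return default


site_languages = ["en", "fr"]  # hypothetical site: no Breton content
full = negotiate("br, fr;q=0.9", site_languages, default="en")
reduced = negotiate("br", site_languages, default="en")
print(full, reduced)
```

With the full (hypothetical) header the site can fall back to French; with the header reduced to Breton alone it has nothing to go on and serves its default, English.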
We also note that for most users of most browsers the A-L header is usually set to a single entry matching the system runtime locale. Most users do not tailor this configuration. The users who do use the browser's user experience to modify their preferences are taking a specific positive action to assert what they want their browser to send on their behalf. Additional privacy warnings could be provided to users there, but users ought to be allowed to control what their browser emits on their behalf in order to receive the best possible Web experience.
Thanks!