new URL gives wrong result on leading backslash #36559

Closed
gnprice opened this issue Dec 18, 2020 · 1 comment
Labels
confirmed-bug Issues with confirmed bugs. url Issues and PRs related to the legacy built-in url module.

Comments

@gnprice

gnprice commented Dec 18, 2020

  • Version: v15.4.0
  • Platform: Linux 5b80145d2618 4.19.0-13-amd64 #1 SMP Debian 4.19.160-2 (2020-11-28) x86_64 GNU/Linux
  • Subsystem:

What steps will reproduce the bug?

Parse a relative-URL string that begins with a backslash. For example:

> new URL("\\x", "https://example/foo/bar").href
'https://example/foo//x'

How often does it reproduce? Is there a required condition?

Always

What is the expected behavior?

The leading backslash should behave exactly as a proper / would. An input with one leading backslash should be treated as a path-absolute URL (replacing the whole path from the base URL), and an input with two leading backslashes as a scheme-relative URL (replacing everything but the scheme from the base URL).

For example:
new URL("\\x", "https://example/foo/bar").href -> "https://example/x"
new URL("\\\\x", "https://example/foo/bar").href -> "https://x/"
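
A quick way to check this expectation is to compare each backslash input against its forward-slash equivalent. The sketch below is only an illustration: `withLeadingSlashes` and `resolvesLikeSlash` are hypothetical helper names, not part of Node.js, and the check assumes the global `URL` class is available.

```javascript
// Hypothetical helper (not part of Node.js): rewrites a run of leading
// backslashes to the same number of forward slashes and leaves the rest
// of the input untouched.
function withLeadingSlashes(input) {
  return input.replace(/^\\+/, (run) => '/'.repeat(run.length));
}

// Returns true when the backslash form and the forward-slash form of the
// input resolve to the same href against the given base URL.
function resolvesLikeSlash(input, base) {
  const actual = new URL(input, base).href;
  const expected = new URL(withLeadingSlashes(input), base).href;
  return actual === expected;
}

const base = 'https://example/foo/bar';
for (const input of ['\\x', '\\\\x']) {
  // Prints the input and whether it matched its forward-slash form.
  console.log(JSON.stringify(input), resolvesLikeSlash(input, base));
}
```

On an affected version such as v15.4.0, the first input would be expected to report `false`, since the reported result differs from that of /x.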

What do you see instead?

The leading backslash is treated incorrectly. The effect seems to be as if the input were a path-relative-URL string -- the base URL's path, except for its last component, appears in the result. In the example:

> new URL("\\x", "https://example/foo/bar").href
'https://example/foo//x'

Additional information

The behavior of new URL is documented as being defined by the WHATWG URL Standard. An input string like \x, with a leading backslash, is never a "valid URL string" as defined in that standard... but the standard nevertheless defines what the URL constructor should return for it.

Because the example input \x is so short, it's not hard to walk through the URL parser as defined in the URL Standard and confirm what result the standard calls for. For the base URL of https://example/, it goes from "scheme start state" to "no scheme state" to "relative state" to "relative slash state" to "path state". That is exactly the track an input of /x follows; the only difference is that \x emits a validation error. In the URL parser as defined by the URL Standard, a "validation error" does not affect the parser's result, so the resulting URL should be the same as for /x.
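
The two states that matter in that walk-through can be sketched in a few lines. This is a simplified model, not the full parser and not Node's implementation: it looks only at the first two code points, `classifyRelative` is a hypothetical name, and the `kind` labels are shorthand for where the parser heads next (authority parsing versus path parsing).

```javascript
// Simplified sketch of the "relative state" and "relative slash state"
// from the WHATWG URL Standard. Only the first two code points are
// inspected; everything else the real parser does is out of scope.
function classifyRelative(input, baseIsSpecial) {
  const validationErrors = [];
  const c0 = input[0];
  const c1 = input[1];

  // relative state
  if (c0 === '/') {
    // Continue to the relative slash state.
  } else if (baseIsSpecial && c0 === '\\') {
    // Same transition as "/", but a validation error is recorded.
    validationErrors.push('backslash at index 0');
  } else {
    // "?", "#", end of input, or a path-relative input: not modelled here.
    return { kind: 'other', validationErrors };
  }

  // relative slash state
  if (baseIsSpecial && (c1 === '/' || c1 === '\\')) {
    if (c1 === '\\') validationErrors.push('backslash at index 1');
    return { kind: 'scheme-relative', validationErrors };
  }
  if (c1 === '/') {
    return { kind: 'scheme-relative', validationErrors };
  }
  return { kind: 'path-absolute', validationErrors };
}

for (const input of ['/x', '\\x', '//x', '\\\\x']) {
  console.log(JSON.stringify(input), classifyRelative(input, true));
}
```

In this model, /x and \x land on the same classification for a special base such as https:, and the recorded validation errors never change that outcome.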

As a different kind of check, Chrome (87.0.4280.88) gives the correct answer according to the spec. In the browser console:

> new URL("\\x", "https://example/foo/bar").href
"https://example/x"

So does Firefox (78.0):

» new URL("\\x", "https://example/foo/bar").href
← "https://example/x"
@PoojaDurgad PoojaDurgad added the url Issues and PRs related to the legacy built-in url module. label Dec 18, 2020
@addaleax addaleax added the confirmed-bug Issues with confirmed bugs. label Dec 19, 2020
@schamberg97
Contributor

This also occurs on 14.15. I will look into it further tonight.

RaisinTen added a commit to RaisinTen/node that referenced this issue Dec 25, 2020
The associated condition mentioned in the URL parsing algorithm of the
WHATWG URL Standard is:
url is special and c is U+005C (\)
So, `special_back_slash` must be updated whenever `special` is updated.

Fixes: nodejs#36559
danielleadams pushed a commit that referenced this issue Jan 12, 2021
The associated condition mentioned in the URL parsing algorithm of the
WHATWG URL Standard is:
url is special and c is U+005C (\)
So, `special_back_slash` must be updated whenever `special` is updated.

Fixes: #36559

PR-URL: #36613
Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: Daijiro Wachi <daijiro.wachi@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
targos pushed a commit that referenced this issue May 1, 2021
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue Jun 15, 2021
…ementation, a=testonly

Automatic update from web-platform-tests
URL: Add some possible bugs seen in implementation

Collected from:

- nodejs/node#36559
- https://bugs.webkit.org/show_bug.cgi?id=226136
- https://crbug.com/1212318

--

wpt-commits: 2cfdb63014d1158fd15eb1f798f6b1610c275271
wpt-pr: 29271
jamienicol pushed a commit to jamienicol/gecko that referenced this issue Jun 23, 2021