Floki.find doesn't support escaped colons in class names #411

Closed
fakenickels opened this issue Jul 14, 2022 · 3 comments · Fixed by #458

fakenickels commented Jul 14, 2022

Description

Floki doesn't support selectors that contain escaped colons; it is probably mistaking them for pseudo-class selectors.

To Reproduce

Steps to reproduce the behavior:

  • Using Floki v0.33.1
  • Using Elixir v1.13.3
  • Using Erlang OTP v24
  • With this code:
 > Floki.find(document, "a.xs\\:red-500")
[] 

Expected behavior

 > Floki.find(document, "a.xs\\:red-500")
[{"a", [{"class", "xs:red-500"}], ["I'm a link with a funny selector"]}]

The current workaround is to use a[class="xs:red-500"]
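
For readers who want to try the workaround, here is a minimal sketch. It assumes Floki can be fetched with Mix.install/1, and the HTML snippet is a hypothetical stand-in, since the original `document` is not shown in the report. Note that `[class="..."]` compares against the whole attribute value, so it stops matching once the element carries additional classes; the `~=` form, which matches one whitespace-separated word, is likely the safer variant in that case.

```elixir
# Assumption: run as a standalone script; Mix.install/1 fetches Floki.
Mix.install([{:floki, "~> 0.33"}])

# Hypothetical document, not the one from the original report.
html = ~s(<div><a class="xs:red-500" href="#">I'm a link with a funny selector</a></div>)
{:ok, document} = Floki.parse_document(html)

# Exact match on the full value of the class attribute.
exact = Floki.find(document, ~s(a[class="xs:red-500"]))

# Word match: succeeds if any whitespace-separated class equals the value.
word = Floki.find(document, ~s(a[class~="xs:red-500"]))

IO.inspect(exact, label: "exact")
IO.inspect(word, label: "word")
```

Both calls should return the single `a` element for this document; they only diverge when the class attribute holds more than one class.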

fakenickels changed the title from "Floki.find doesn't support escaped semicolons in class names" to "Floki.find doesn't support escaped colons in class names" on Jul 18, 2022
fakenickels (Author) commented

I tried adding an escaped colon to this line in a way that doesn't conflict with pseudo-selectors, but with no success so far: https://github.com/philss/floki/blob/master/src/floki_selector_lexer.xrl#L3

philss (Owner) commented Jul 22, 2022

Hi @fakenickels 👋

I don't know yet how to solve this at the lexer/parser level, but there is a way that works today: you can create a Floki.Selector by hand.

html = """
<!doctype html>
<html><head><title>foo</title></head>
<body>
<h1>Hello world</h1>
<div class="container">
<a class="xs:red-500" href="https://example.com">My link</a>
</div>
</body>
</html>
"""

doc = Floki.parse_document!(html)

selector = %Floki.Selector{type: "a", classes: ["xs:red-500"]}

Floki.find(doc, selector)
#=> [{"a", [{"class", "xs:red-500"}, {"href", "https://example.com"}], ["My link"]}]

This is not ideal, but it at least solves the problem for now. I will try to look into this next week.

philss (Owner) commented May 29, 2023

Hey @fakenickels, would you mind testing the change introduced in #458?

philss added a commit that referenced this issue Jun 2, 2023
This is related to the following:

#458
#411

I decided to push the "cleaning" to the lexer, but I think
for more complex escaping rules, we may need to push back to
Elixir.
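
To illustrate what the "cleaning" mentioned in the commit message might amount to, here is a rough sketch in plain Elixir. It is not Floki's actual lexer code: the module and function names are hypothetical, and the assumption is that cleaning means dropping the backslash from simple single-character escapes so that the class token can be compared with the class attribute in the document. Hex escapes such as `\3A ` are deliberately left out, which is the kind of more complex rule the commit message suggests may need handling on the Elixir side.

```elixir
defmodule EscapeSketch do
  @moduledoc false

  # Hypothetical helper: removes the backslash from simple `\X` escapes.
  # The Elixir string "xs\\:red-500" (backslash followed by a colon)
  # becomes "xs:red-500". Hex escapes are not handled.
  def unescape_class(token) when is_binary(token) do
    Regex.replace(~r/\\(.)/, token, fn _whole, char -> char end)
  end
end

IO.inspect(EscapeSketch.unescape_class("xs\\:red-500"))
```

With a step like this applied to the class token, the selector `a.xs\\:red-500` would end up comparing against the plain class name `xs:red-500`.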