Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve rel parsing #162

Merged
merged 6 commits into from
Mar 25, 2018
Merged

Improve rel parsing #162

merged 6 commits into from
Mar 25, 2018

Conversation

Zegnat
Copy link
Member

@Zegnat Zegnat commented Mar 22, 2018

  • Parsing of the rel attribute has been improved by using WHATWG’s definition of splitting on whitespace. We also treat it as a set, which means we drop duplicate values. This might spare us a loop or two.

  • The properties (hreflang, media, title, type, text) we add to the rel-urls object should only be added if they weren’t previously set. Overwriting order has been changed to accord with this.

    The test testRelURLsInfoMergesCorrectly is a simple check for the correct behaviour.

  • The final rels array within the rel-urls object has been changed to contain only unique items, sorted alphabetically. This reflects the latest parsing specification change.

    The test testRelURLsRelsUniqueAndSorted is a simple check for both uniqueness and order.

  • The arrays of URLs within the rels object should not contain duplicates. The code has been changed to not add any URLs that are already present in the array. Based on the following line in the parsing spec, that had me reread it thrice as a non-native English speaker:

    • if url is not in the array of the key rel-value in the rels hash then add url to the array

    The test testRelURLsNoDuplicates is a simple check to make sure duplicate URLs aren’t added.

Closes #159 if merged.

@Zegnat
Copy link
Member Author

Zegnat commented Mar 22, 2018

I didn’t touch the alternates parsing because its behaviour isn’t in the parsing specification. I can’t comment on its correctness.

@Zegnat
Copy link
Member Author

Zegnat commented Mar 22, 2018

  • Do not test the availability of text by converting strings to booleans. This creates a falsy test where strings like '0' will be treated as empty. Explicitly test for a string length.

    The test testRelURLsFalsyTextVSEmpty makes sure that an empty string '' and a falsy string '0' are evaluated differently.

This was sparked when looking into how attributes were parsed for microformats/microformats2-parsing#32. I have not tried to otherwise validate how well the parser implements that part of the rel spec and may have to be revisited when the spec issue is resolved.

@gRegorLove
Copy link
Member

👍

@aaronpk aaronpk merged commit 2c6677d into microformats:master Mar 25, 2018
@gRegorLove gRegorLove added this to the 0.4.2 milestone Mar 26, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants