Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new tests for syntax and ill-formed parser errors and fix... emm... errors #684

Merged
merged 12 commits into from
Nov 22, 2023
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
26 changes: 12 additions & 14 deletions src/reader/state.rs
Original file line number Diff line number Diff line change
Expand Up @@ -114,21 +114,19 @@ impl ReaderState {
)))
}
BangType::DocType if uncased_starts_with(buf, b"!DOCTYPE") => {
let start = buf[8..]
.iter()
.position(|b| !is_whitespace(*b))
.unwrap_or(len - 8);
if start + 8 >= len {
// Because we here, we at least read `<!DOCTYPE>` and offset after `>`.
// We want report error at place where name is expected - this is just
// before `>`
self.offset -= 1;
return Err(Error::IllFormed(IllFormedError::MissedDoctypeName));
match buf[8..].iter().position(|&b| !is_whitespace(b)) {
Some(start) => Ok(Event::DocType(BytesText::wrap(
&buf[8 + start..],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be kind of nice if we could document these constants here and above. With a bit of thought most of them are not so difficult to figure out, but they're not obvious at first clance.

e.g.

            BangType::DocType if uncased_starts_with(buf, b"!DOCTYPE") => {
                let doctype_tag_len = 8;
                match buf[doctype_tag_len..].iter().position(|&b| !is_whitespace(b)) {
                    Some(start) => Ok(Event::DocType(BytesText::wrap(
                        &buf[doctype_tag_len + start..],

Feel free to tweak the details (such as const vs let, doctype_tag_len vs DOCTYPE_TAG.len())

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lines 87 - 113 could use the same treatment.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that code already slightly rewritten (I'll push draft PR soon) and does not contain magic numbers (or they documented)

self.decoder(),
))),
None => {
// Because we here, we at least read `<!DOCTYPE>` and offset after `>`.
// We want report error at place where name is expected - this is just
// before `>`
self.offset -= 1;
return Err(Error::IllFormed(IllFormedError::MissedDoctypeName));
}
}
Ok(Event::DocType(BytesText::wrap(
&buf[8 + start..],
self.decoder(),
)))
}
_ => {
// <!....
Expand Down
Loading