Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: move HTMLRewriter to c++ bindings #4193

Merged
merged 1 commit into from
Aug 17, 2023

Conversation

bru02
Copy link
Contributor

@bru02 bru02 commented Aug 17, 2023

What does this PR do?

  • Documentation or TypeScript types (it's okay to leave the rest blank in this case)
  • Code changes

How did you verify your code works?

Using this real-wordish HTMLRewriter on ~59k records.

Here are the bench results:
Screenshot 2023-08-17 at 18 07 35

  • I ran make codegen to regenerate the C++ and Zig code

@simylein
Copy link
Contributor

This looks awesome 🙂

@Jarred-Sumner
Copy link
Collaborator

Haven’t read code yet but exciting benchmark results, especially considering the faster one is the debug build of bun.

name: "AttributeIterator",
construct: true,
finalize: true,
JSType: "0b11101110",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a JSC function for creating a {next, value} object but we don't have a binding for it right now. This current implementation will be pretty slow due to not re-using the underlying JSC::Structure.

We should add a binding to this function: https://github.com/oven-sh/WebKit/blob/main/Source/JavaScriptCore/runtime/IteratorOperations.cpp#L147


fn contentHandler(this: *DocEnd, comptime Callback: (fn (*LOLHTML.DocEnd, []const u8, bool) LOLHTML.Error!void), thisObject: js.JSObjectRef, globalObject: *JSGlobalObject, content: ZigString, contentOptions: ?ContentOptions) JSValue {
fn contentHandler(this: *DocEnd, comptime Callback: (fn (*LOLHTML.DocEnd, []const u8, bool) LOLHTML.Error!void), callFrame: *JSC.CallFrame, globalObject: *JSGlobalObject, content: ZigString, contentOptions: ?ContentOptions) JSValue {
if (this.doc_end == null)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one thing to be aware of is that callFrame is a pointer to a stack frame and things can get moved around as the stack frame changes a little

So we should prefer to reference all variables from the call frame at the beginning of the callframe, and not pass it around to non-inline functions

In this case, it's super cheap to get the this value. It's a pointer lookup without other side effects

@Jarred-Sumner
Copy link
Collaborator

This is a clear improvement over the current state, removes longstanding technical debt and improves performance. There are a couple small things to address but those can be addressed in a follow-up PR.

Thank you. Merged.

@Jarred-Sumner Jarred-Sumner merged commit b0e76a9 into oven-sh:main Aug 17, 2023
15 of 20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants