Skip to content
This repository has been archived by the owner on Nov 29, 2023. It is now read-only.

Optimize expensive XPath expressions in HTML transform #165

Merged

Conversation

eyelidlessness
Copy link
Contributor

After opening #164, it irked me that I knew of a particularly impactful optimization candidate and hadn't addressed it. So this addresses it, as well as several other candidates which could similarly improve performance on forms which transform slowly. The results astounded me, so much so that we've decided to take a different approach to #164. But first, about this change.

Substantive change

The improvement comes from using xsl:key elements to reference the expressions identified as expensive, and to look them up with the corresponding key XPath function. My strong assumption is that the reason this has such a big impact is that the non-key references are rerun on each visit to their respective xsl:template or upon each reference to a top-level xsl:variable, whereas key presumably pre-caches the expression's nodeset and only filters based on the use argument.

Here's what that looks like:

Before:

Name Result
Slowest va_who_v1_5_3.xml 6444.79 ms (0.16 ops/s ±2.37%, 5 runs)
Fastest form-id-with-accent.xml 9.70 ms (103 ops/s ±3.47%, 7 runs)
Average overall 67.41ms
Average without outliers 12.93ms
Total runtime 628.08s

After.

Name Result
Slowest va_who_v1_5_3.xml 543.47 ms (1.84 ops/s ±2.33%, 7 runs)
Fastest arbitrary-html.xml 9.31 ms (107 ops/s ±1.29%, 7 runs)
Average overall 18.59ms
Average without outliers 12.35ms
Total runtime 601.28s

I've seen similar or better improvement in my local dev environment—as with #164, the baseline performance is substantially worse but the change performs roughly the same as CI.

The rest of the change

The change also incorporates:

What's next

These changes have a similar impact on the browser implementation in #164, but they largely eliminate any benefit of running a headless browser for the Node implementation. Besides that, the headless browser implementation adds quite a bit of unnecessary complexity.

We don't want to give up on the other present and potential future benefits discussed in that PR. So after some discussion with @lognaturel, we concluded that it's probably best instead to rework the changes in #164. That will entail:

  1. A compatibility interface for the non-XSL portions of the transform, handling both DOM and libxmljs operations.
  2. The same XSL to TypeScript changes, adapted to use that compatibility interface.
  3. Adapt all tests and benchmarks to run in all target environments. (We should hopefully be able to run these concurrently in CI.)
  4. Restrict headless browser usage to tests. (I expect that this will be the most sensible way to achieve the multi-environment test goal.)

This commit has been cherry picked from its original appearance in enketo#164, with the following changes:

- Include all enketo-core fixtures in preparation for adding benchmarks to establish a performance baseline.
- Temporarily increase the timeout for snapshot tests so that the longer running enketo-core fixtures can generate initial snapshots.
- Restore test shuffling (turning that off was a mistake in the original TypeScript PR).
- Switch the default test reporter to `dot` to improve visibility of progress during test runs (it's still not great, mainly due to performance).
- Move `linkedom` stuff to `/test/shared.ts` as it will only be used in tests.
- Accept updates to `package-lock.json` with SHA-512 integrity hashes.
- Excludes formatting of XML/XSL.

About snapshot serialization:

Snapshots use a custom serializer designed to identify only meaningful differences. This means that a snapshot matches if:

- the value is exactly equal
- the value is equal, apart from insignificant changes to whitespace and/or attribute order

The serialization logic:

1. Manually ensures consistent attribute order (using `linkedom` as a DOM compatibility library, primarily because its types are much better than those for `@xmldom/xmldom`).
2. Uses `prettier` to normalize whitespace.
3. Uses `pretty-format` (`vitest`'s default) for final serialization.

Apart from adding snapshots, tests have been updated to reference `linkedom`'s types (corrected where necessary) and to its behavior (which is semantically equivalent to the previous `@xmldom/xmldom` usage, but more consistent with a real DOM environment).
This is almost entirely different from the original in enketo#164. Instead it:

- uses benchmark.js rather than implementing our own
- mostly defers to benchmark.js stats logic for reporting
- skips `@actions/core` for generating the GitHub Actions summary, it wasn't really doing a lot for us
test/transformer.spec.ts Outdated Show resolved Hide resolved
// `data-form-id`. The XSL seems to suggest that when an explicit `id`
// is not present, the value of `xmlns` will be copied instead. That
// behavior would be quite odd! Nonetheless, these tests ensure that the
// behavior doeesn't change.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I know, this exists to try to make Enketo usable by CommCare. CommCare and ODK share a lot of history but quickly deviated from each other. CommCare uses xmlns in some unexpected ways including as a way to uniquely identify a form definition.

As far as I know this is not used in Enketo.

Copy link
Member

@MartijnR MartijnR Feb 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My best guess is that this was found in one of the forms originally hosted on ODK's aggregate.appspot.com (i.e. a form without an id but with a namespace. Back then I was mostly reverse engineering things. Similarly, that's the only place I found forms with relative nodeset (and maybe ref) attributes, I believe (eIMCI form or something), which also added a lot of slowdown and code though that was later commented out.

Great to remove this!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

test/transformer.spec.ts Outdated Show resolved Hide resolved
This should have little impact on forms which already transform quickly, but it dramatically improves the time to transform forms which are particularly slow. In my testing, this is especially the case for forms with a large number of translations/itext values.

The improvement comes from using `xsl:key` elements to reference the expressions identified as expensive, and to look them up with the corresponding `key` XPath function. My strong assumption is that the reason this has such a big impact is that the non-key references are rerun on each visit to their respective `xsl:template` or upon each reference to a top-level `xsl:variable`, whereas `key` presumably pre-caches the expression's nodeset and only filters based on the `use` argument.
@eyelidlessness eyelidlessness force-pushed the performance/expensive-xpath-exprs branch from 8c7290d to ba6f59a Compare February 7, 2023 16:54
@@ -1220,7 +1256,7 @@ XSLT Stylesheet that transforms OpenRosa style (X)Forms into valid HTMl5 forms
<xsl:with-param name="nodeset_u" select="$nodeset_used"/>
</xsl:call-template>
</xsl:variable>
<xsl:variable name="binding" select="/h:html/h:head/xf:model/xf:bind[@nodeset=$nodeset_used] | /h:html/h:head/xf:model/xf:bind[@nodeset=$nodeset]" />
<xsl:variable name="binding" select="key('nodeset-bindings',$nodeset_used) | key('nodeset-bindings', $nodeset)" />
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I have no reason to doubt the safety of using key here, because I've verified other usage of the nodeset-bindings key, I want to call out that could not find an existing fixture which produces a different value for $nodeset_used and $nodeset here. In other words, all tests pass when using only one or the other.

I was able produce a difference with a contrived change to an existing fixture, but it isn't clear to me what non-contrived scenario this is supposed to cover.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eyelidlessness: the contrived change was related to relative vs absolute paths

use="local-name()" />
<xsl:key
name="fields-by-ref"
match="/h:html/h:body//xf:input | /h:html/h:body//xf:upload | /h:html/h:body//xf:select | /h:html/h:body//xf:select1"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This matches existing usage. But probably it intends to match all form control elements? https://getodk.github.io/xforms-spec/#body-elements

@@ -429,16 +478,12 @@ XSLT Stylesheet that transforms OpenRosa style (X)Forms into valid HTMl5 forms

<!-- note that bindings are not required -->
<!--<xsl:variable name="binding" select="/h:html/h:head/xf:model/xf:bind[@nodeset=$nodeset_used] | /h:html/h:head/xf:model/xf:bind[@nodeset=$nodeset]" />-->
<xsl:variable name="binding" select="/h:html/h:head/xf:model/xf:bind[@nodeset=$nodeset]" />
<xsl:variable name="binding" select="key('nodeset-bindings', $nodeset)" />

<!-- If this is a bind element that also has an input, do nothing as it will be dealt with by the corresponding xf:input -->
<!-- Note that this test is not fully spec-compliant. It will work with XLS-form produced forms that have no relative nodes
Copy link
Contributor

@lognaturel lognaturel Feb 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've read this comment about a hundred times and can't figure out what "forms that have no relative nodes" means. This does look spec-compliant to me. This is not critical for this PR but wanted to mention it since I got tripped up by it.

use="@calculate != ''" />
<xsl:key name="instances" match="/h:html/h:head/xf:model/xf:instance" use="true()" />
<xsl:key
name="primary-instance-root"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised this one makes a difference. Is it requested over and over again? I haven't looked at the context carefully.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eyelidlessness: targeted expressions that are deep and repeatedly used.

Copy link
Contributor

@lognaturel lognaturel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks big but it's mostly test additions. We've reviewed those as part of #164.

The significant change with key is straightforward to verify.

…json version

Otherwise snapshots will not match as soon as we prepare release
This has been validated locally in enketo-core and enketo-transformer using `npm link`.
@eyelidlessness eyelidlessness merged commit 1bd1ea1 into enketo:master Feb 7, 2023
@MartijnR
Copy link
Member

MartijnR commented Feb 8, 2023

Wow, awesome!! 🥇

@lognaturel
Copy link
Contributor

@MartijnR eager to hear whether you/folks you talk to experience these perf benefits and how they manifest. We're always interested in more benchmark forms if you come across them.

Did I get #165 (comment) right? This has been removed because it didn't work.

@MartijnR
Copy link
Member

MartijnR commented Feb 8, 2023

I will let OpenClinica know about this major change. It looks like it may reduce Enketo's server requirements (to nothing!!).

I am also waiting to see transformation timing results on http://performance.enketo.org/ (User Graphs -> Graphite), but I am not sure if this app still updates successfully (probably doesn't). I no longer seem to have ssh access. Have you folks worked on this server perhaps? If not, I'll force access via the Digital Ocean droplet settings.

Will check the comment and respond there!

@lognaturel
Copy link
Contributor

You gave me SSH access last July with email instructions. I see at least two public keys there that belong to you.

We intended to do something with it but have not had the time. I can see that it has pulled the latest Core. I hadn't seen the transformer graph before, will be interesting to see if those forms show a notable improvement.

@MartijnR
Copy link
Member

MartijnR commented Feb 8, 2023

I see at least two public keys there that belong to you.

Ah thanks, that will help figure out what is going on.

will be interesting to see if those forms show a notable improvement

Yes, especially the 250,000 ms transformation!

@lognaturel
Copy link
Contributor

Looks like it's failing because the node version is too old. Trying to see if I can quickly fix.

@lognaturel
Copy link
Contributor

I unfortunately have to leave things in a bad state with dependency update failures. All this really leads me to wonder what the benchmarks were being run against previously. Will try to come back to it tonight unless someone else can take a look before.

@MartijnR
Copy link
Member

MartijnR commented Feb 8, 2023

Thank you! I will have a look now as well. I dusted off my old laptop and got access (even though I'm using the same public key on that machine).

@MartijnR
Copy link
Member

MartijnR commented Feb 8, 2023

I think it may be churning away again (but will indeed have run an old version for quite a while before today).

Fwiw:
I installed Node 14 (even though that shouldn't have been necessary). Removing node 16 was a little challenging (there seemed to be 2 versions running) so I ended up deleting the folder with the binary after which I could install 14 from nodesource.

@lognaturel
Copy link
Contributor

lognaturel commented Feb 8, 2023

Great, it does look like it's running. And yes, looks like it was running a really old version previously so we can't really meaningfully compare this latest set of improvements.

The worst transform time I'm seeing is 7647 for bench 6. Does that look right to you? Did it really take 250kms at some point? I can't really tell which form would have taken that long. bench 1 or ukraine, maybe? Those now take 3330 and 671, respectively.

@lognaturel
Copy link
Contributor

Have checked out core 6.1.7 and am doing a run on that. The graphs are messed up anyway so I don't think it matters that we send data from different versions.

78,876ms for bench 11 so yeah, things have changed a lot.

@lognaturel
Copy link
Contributor

lognaturel commented Feb 8, 2023

Here's a before and after of transformation time in seconds.
Screenshot 2023-02-16 at 14 58 20

Here's a look from the perf server. Until 12pm, the slowest form takes 250k ms (~4mins) to transform. Then we fiddled with the server for a bit to get the new code running. After 3 hours, it comes back with the slowest transform taking 7s.

Screen Shot 2023-02-08 at 3 19 46 PM

@lognaturel
Copy link
Contributor

lognaturel commented Feb 8, 2023

From perf server: Ubuntu 14.04.6 LTS, 2GB RAM, 1 ~2.2Ghz vCPU

Form Core 6.1.7 Core 7.0.0 Percent change
bench12 250539 4949 -98%
bench6 164944 7598 -95%
burundi 125001 2711 -98%
bench1 98564 3278 -97%
uganda 79848 2585 -97%
bench11 78876 2999 -96%
shop 19745 984 -95%
bench9 19045 1414 -93%
drc 18991 1066 -94%
haiti 18930 972 -95%
bench8 18243 847 -95%
car 9405 804 -91%
ukraine 8988 785 -91%
turkey 7432 733 -90%
sdiprofile 3845 515 -87%
iraq 3241 614 -81%
bench4 2837 501 -82%
bench10 2710 210 -92%
bench2 2683 525 -80%
bench5 2061 362 -82%
bench3 728 180 -75%
widgetsdev 186 145 -22%
widgets 181 140 -23%
bench7 137 110 -20%
lagtest50 50 44 -12%

Screen Shot 2023-02-08 at 9 01 29 PM

@MartijnR
Copy link
Member

MartijnR commented Feb 9, 2023

GLORIOUS!!

98% improvement on the slowest form!

Here is another result (ignore period between 14:00 and 20:00):

Screenshot 2023-02-09 at 8 52 53 AM

I think I should downgrade to the cheapest droplet for this performance monitoring tool (maybe in a few days) as it looks like Enketo will no longer have any problem with that. That's a really great development for all those running big servers. (cc @jnm, @pbowen-oc )

@lognaturel
Copy link
Contributor

Thanks for transferring the perf server, @MartijnR! https://github.com/enketo/enketo-core-performance-monitor has been updated to include transformer benchmarks and use the new URL.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants