JSON optimizations, cleanups and listsendpays optimizations #3957

Merged
rustyrussell merged 10 commits into ElementsProject:master on Aug 21, 2020

Conversation

rustyrussell (Contributor)

@fiatjaf reported in #3941 that listpays was slow, and indeed, it was.

I created 50,000 payments, and ran some tests.

  • With no optimization:
    • listsendpays takes 0.983 seconds
    • listpays takes 52.415 seconds
  • With -O3 -flto:
    • listsendpays takes 0.628 seconds
    • listpays takes 43.104 seconds

After these optimizations (mostly FIXMEs!) the results are:

  • With no optimization:
    • listsendpays takes 0.676 seconds
    • listpays takes 1.545 seconds.
  • With -O3 -flto:
    • listsendpays takes 0.416 seconds
    • listpays takes 0.971 seconds.

Tested on a test node which had made 50,000 payments, with no optimization.

For comparison, time for 'listsendpays' was 0.983s.

time lightning-cli -R --network=regtest --lightning-dir /tmp/ltests-k8jhvtty/test_pay_stress_1/lightning-1/ listpays > /dev/null

Before:
	real	0m52.415s
	user	0m0.127s
	sys	0m0.044s

After:
	real	0m42.741s
	user	0m0.149s
	sys	0m0.016s

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: libplugin: significant speedups for reading large JSON replies (e.g. calling listsendpays on large nodes, or listchannels / listnodes).

We're going to change the API on the more complete JSON parser, so
make and use a simple API for the easy cases.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
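
For readers following along, here is a rough, hedged sketch of what a one-shot "easy case" helper over jsmn could look like. The helper name (parse_simple), the allocation strategy and the header path are assumptions for illustration; this is not the PR's actual json_parse_simple implementation.

#include <stdlib.h>

#include "jsmn.h"	/* assumed path to the vendored jsmn header */

/* Hypothetical one-shot helper: parse a complete JSON string and return a
 * malloc'd token array (caller frees), or NULL if the input is not valid,
 * complete JSON.  No resumable parser state is exposed: that is what makes
 * it the "easy case" API. */
static jsmntok_t *parse_simple(const char *str, size_t len, int *num_toks)
{
	jsmn_parser parser;
	unsigned int max = 64;
	jsmntok_t *toks = malloc(max * sizeof(jsmntok_t));

	while (toks) {
		int ret;

		/* Re-init and reparse from scratch on each attempt: simple,
		 * and correct for the one-shot case. */
		jsmn_init(&parser);
		ret = jsmn_parse(&parser, str, len, toks, max);
		if (ret >= 0) {
			*num_toks = ret;
			return toks;
		}
		if (ret != JSMN_ERROR_NOMEM)
			break;	/* invalid or incomplete JSON */

		/* Ran out of tokens: double the array and try again. */
		max *= 2;
		free(toks);
		toks = malloc(max * sizeof(jsmntok_t));
	}
	free(toks);
	return NULL;
}

A caller that knows it has a complete reply in hand can then walk the returned tokens without threading any parser state around.
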
They should all show the complete JSON, so unify them.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

@ZmnSCPxj (Collaborator) left a comment

ACK 94058b4

Minor quibble though.

plugins/libplugin.c (review comment, outdated; resolved)
@rustyrussell (Contributor, Author)

Fixed core dump in test-json, which I hadn't fixed up to the new API (at one stage I used NULL args to json_parse_input instead of implementing a separate json_parse_simple).

@darosior (Collaborator) left a comment

ACK 7471a83

Impressive!

@rustyrussell (Contributor, Author)

... and the other test where I made the same damn mistake...

-	toks = json_parse_input(str, str, strlen(str), &valid);
+	toks = toks_alloc(str);
+	jsmn_init(&parser);
+	valid = json_parse_input(&parser, &toks, str, strlen(str), NULL);

Collaborator:

NULL will be dereferenced, maybe json_parse_input_simple?

Collaborator:

Same below

Collaborator:

(did not submit the review)...

rustyrussell (Contributor, Author):

Yeah, originally, instead of the json_parse_simple() API, I allowed NULL args for parser and complete. But I reverted that and, of course, missed this :(

Fixed in the obvious way now...

The jsmn parser is a beautiful piece of code.  In particular, you can parse
part of a string, then continue where you left off.

We don't take advantage of this, however, which means that for large JSON
objects we parse them multiple times before finally having enough to
complete.

Expose the parser state and tokens through the API, so the caller can pass
them in repeatedly.  For the moment, every caller allocates them each time
(except the unit tests).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
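
As a hedged illustration of the resumable pattern described above (the struct and function names below are invented for this sketch; only the jsmn calls themselves are real), the key point is that a single jsmn_parser survives across reads, so each jsmn_parse() call continues where the previous one stopped instead of rescanning the whole buffer:

#include <stddef.h>

#include "jsmn.h"	/* assumed path to the vendored jsmn header */

/* Keep parser and tokens alive across partial reads. */
struct json_in_progress {
	jsmn_parser parser;
	jsmntok_t toks[4096];	/* fixed size to keep the sketch short */
};

static void jip_init(struct json_in_progress *jip)
{
	jsmn_init(&jip->parser);
}

/* Call with the whole buffer received so far (old bytes plus the newly
 * read tail).  jsmn remembers its position internally, so only the new
 * bytes get tokenized.  Returns the total token count once the object is
 * complete, JSMN_ERROR_PART if more data is needed, or another negative
 * jsmn error. */
static int jip_parse(struct json_in_progress *jip, const char *buf, size_t len)
{
	return jsmn_parse(&jip->parser, buf, len, jip->toks,
			  sizeof(jip->toks) / sizeof(jip->toks[0]));
}

Without this, every time more bytes arrive the accumulated reply is reparsed from the start, which is what makes large replies quadratic.
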
time lightning-cli -R --network=regtest --lightning-dir /tmp/ltests-k8jhvtty/test_pay_stress_1/lightning-1/ listpays > /dev/null

Before:
	real	0m42.741s
	user	0m0.149s
	sys	0m0.016s

After:
	real	0m13.674s
	user	0m0.131s
	sys	0m0.024s

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: JSON-RPC: significant speedups for plugins which create large JSON replies (e.g. listpays on large nodes).

This doesn't make any difference, since lightningd generally sends us
short commands (command responses are via the rpc loop, which is
already done), but it's harmless.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

memmem is also O(n^2), though it's faster.  Now that we have the
infrastructure, let's do incremental parsing.

time lightning-cli -R --network=regtest --lightning-dir /tmp/ltests-k8jhvtty/test_pay_stress_1/lightning-1/ listpays > /dev/null

Before:
	real	0m13.674s
	user	0m0.131s
	sys	0m0.024s

After:
	real	0m12.447s
	user	0m0.143s
	sys	0m0.008s

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

time lightning-cli -R --network=regtest --lightning-dir /tmp/ltests-k8jhvtty/test_pay_stress_1/lightning-1/ listpays > /dev/null

Before:
	real	0m12.447s
	user	0m0.143s
	sys	0m0.008s

After:
	real	0m2.054s
	user	0m0.114s
	sys	0m0.024s

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

We have sanity checks in there that it's a valid point.  Simply store
the JSON token, as we do with the others.

time lightning-cli -R --network=regtest --lightning-dir /tmp/ltests-k8jhvtty/test_pay_stress_1/lightning-1/ listpays > /dev/null

Before:
	real	0m2.054s
	user	0m0.114s
	sys	0m0.024s

After:
	real	0m1.781s
	user	0m0.127s
	sys	0m0.013s

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
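
A rough sketch of the "just keep the token" idea (the helper below is hypothetical and works on raw jsmn tokens rather than the project's own JSON helpers): rather than decoding the stored hex pubkey into an EC point, checking it, and re-encoding it for the reply, copy the already-validated token text straight through.

#include <stdio.h>

#include "jsmn.h"	/* assumed path to the vendored jsmn header */

/* Hypothetical helper: emit a string field by copying the raw token text.
 * The expensive alternative is to decode the hex into a pubkey (which
 * verifies it is a valid curve point) and then hex-encode it again, even
 * though the value was already checked when it was first stored. */
static void emit_raw_token(FILE *out, const char *fieldname,
			   const char *buf, const jsmntok_t *tok)
{
	fprintf(out, "\"%s\": \"%.*s\"", fieldname,
		tok->end - tok->start, buf + tok->start);
}
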
We've never hit this (we do check them on insert), and it's slowing
down some operations unnecessarily.

$ time lightning-cli -R --network=regtest --lightning-dir /tmp/ltests-k8jhvtty/test_pay_stress_1/lightning-1/ listpays > /dev/null

Before:
	real	0m1.781s
	user	0m0.127s
	sys	0m0.013s

After:
	real	0m1.545s
	user	0m0.124s
	sys	0m0.024s

Also, the raw listsendpays drops from 0.983s to 0.676s.

(With -O3 -flto, listsendpays is 0.416s, listpays 0.971s).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

rustyrussell merged commit f762f7e into ElementsProject:master on Aug 21, 2020

cdecker added a commit that referenced this pull request Sep 2, 2020
PR #3957 improved performance considerably, however we still look over the
entire message for the message separator. If instead we just look in the
incrementally read data, we remove the quadratic behavior for large messages.

This is safe since we then loop over the messages which would drain any
message separator from the buffer before we attempt the next read.

Changelog-Fixed: bcli: Significant speedups for block synchronization
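
To illustrate the follow-up's approach (the "\n\n" separator and all names here are assumptions drawn from this commit message, not the actual bcli/libplugin code): after each read, only the freshly appended bytes need to be searched for the message separator, plus a one-byte overlap in case the separator straddles a read boundary.

#define _GNU_SOURCE	/* for memmem() on glibc */
#include <string.h>

/* buf holds `len` bytes in total, of which the last `new_len` were just
 * read.  Search only the new tail (plus one byte of overlap) for the
 * "\n\n" message separator, instead of rescanning the whole buffer on
 * every read. */
static const char *find_msg_separator(const char *buf, size_t len,
				      size_t new_len)
{
	size_t start = len - new_len;

	if (start > 0)
		start--;	/* separator may span the read boundary */

	return memmem(buf + start, len - start, "\n\n", 2);
}
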