Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix sendpay aggregation and ordering in listpays #4567

Merged
merged 25 commits into from
Oct 13, 2021

Conversation

cdecker
Copy link
Member

@cdecker cdecker commented May 31, 2021

This is still work in progress, but I wanted to share the current
state. The main goal is to address #4481 and #4482, i.e., fix the
ordering and address flip-flopping states on supposedly finished
payments. It takes a bit of a detour to get there. The idea is to add
a grouping key groupid to sendpay instances, such that all
sendpay instances that originate from the same pay invocation
share the same groupid. This way listpays can group them into the
correct group when aggregating sendpays for each pay
invocation. However, while implementing this simple approach I noticed
we actually delete prior attempts automagically when retrying them,
which still results in flip-flopping, and is strange if you ask me: if
I called pay 2 times, I should see it 2 times in listpays.

So we now no longer delete prior attempts, but it also means we need
to add groupid to any RPC call that may refer to a specific
sendpay call: waitsendpay, sendonion, sendpay, etc. In
addition we now need to annotate HTLCs we have in the DB with which
(payment_hash, partid, groupid) triple they refer to in order to
correctly route results back to the caller.

I'm still working on some failing tests, but it should just be a
matter of time now :-)

Fixes #4481
Fixes #4482
Fixes #4780

@cdecker cdecker changed the title Fix sendpay aggregation and ordering in listpays Fix sendpay aggregation and ordering in listpays May 31, 2021
@cdecker cdecker marked this pull request as draft May 31, 2021 20:09
@@ -278,6 +278,7 @@ struct htlc_out *new_htlc_out(const tal_t *ctx,
const struct pubkey *blinding,
bool am_origin,
u64 partid,
u64 groupid,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is kinda true, except that we do not support more than one group in flight at a time, so this commit, while not wrong and a nice sanity check, should be unnecessary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I wish we could comment on commits not just code!)

lightningd/pay.c Show resolved Hide resolved
@cdecker cdecker force-pushed the sendpay-groups branch 2 times, most recently from 99bbf73 to 80aa906 Compare September 27, 2021 14:52
@cdecker cdecker added this to the v0.10.2 milestone Sep 29, 2021
When doing things like `waitsendpay` without specifying the `groupid`
we likely want to use the latest `groupid` we created, since that's
the one in flight. This adds a function to quickly retrieve that.
@cdecker cdecker force-pushed the sendpay-groups branch 2 times, most recently from ef656e6 to 83d3551 Compare October 6, 2021 15:48
So far we've always been deferring the deletion, retry and early abort
logic to `sendonion` and `sendpay` which do not have the context to
decide if a call is legitimate or not (they were mostly based on
heuristics). By calling `listsendpays` for the invoice's
`payment_hash` we can identify what our `groupid` should be, but more
importantly we can also abort if another payment is pending or a prior
attempt has already succeeded.
This was the main cause of the pay states flip-flopping, since we
reset the status on each attempt any final status is not really
final. Let's keep them around, and provide a stable history.
One of the fundamental constraints of the payment groups idea is that
there may only ever be one group in flight at any point in time, so if
we find a group that is in flight, any new `sendpay` or `sendonion`
must match its `groupid`.
This re-establishes the prior behavior where a `sendpay` or
`sendonion` that'd match a prior payment would cause the prior payment
to be deleted. While we no longer delete prior attempts we now avoid a
primary key collision by incrementing once. This helps us not having
to touch all existing tests, and likely avoids breaking other users
too.
We're about to suspend duplicate calls to `pay` and this will help us
notify them if the original payment completes.
We want to avoid returning duplicate results when cross-completing, so
mark them as completed when we return a result.
Fixes ElementsProject#4482
Fixes ElementsProject#4481

Changelog-Added: pay: Payment attempts are now grouped by the pay command that initiated them
Changelog-Fixed: pay: `listpays` returns payments orderd by their creation date
Changelog-Fixed: pay: `listpays` no longer groups attempts from multiple attempts to pay an invoice
SQL doesn't really allow `a OR 1` as a clause since `1` is not a
boolean expression. Moving it into `a OR 1=1` however is valid again.
plugins/pay.c Outdated
Comment on lines 1977 to 1982
/* New group, reset what we collected. */
if (last_group != groupid) {
completed = false;
pending = false;
last_group = groupid;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assumes that listsendpays is listed in increasing groupid order? That assumption does not seem documented?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a listsendpays call limited to the payment_hash of the invoice. There cannot be multiple groups in flight concurrently. So ordering by id which we do internally in lightningd/pay.c will result in the desired ordering of groupid. I'll add a comment to that effect 👍

res = wait_payment(cmd->ld, cmd, rhash, *partid);
if (groupid == NULL) {
groupid = tal(cmd, u64);
*groupid = wallet_payment_get_groupid(cmd->ld->wallet, rhash);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would prefer this function to be called "wallet_payment_get_last_groupid" to be clearer? (or max_groupid)?

@@ -1979,6 +1990,30 @@ payment_listsendpays_previous(struct command *cmd, const char *buf,
completed = false;
pending = false;
last_group = groupid;

parts = 1;
json_scan(tmpctx, buf, t,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, is there no group_id 0? Since you initialize last_group to 0, it seems you won't run this for the first group. Since msat is not initialized elsewhere, this means you will use it uninitialized below?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pay plugin Will indeed skip over groupid 0, but pure sendpay calls outside of pay may produce that group too

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could either make it a pointer to make that explicit, but the msats field is only used if we detect that there was a prior success or failure and replicate the result for idempotency, so not having a prior attempt means that we'll not use msats at al

@rustyrussell
Copy link
Contributor

Ack 71752df

@rustyrussell rustyrussell merged commit cd7d87f into ElementsProject:master Oct 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

listpays vs paystatus listpays returning status failed and then status complete listpays is not sorted
2 participants