Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optional "failure_onion" in reply to htlc_accepted hook. #4187

Merged
merged 4 commits into from
Nov 11, 2020

Conversation

fiatjaf
Copy link
Contributor

@fiatjaf fiatjaf commented Nov 7, 2020

I need this feature to be able to use "shadow" last hops with fake nodes on @lntxbot and Etleneum and return proper incorrect_or_unknown_payment_details for the pre-pay probes with the wrong payment_hash that Zap, Strike and BoS wallets send. These wallets will not attempt the actual payment if the pre-pay probe fail with any other error -- and it must come from the final destination, which means that in my case I must parse onion.next_onion and extract the ephemeral key, compute the shared secret then encode the onion and return it in the special field "failure_onion" this PR introduces.

The above paragraph is just for the context.

In reality a feature like this is also necessary if we are to write plugins that interface between two different Lightning Networks or similar networks, since a bridge node must be able to forward a failure onion from one network to the other and this PR (hopefully, maybe) enables that kind of thing.

Disclaimer: I have no idea of what I'm doing, but it works and is live on @lntxbot. So maybe someone who knows what they're doing can think of a better way to implement this.

@fiatjaf
Copy link
Contributor Author

fiatjaf commented Nov 7, 2020

I don't know how to write a test for this in Python. I already spent 4 days trying to make my Go code work and I made some heavy usage of lnd's libraries. Here's the part I parse the onion and build the error reply, if anyone is interested (or for the posterity): https://github.com/fiatjaf/lntxbot/blob/bdfd13257334ec21411280ad06c9847f7ca4b0bf/htlc_accepted.go#L79-L137

fiatjaf pushed a commit to fiatjaf/etleneum that referenced this pull request Nov 7, 2020
@jsarenik
Copy link
Collaborator

jsarenik commented Nov 8, 2020

Wow. This is interesting! And a nice small patch. @m-schmoock have a look please.

@rustyrussell
Copy link
Contributor

This is great!! All it needs is:

  1. doc/PLUGINS.md so people know it exists.
  2. Changelog-Added: htlc_accepted hook can now return custom failure_onion.

@rustyrussell rustyrussell added this to the v0.9.2 milestone Nov 9, 2020
@fiatjaf fiatjaf force-pushed the onionreply branch 4 times, most recently from 2b38eb7 to bc81f32 Compare November 9, 2020 23:58
Copy link
Collaborator

@m-schmoock m-schmoock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!
Just some nits and I would like to have a pytest blackbox test for it in e.g. tests/test_plugin.py where we do all of the plugin API checks.
If you need assistance ping me.

lightningd/peer_htlcs.c Outdated Show resolved Hide resolved
doc/PLUGINS.md Outdated Show resolved Hide resolved
@fiatjaf
Copy link
Contributor Author

fiatjaf commented Nov 10, 2020

Tried to add a test and discovered that:

  1. Returning an invalid onion causes lightningd to crash, which is more-or-less expected, but I don't know how to account for this in the tests.
  2. There is a memory leak on lightningd upon received the failure_onion:
E           ValueError:
E           Node errors:
E           Global errors:
E            - Node /tmp/ltests-lpfkw05m/test_htlc_accepted_hook_failonion_1/lightning-2/
 has memory leaks: [
E               {
E                   "backtrace": [
E                       "ccan/ccan/tal/tal.c:442 (tal_alloc_)",
E                       "ccan/ccan/tal/tal.c:471 (tal_alloc_arr_)",
E                       "common/json.c:244 (json_tok_bin_from_hex)",
E                       "lightningd/peer_htlcs.c:946 (htlc_accepted_hook_deserialize)",
E                       "lightningd/plugin_hook.c:184 (plugin_hook_callback)",
E                       "lightningd/plugin.c:367 (plugin_response_handle)",
E                       "lightningd/plugin.c:473 (plugin_read_json_one)",
E                       "lightningd/plugin.c:518 (plugin_read_json)",
E                       "ccan/ccan/io/io.c:59 (next_plan)",
E                       "ccan/ccan/io/io.c:407 (do_plan)",
E                       "ccan/ccan/io/io.c:417 (io_ready)",
E                       "ccan/ccan/io/poll.c:445 (io_loop)",
E                       "lightningd/io_loop_with_timers.c:24 (io_loop_with_timers)",
E                       "lightningd/lightningd.c:1015 (main)"
E                   ],
E                   "label": "common/json.c:244:u8[]",
E                   "parents": [],
E                   "value": "0x5594791ec358"
E               }
E           ]

I therefore give up.

Copy link
Member

@cdecker cdecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice, I had missed that failure_msg is not serialized and lacking the HMAC when injecting a previous onion, so this is indeed needed.

Just some minor cleanups.

doc/PLUGINS.md Outdated Show resolved Hide resolved
Comment on lines 943 to 953
} else if ((failoniontok = json_get_member(buffer, toks,
"failure_onion"))) {
failonion = json_tok_bin_from_hex(NULL, buffer, failoniontok);
if (!failonion)
fatal("Bad failure_onion for htlc_accepted"
" hook: %.*s",
failoniontok->end - failoniontok->start,
buffer + failoniontok->start);
fail_in_htlc(hin, take(new_onionreply(tmpctx, failonion)));
return false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only difference between failure_onion and failure_msg is that the latter gets serialized with a prefix and HMAC'd. Would it make sense to pull local_fail_in_htlc into this function (basically just the create_onionreply call) and handle both cases the same way? That'd mean that we take an eventual failure_msg, serialize it using the create_onionreply and then store it in failure_onion.

Also I think that failure_onion needs to take precedence over failure_msg, since more work went into building that one. Ideally we'd panic if both are set, but since this is a hook call, our options are limited, so logging an error is likely the best we can do here.

Copy link
Contributor Author

@fiatjaf fiatjaf Nov 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about this? https://github.com/fiatjaf/lightning/blob/9af91a8493be3eb3b41d422973f2faf6209a02d2/lightningd/peer_htlcs.c#L930-L988

It's not cool, but I think it is clearer than the way I did it first. And once the deprecated branch is removed it will be much better.

doc/PLUGINS.md Outdated
```json
{
"result": "fail",
"failure_onion": "[292bytes of a serialized error packet]"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error onion has a variable size, so it'd be better not allude to a fixed size here by putting the 292bytes here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it? I was pretty sure it had a fixed size.

This proves I shouldn't be writing these things.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are for sure not more or less talented than I am :)
It's about learning not knowing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why has it a variable size? The spec says it is 260 bytes plus an HMAC of 32 bytes, then it is XORed against a cipherstream of the same size on every hop. I don't understand how it can be different than 292 bytes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error may contain a channel_update which itself is a variable length construct.

Can you point me to the place it states that it's 260 bytes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here: https://github.com/lightningnetwork/lightning-rfc/blob/master/04-onion-routing.md#returning-errors

It says the packet consists of hmac(32) + failure_len(2) + failure + pad_len(2) + pad such that failure + pad = 256. The channel_update and other things I imagine are contained in the failure element above.

Changelog-Added: `htlc_accepted` hook can now return custom `failure_onion`.
@niftynei
Copy link
Collaborator

fixed build error, addressed two outstanding review comments.

@cdecker
Copy link
Member

cdecker commented Nov 10, 2020

ACK 51c3441

Log an error for incorrect use of API

Suggested-By: @cdecker
failmsgtok = json_get_member(buffer, toks, "failure_message");

if (failoniontok) {
failonion = json_tok_bin_from_hex(tmpctx, buffer,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fiatjaf the memory leak was caused by this line

			failonion = json_tok_bin_from_hex(NULL, buffer,

By allocating it to NULL this means you're planning to manually clean it up; instead we use the magic allocation context tmpctx, which gets cleaned up, via magic.

"'failure_message' provided."
" Ignoring 'failure_message'.");

fail_in_htlc(hin, take(new_onionreply(NULL,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when you take an object, then is a good time to allocate on NULL, as the take will reparent it to something else.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I get it now, kinda.
Thank you.

@niftynei
Copy link
Collaborator

ACK 6aa61e3

@niftynei niftynei merged commit cefb64c into ElementsProject:master Nov 11, 2020
@fiatjaf fiatjaf deleted the onionreply branch November 11, 2020 01:16
@m-schmoock
Copy link
Collaborator

Really awesome :D 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants