Failed peerswap from 63 days ago blocking two peers doing any further swapping #290

Closed
TrezorHannes opened this issue Mar 28, 2024 · 14 comments · Fixed by #294

@TrezorHannes

Hello everyone,

Problem statement: a ps-swapout attempt from my side failed and is in a canceled state on my end. Now neither of us can do another swap with the other.

My side has "state": "State_SwapCanceled",

pscli getswap --id f02f50b2317a9a8dfa8a6a8a7ccab7ec25559f2eb3401d10626d4fdc3c1442f5
{
  "swap": {
    "id": "f02f50b2317a9a8dfa8a6a8a7ccab7ec25559f2eb3401d10626d4fdc3c1442f5",
    "created_at": "1705586782",
    "asset": "lbtc",
    "type": "swap-out",
    "role": "sender",
    "state": "State_SwapCanceled",
    "initiator_node_id": "037f66e84e38fc2787d578599dfe1fcb7b71f9de4fb1e453c5ab85c05f5ce8c2e3",
    "peer_node_id": "023e24602891c28a7872ea1ad5c1bb41abe4206ae1599bb981e3278a121e7895d6",
    "amount": "2000000",
    "channel_id": "812324:2395:1",
    "opening_tx_id": "",
    "claim_tx_id": "",
    "cancel_message": "the prepayment probe was unsuccessful: TemporaryChannelFailure(update=(*lnwire.ChannelUpdate)(0xc005d9f600)({\n Signature: [...removed...],\n ShortChannelID: (lnwire.ShortChannelID) 812324:2395:1,\n Timestamp: (uint32) 1705521406,\n MessageFlags: (lnwire.ChanUpdateMsgFlags) 00000001,\n ChannelFlags: (lnwire.ChanUpdateChanFlags) 00000001,\n TimeLockDelta: (uint16) 142,\n HtlcMinimumMsat: (lnwire.MilliSatoshi) 10000 mSAT,\n BaseFee: (uint32) 1000,\n FeeRate: (uint32) 0,\n HtlcMaximumMsat: (lnwire.MilliSatoshi) 9603000000 mSAT,\n ExtraOpaqueData: (lnwire.ExtraOpaqueData) {\n }\n})\n)",
    "lnd_chan_id": "893159683678470145"
  }
}

On the peer side from @jvxis, peerswapd still awaits my payment

pscli listactiveswaps
{
  "swaps": [
    {
      "id": "f02f50b2317a9a8dfa8a6a8a7ccab7ec25559f2eb3401d10626d4fdc3c1442f5",
      "created_at": "1705586789",
      "asset": "lbtc",
      "type": "swap-out",
      "role": "receiver",
      "state": "State_SwapOutReceiver_AwaitFeeInvoicePayment",
      "initiator_node_id": "037f66e84e38fc2787d578599dfe1fcb7b71f9de4fb1e453c5ab85c05f5ce8c2e3",
      "peer_node_id": "037f66e84e38fc2787d578599dfe1fcb7b71f9de4fb1e453c5ab85c05f5ce8c2e3",
      "amount": "2000000",
      "channel_id": "812324:2395:1",
      "opening_tx_id": "",
      "claim_tx_id": "",
      "cancel_message": "",
      "lnd_chan_id": "893159683678470145"
    }
  ]
}

As a result, every swap-in from their side and every swap-out from my side is blocked. On my side:

pscli swapout --sat_amt 3100566 --channel_id 893159683678470145 --asset lbtc
2024/03/28 18:48:12 rpc error: code = Unknown desc = already has an active swap on channel 812324:2395:1: f02f50b2317a9a8dfa8a6a8a7ccab7ec25559f2eb3401d10626d4fdc3c1442f5
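One plausible reading of this error is a per-channel guard: a new swap is refused while any stored swap on the same channel has not reached a finished state. The Go sketch below illustrates that idea only. The `Swap` type, `isFinished`, and `checkChannelFree` are hypothetical names and not PeerSwap's actual code; the state strings are the ones shown in the output above, and treating `State_SwapCanceled` as the only finished state is a simplification.

```go
package main

import (
	"errors"
	"fmt"
)

// Swap is a simplified, hypothetical stand-in for a stored swap record.
type Swap struct {
	ID        string
	ChannelID string
	State     string
}

// ErrActiveSwap mirrors the wording of the error seen in this issue.
var ErrActiveSwap = errors.New("already has an active swap on channel")

// isFinished reports whether a swap is in a terminal state. Only the
// canceled state is modeled here (assumption made for this sketch).
func isFinished(s Swap) bool {
	return s.State == "State_SwapCanceled"
}

// checkChannelFree returns an error if any unfinished swap still
// references the given channel.
func checkChannelFree(swaps []Swap, channelID string) error {
	for _, s := range swaps {
		if s.ChannelID == channelID && !isFinished(s) {
			return fmt.Errorf("%w %s: %s", ErrActiveSwap, channelID, s.ID)
		}
	}
	return nil
}

func main() {
	// The receiver-side record from this issue: it was never canceled.
	stale := Swap{
		ID:        "f02f50b2317a9a8dfa8a6a8a7ccab7ec25559f2eb3401d10626d4fdc3c1442f5",
		ChannelID: "812324:2395:1",
		State:     "State_SwapOutReceiver_AwaitFeeInvoicePayment",
	}
	fmt.Println(checkChannelFree([]Swap{stale}, stale.ChannelID))

	// Once the record is canceled, the channel is free again.
	stale.State = "State_SwapCanceled"
	fmt.Println(checkChannelFree([]Swap{stale}, stale.ChannelID))
}
```

Under this reading, a single record left in an awaiting state on either node is enough to block the channel until that record is canceled or removed.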

Any advice or ideas on how we can resolve the situation, e.g. by canceling the pending payment?

@jvxis

jvxis commented Mar 28, 2024

We both have upgraded to the latest version, and the issue remains.

@grubles
Collaborator

grubles commented Mar 28, 2024

What PeerSwap commit are you running? The canceled swap seems to be a result of a stuck channel which PeerSwap checks for now, hence the probe which failed. But it seems the swap receiver hasn't canceled the swap for some reason.

@TrezorHannes
Author

Just built from the most recent commit a few hours ago to double-check whether that resolves it:

git branch -v
* master fb7d5f7 [skip ci]docs: transaction labels for on-chain transactions (#284)

I believe @jvxis runs the latest as well.

We do have the usual LN routing activity running. It's probably useful context that the connection between our two nodes isn't 100% stable, since he's in South America and my server is based in Germany, so we have a connection drop every 10-12 hours for a few minutes.

@jvxis

jvxis commented Mar 28, 2024

Same here: * master fb7d5f7 [skip ci]docs: transaction labels for on-chain transactions (#284)

@grubles
Collaborator

grubles commented Mar 28, 2024

What LND versions are you running?

@jvxis

jvxis commented Mar 29, 2024

> What LND versions are you running?

Here is: LND 0.17.4-beta

@TrezorHannes
Author

> What LND versions are you running?

admin@debian-nuc:~$ uname -a
Linux debian-nuc 6.1.0-16-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.67-1 (2023-12-12) x86_64 GNU/Linux
admin@debian-nuc:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 12 (bookworm)
Release:        12
Codename:       bookworm
admin@debian-nuc:~$ lnd --version
lnd version 0.17.3-beta commit=v0.17.3-beta

@YusukeShimizu
Contributor

It appears that the probe payment failed due to TemporaryChannelFailure.
In that case, a cancel message is also sent to the other peer, but it seems to have failed for some reason.

May need to modify peerswap to resolve.
It may be necessary to continue sending messages until the cancel message of the swap peer is successfully sent, or to add a "recover" process.

@TrezorHannes
Author

> May need to modify peerswap to resolve. It may be necessary to continue sending messages until the cancel message of the swap peer is successfully sent, or to add a "recover" process.

Thanks for your response. How would we act on your suggestion to keep sending messages?
And is the recover process implemented already, or is it something that would have to be added to the feature set?

@YusukeShimizu
Contributor

Neither message sending nor recovery can be triggered manually.
I assume that the peerswap codebase will need to be modified.

I think deleting the target swap as a workaround is an option.
The swaps are managed in BoltDB, and I believe they are stored as swaps in the peerswap datadir.
You can use a tool such as boltbrowser to delete the target swap, and then you can run the next swap.

@jvxis

jvxis commented Mar 30, 2024

@YusukeShimizu It has worked! Thanks. Now I can do peerswaps again.

@TrezorHannes
Author

Excellent. Thank you @YusukeShimizu for helping us resolve it.
Let us know if you think you need anything for the message sending approach at a future point. Will close this as resolved.

@wtogami
Contributor

wtogami commented Apr 2, 2024

> It appears that the probe payment failed due to TemporaryChannelFailure.

This was truly a temporary LND channel problem where the channel worked fine later?
Should we perhaps try a few times instead of giving up immediately?

> In that case, a cancel message is also sent to the other peer, but it seems to have failed for some reason.

The cancel message has no ACK? Should we add one and periodically repeat until we hear the ACK?

YusukeShimizu added a commit to YusukeShimizu/peerswap that referenced this issue Apr 2, 2024
`AwaitFeeInvoice` state to hang in case of state mismatch with peer.
ElementsProject#290

To eliminate the persistence of uncompleted swaps,
execute "cancel" if recovery by restart
is executed in `AwaitFeeInvoice` state.
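
The recovery described in this commit message could look roughly like the Go sketch below: on restart, any swap still waiting for the fee invoice payment is canceled instead of being kept alive. This is an illustration only, not the code from the commit. The `Swap` type and `recoverSwaps` are hypothetical, and it assumes that the `AwaitFeeInvoice` state named in the commit message corresponds to `State_SwapOutReceiver_AwaitFeeInvoicePayment` from the output earlier in this thread.

```go
package main

import "fmt"

// Swap is a simplified, hypothetical stand-in for a persisted swap.
type Swap struct {
	ID    string
	State string
}

const (
	// State names as they appear in the pscli output in this issue.
	stateAwaitFeeInvoicePayment = "State_SwapOutReceiver_AwaitFeeInvoicePayment"
	stateSwapCanceled           = "State_SwapCanceled"
)

// recoverSwaps is meant to run once at startup. It cancels every swap
// that is still waiting for the fee invoice payment and returns the
// IDs of the swaps it canceled. All other swaps are left untouched.
func recoverSwaps(swaps []Swap) []string {
	var canceled []string
	for i := range swaps {
		if swaps[i].State == stateAwaitFeeInvoicePayment {
			swaps[i].State = stateSwapCanceled
			canceled = append(canceled, swaps[i].ID)
		}
	}
	return canceled
}

func main() {
	swaps := []Swap{
		{ID: "f02f50b2317a9a8dfa8a6a8a7ccab7ec25559f2eb3401d10626d4fdc3c1442f5",
			State: stateAwaitFeeInvoicePayment},
	}
	fmt.Println("canceled on restart:", recoverSwaps(swaps))
	fmt.Println("state now:", swaps[0].State)
}
```

With a rule like this, a receiver that missed the cancel message would clear the stuck swap on its next restart without any manual database edit.
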
@YusukeShimizu
Contributor

YusukeShimizu commented Apr 3, 2024

> This was truly a temporary LND channel problem where the channel worked fine later?
> Should we perhaps try a few times instead of giving up immediately?

I think TemporaryChannelFailure would be considered a temporary problem that is expected to be recovered.
I think one option is to retry with exponential backoff or a similar strategy.
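
As a rough illustration of the retry idea, here is a minimal Go sketch of exponential backoff around a probe. It is not the code from the draft PR. `retryWithBackoff`, the `probe` callback, and `errTemporaryChannelFailure` are hypothetical names, and the real probe error in LND is a richer type than the plain sentinel error used here.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTemporaryChannelFailure is a stand-in sentinel for the failure
// reported in this issue (assumption made for this sketch).
var errTemporaryChannelFailure = errors.New("TemporaryChannelFailure")

// retryWithBackoff calls probe up to attempts times. After each
// temporary failure it waits, doubling the delay every time. Any other
// error is treated as permanent and returned immediately.
func retryWithBackoff(attempts int, base time.Duration, probe func() error) error {
	if attempts < 1 {
		return errors.New("attempts must be at least 1")
	}
	var err error
	delay := base
	for i := 0; i < attempts; i++ {
		if err = probe(); err == nil {
			return nil
		}
		if !errors.Is(err, errTemporaryChannelFailure) {
			return err
		}
		if i < attempts-1 {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return err
}

func main() {
	calls := 0
	// Simulated probe: fails twice with a temporary error, then succeeds.
	probe := func() error {
		calls++
		if calls < 3 {
			return errTemporaryChannelFailure
		}
		return nil
	}
	err := retryWithBackoff(5, 10*time.Millisecond, probe)
	fmt.Println("error:", err, "probe calls:", calls)
}
```

The swap would only be canceled once the last attempt has failed, so a short-lived channel hiccup no longer ends the swap on the first probe.
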

I created a draft PR: #294

> The cancel message has no ACK? Should we add one and periodically repeat until we hear the ACK?

With SendCancelAction, the implementation does not guarantee sending the custom message because the state is set to State_SwapCanceled after the process is complete, regardless of whether the message is sent or not.
It is possible to repeat the process until success, but I think that adding a recovery process on the receiving side to guarantee result consistency would be the solution.
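
For comparison, the "repeat until success" alternative could be sketched in Go as below. This is a hypothetical illustration, not PeerSwap code: `cancelUntilAcked`, the `CancelDelivered` field, and the `sendAndWaitAck` callback are invented for the example, and the cancel message has no ACK today, so the callback stands in for a mechanism that would have to be added.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Swap is a simplified, hypothetical swap record. CancelDelivered is
// an invented field that records whether the peer confirmed the cancel.
type Swap struct {
	ID              string
	State           string
	CancelDelivered bool
}

var errPeerOffline = errors.New("peer offline")

// cancelUntilAcked marks the swap as canceled locally right away, then
// resends the cancel until the peer acknowledges it or maxTries is
// reached. sendAndWaitAck is assumed to return nil only after an ACK.
func cancelUntilAcked(s *Swap, maxTries int, interval time.Duration,
	sendAndWaitAck func(swapID string) error) error {
	if maxTries < 1 {
		maxTries = 1
	}
	s.State = "State_SwapCanceled"
	var err error
	for i := 0; i < maxTries; i++ {
		if err = sendAndWaitAck(s.ID); err == nil {
			s.CancelDelivered = true
			return nil
		}
		time.Sleep(interval)
	}
	return fmt.Errorf("cancel for swap %s not acknowledged after %d tries: %w",
		s.ID, maxTries, err)
}

func main() {
	s := &Swap{ID: "f02f50b2317a9a8dfa8a6a8a7ccab7ec25559f2eb3401d10626d4fdc3c1442f5"}
	tries := 0
	// Simulated peer: unreachable for the first two attempts.
	send := func(string) error {
		tries++
		if tries < 3 {
			return errPeerOffline
		}
		return nil
	}
	err := cancelUntilAcked(s, 5, 10*time.Millisecond, send)
	fmt.Println("error:", err, "delivered:", s.CancelDelivered, "tries:", tries)
}
```

Even with resending, the peer can stay unreachable past the last try, which is one reason a recovery step on the receiving side is still useful.
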

I created a draft PR: #293

YusukeShimizu added a commit to YusukeShimizu/peerswap that referenced this issue Apr 7, 2024
`AwaitFeeInvoice` state to hang in case of state mismatch with peer.
ElementsProject#290

To eliminate the persistence of uncompleted swaps,
execute "cancel" if recovery by restart
is executed in `AwaitFeeInvoice` state.

5 participants