Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RPC sync failure #859

Open
rajarshimaitra opened this issue Feb 10, 2023 · 3 comments
Open

RPC sync failure #859

rajarshimaitra opened this issue Feb 10, 2023 · 3 comments
Labels
bug Something isn't working

Comments

@rajarshimaitra
Copy link
Contributor

Resource temporarily unavailable failure in RPC

[2023-02-10T06:50:19Z ERROR bdk_cli] Rpc(JsonRpc(Transport(SocketError(Os { code: 11, kind: WouldBlock, message: "Resource temporarily unavailable" }))))

It's been known for some time that our rpc sync produces intermittent failures with the above error. This has been reported across many instances. But it doesn't caught in our CI. I have also struggled before to replicate this. Sometimes happens locally while running many test instances parallel. But it doesn't happen while running the sync manually. This made it very hard to debug.

First observed while upgrading electrsd's bitcoind version to 0.23: #613 (comment)

Reported for mac by @notmandatory here : #749

Reported by @evanlinjin here : #650 (comment)

And as @afilini suggested, it's an internal TCP error, either by hitting bitcoin core's parallel request limits, or something even lower at os level.

Recently this has been reported to be happening consistently in signet, and I have been able to reproduce it enough to be able to debug successfully.

To Reproduce

Sync any bdk wallet, from a fresh database, in signet/testnet network using the RPC blcokchain.

Below is a sync output with bdk-cli in debug mode. And the sync ends with the above error.

[2023-02-10T06:50:04Z DEBUG bdk_cli] network: Signet
[2023-02-10T06:50:04Z INFO  bdk::database::sqlite] db up to date, no migration needed
[2023-02-10T06:50:04Z DEBUG bdk_cli::utils] database opened successfully
[2023-02-10T06:50:04Z DEBUG bitcoincore_rpc::client] JSON-RPC request: getnetworkinfo []
[2023-02-10T06:50:04Z INFO  bdk::blockchain::rpc] connected to 'http://127.0.0.1:38332/wallet/coldcard-signet' with auth: UserPass { username: "mempool", password: "mempool" }
[2023-02-10T06:50:04Z DEBUG bitcoincore_rpc::client] JSON-RPC request: listwallets []
[2023-02-10T06:50:04Z INFO  bdk::blockchain::rpc] wallet already loaded: coldcard-signet
[2023-02-10T06:50:04Z DEBUG bitcoincore_rpc::client] JSON-RPC request: getwalletinfo []
[2023-02-10T06:50:04Z DEBUG bitcoincore_rpc::client] JSON-RPC request: getblockchaininfo []
[2023-02-10T06:50:04Z DEBUG bitcoincore_rpc::client] JSON-RPC request: getnetworkinfo []
[2023-02-10T06:50:04Z DEBUG bitcoincore_rpc::client] JSON-RPC request: getindexinfo []
[2023-02-10T06:50:04Z DEBUG bdk::wallet] Begin sync...
[2023-02-10T06:50:04Z DEBUG bdk::wallet] max_address 100
[2023-02-10T06:50:04Z DEBUG bdk::wallet] run_setup: false
[2023-02-10T06:50:04Z INFO  bdk::blockchain::rpc] initial db state: txs=0 utxos=0
[2023-02-10T06:50:04Z DEBUG bitcoincore_rpc::client] JSON-RPC request: importdescri
{"desc":"raw(76a914e081d30ebf6a40e9bbef561320e524bd97b6a16a88ac)#xsvg9g96","timestamp":0},{"desc":"raw(76a914b0c5e022ff90a7e96ab0d838e1dd9e6fc56898cb88ac)#maxefejy","timestamp":0},{"desc":"raw(76a91416740323884d3658eb0aeddacffd1d7280291ec188ac)#jz68cjk4","timestamp":0},
...
{"desc":"raw(76a914dc7456105f9a3252b8b480efdbcad445111b6af788ac)#sjwm52yc","timestamp":0},{"desc":"raw(76a9148a01aba7f7bfda8d3e0636d3136fd1b949e12f6788ac)#qg9uw2sw","timestamp":0}]]
[2023-02-10T06:50:19Z DEBUG bitcoincore_rpc::client] JSON-RPC failed parsing reply of importdescriptors: JsonRpc(Transport(SocketError(Os { code: 11, kind: WouldBlock, message: "Resource temporarily unavailable" })))
[2023-02-10T06:50:19Z ERROR bdk_cli] Rpc(JsonRpc(Transport(SocketError(Os { code: 11, kind: WouldBlock, message: "Resource temporarily unavailable" }))))

Which now clearly says that it's the importdescriptors call that fails.

The corresponding bitcoind log (rpc + http):

2023-02-10T07:25:09Z Received a POST request for /wallet/coldcard-signet from 127.0.0.1:49174
2023-02-10T07:25:09Z ThreadRPCServer method=getnetworkinfo user=mempool
2023-02-10T07:25:09Z Received a POST request for /wallet/coldcard-signet from 127.0.0.1:49180
2023-02-10T07:25:09Z ThreadRPCServer method=listwallets user=mempool
2023-02-10T07:25:09Z Received a POST request for /wallet/coldcard-signet from 127.0.0.1:49188
2023-02-10T07:25:09Z ThreadRPCServer method=getwalletinfo user=mempool
2023-02-10T07:25:09Z Received a POST request for /wallet/coldcard-signet from 127.0.0.1:49202
2023-02-10T07:25:09Z ThreadRPCServer method=getblockchaininfo user=mempool
2023-02-10T07:25:09Z Received a POST request for /wallet/coldcard-signet from 127.0.0.1:49210
2023-02-10T07:25:09Z ThreadRPCServer method=getnetworkinfo user=mempool
2023-02-10T07:25:09Z Received a POST request for /wallet/coldcard-signet from 127.0.0.1:49222
2023-02-10T07:25:09Z ThreadRPCServer method=getindexinfo user=mempool
2023-02-10T07:25:09Z Received a POST request for /wallet/coldcard-signet from 127.0.0.1:49228
2023-02-10T07:25:09Z ThreadRPCServer method=importdescriptors user=mempool
2023-02-10T07:25:09Z [coldcard-signet] Update existing descriptor: raw(76a91487642d5aa2da9b7d1deddf58a7301a295bee1e3a88ac)#0dyxlh20
2023-02-10T07:25:09Z [coldcard-signet] Update existing descriptor: raw(76a91482c3323cd998b3439bea03e724e1469947e3b1a688ac)#rms5exlt
2023-02-10T07:25:09Z [coldcard-signet] Update existing descriptor: raw(76a914db649fc533836305c97beb172fe1100d91ea628b88ac)#3frzx4pa
2023-02-10T07:25:09Z [coldcard-signet] Update existing descriptor: raw(76a914da8431edbb7de0cc3b44071de368d4a977d307f388ac)#5rapgrll
2023-02-10T07:25:09Z [coldcard-signet] Update existing descriptor:
...
2023-02-10T07:25:13Z [coldcard-signet] Update existing descriptor: raw(76a914dc7456105f9a3252b8b480efdbcad445111b6af788ac)#sjwm52yc
2023-02-10T07:25:13Z [coldcard-signet] Update existing descriptor: raw(76a9148a01aba7f7bfda8d3e0636d3136fd1b949e12f6788ac)#qg9uw2sw
2023-02-10T07:25:13Z [coldcard-signet] AddToWallet 4996f1f9fbe4a3857d8d89326c0a8b929a56856f153f21a1a0a0c61ebc02db99  
2023-02-10T06:50:07Z [coldcard-signet] RescanFromTime: Rescanning last 129465 blocks
2023-02-10T06:50:07Z [coldcard-signet] Rescan started from block   
2023-02-10T06:50:46Z [coldcard-signet] Rescan completed in           38932ms

This shows that the descriptors are being imported successfully, and even rescanning of the wallet has started. This rescan takes 39 secs to complete. And during this period only the call fails in bdk.

Observations

This only happens for fresh sync only. Syncing the DB with electrum and switching back to RPC doesn't cause this error. This is probably because the rescan period is very small for a pre-synced wallet, as bdk will only tell core to rescan from last synced height.

Rescanning a synced db, again with rpc in signet now produces a successful sync.

2023-02-10T07:25:13Z [coldcard-signet] Update existing descriptor: raw(76a9148a01aba7f7bfda8d3e0636d3136fd1b949e12f6788ac)#qg9uw2sw
2023-02-10T07:25:13Z [coldcard-signet] RescanFromTime: Rescanning last 11 blocks
2023-02-10T07:25:13Z [coldcard-signet] Rescan started from block 000000e6517d3797dd07b8aada82f5e09f56ab146ff37210b1012519ed493bb3...
2023-02-10T07:25:13Z [coldcard-signet] MarkUnusedAddresses: Detected a used keypool item at index 0, mark all keypool items up to this item as used
2023-02-10T07:25:13Z [coldcard-signet] AddToWallet 4996f1f9fbe4a3857d8d89326c0a8b929a56856f153f21a1a0a0c61ebc02db99  
2023-02-10T07:25:13Z [coldcard-signet] Rescan completed in               6ms
2023-02-10T07:25:13Z Received a POST request for /wallet/coldcard-signet from 127.0.0.1:34036
2023-02-10T07:25:13Z ThreadRPCServer method=getwalletinfo user=mempool
2023-02-10T07:25:13Z Received a POST request for /wallet/coldcard-signet from 127.0.0.1:34052
2023-02-10T07:25:13Z ThreadRPCServer method=listtransactions user=mempool
2023-02-10T07:25:13Z Received a POST request for /wallet/coldcard-signet from 127.0.0.1:34062
2023-02-10T07:25:13Z ThreadRPCServer method=listtransactions user=mempool
2023-02-10T07:25:13Z Received a POST request for /wallet/coldcard-signet from 127.0.0.1:34066
2023-02-10T07:25:13Z ThreadRPCServer method=listunspent user=mempool
2023-02-10T07:25:13Z Received a POST request for /wallet/coldcard-signet from 127.0.0.1:34072
2023-02-10T07:25:13Z ThreadRPCServer method=getblockchaininfo user=mempool
2023-02-10T07:25:13Z Received a POST request for /wallet/coldcard-signet from 127.0.0.1:34078
2023-02-10T07:25:13Z ThreadRPCServer method=getnetworkinfo user=mempool

This time rescan took 6 ms and it didn't throw the error.

Good News,

This was a tricky bug to catch, and even after identifying the call, I still don't exactly know what's causing the failure. But luckily, it's not a hard one to solve because the sync doesn't get affected by the error at all. The log above suggests that core carries on rescanning the wallet after importing all the keys, and just throws this os error in reply of importdescriptors (only iff the rescaning happens from genesis). We just happen to locally freak out by this os error and stop our sync.

Instead, we could just safely ignore this error and continue our sync process. I have tested this locally, and it works (patch incoming soon). No data loss happens as the failing call is a non-data type.

Proposed Fix

We could also be hitting some core's internal parallel HTTP request limits. As the only time this happens is with the sequence of
, Import a large number of descriptors followed by a Rescan from genesis. Maybe this can be potentially solved by fixing #841

So, for now the simplest thing to do is to just ignore this particular error for this particular call, and carry on the sync. And hopefully, with future improvements, this will go away. 😅

@rajarshimaitra rajarshimaitra added the bug Something isn't working label Feb 10, 2023
@tnull
Copy link
Contributor

tnull commented Feb 10, 2023

I don't this is a BDK specific bug. I had multiple users report the same issue when running LDK Node / LDK transaction-sync crate tests, which both rely on bitcoincore_rpc, imported via the bitcoind crate.

@rajarshimaitra
Copy link
Contributor Author

Yeah, it seemed like that to me too. Any specific thing that makes it go away for LDK? For me in BDK just ignoring the error and carrying on with sync worked..

@tnull
Copy link
Contributor

tnull commented Feb 15, 2023

Seems this might be connected to apoelstra/rust-jsonrpc#67, which might be fixed with jsonrpc 0.14.0, which has yet to land in rust-bitcoincore-rpc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Status: No status
Development

Successfully merging a pull request may close this issue.

3 participants