Use p2p to fetch blocks and headers #1516

andrewtoth · 2023-02-06T04:52:42Z

Uses the p2p interface to fetch block hashes and headers much more efficiently. Fetches headers 2k at a time, and fetches blocks in chunks of 10.

Syncs the first 200k or so blocks much faster, but haven't benchmarked the whole chain. I don't suspect it will be much more gain for the whole chain since most of the time will be spent indexing to the db, and blocks are already being fetched in the background. However, this is a much more efficient way of fetching this data.

Still have to fix tests because they don't support connecting via p2p yet.

Closes https://github.com/casey/ord/issues/1502.

andrewtoth · 2023-02-06T14:53:12Z

Might fix issues related to https://github.com/casey/ord/issues/1377, since those look like the RPC threads are falling over from too many requests.

I haven't fully groked the indexing logic for fetching raw transactions, but if we are doing a lot of getrawtransaction RPC calls it might be worth using REST interface for that. Benchmarking a getrawtransaction RPC vs a REST binary request for a large tx shows REST request is about 2x faster. But, using REST interface does require setting rest=1 in the config.

casey · 2023-02-06T22:54:22Z

This is awesome! Keep in mind that https://github.com/casey/ord/issues/1364 will wind up making sync faster.

Some thoughts!

I don't want to make things much more complicated, so keeping an initial PR as simple and focused as possible would be great.
We need to make sure that they must use a trusted BTC node, since ord doesn't enforce consensus.
All our tests run against our own mocked bitcoind, which is in test-bitcoincore-rpc. This is required to keep tests fast. Spinning up a bitcoind instances is slow, and tests are in our edit-test-debug cycle, so they must stay fast. As long as they stay fast and comprehensive, the method we use for testing doesn't matter, but one way to keep them fast would be to add P2P protocol support to test-bitcoincore-rpc. It only needs the minimum for the tests to work, so doesn't have to have comprehensive support for the full protocol.
I didn't even know that bitcoind had a rest interface. Is it documented anywhere? How is it different from the JSON RPC interface? I love rest and prefer it over JSON RPC now that I've had experience with both.

andrewtoth · 2023-02-06T23:20:19Z

Thanks @casey. This will make syncing faster even when skipping txs, since it fetches headers 2k at a time instead of incrementally.

Sorry for the REST tangent, that can be a follow-up.
Right. I suppose we can just use the same address as rpc and just specify the p2p port as an option.
Ok, will give that a shot.
REST is documented here. I have an implementation that we can pull in as a cargo crate here. But, that uses async/await so not sure if it will be useful in this project. REST is unauthenticated and doesn't have to go through the json parsing, so is much faster. It can also reply with raw binary data which is faster to serialize and smaller to send over the wire. But it also only shares public data, so we can't use it for things like the wallet. If rpc,rest,zmq: faster getblock, NotifyBlock and rest_block by reading raw block bitcoin/bitcoin#26415 was merged then I think that would be an even better candidate to fetch blocks than p2p, but the requirement to deserialize the block before serializing back into binary makes it slower than using the p2p endpoint right now. We can query it for utxo set data as well, but not sure what your ideas are for Re-enable skipping transactions below first inscription height #1364.

andrewtoth · 2023-02-07T21:32:26Z

src/index.rs

+        .first()
+        .unwrap()
+        .to_string(),
+    };


Not sure what the best way to parse the url is. By default it is 127.0.0.1:8332/wallet/ord, which doesn't parse as a url, and in tests it is http://127.0.0.1:... which won't parse with a split(':'). Maybe regex?
Maybe pass --rpc-port as an option instead of --rpc-url? But that would be a breaking change.

andrewtoth · 2023-02-07T21:36:07Z

@casey I think this is ready for a closer look.

casey · 2023-02-08T23:29:37Z

@andrewtoth Awesome, thank you! Putting it on the project backlog so it doesn't fall off our radar.

jpistone · 2023-02-09T07:51:18Z

WOW thank you! I was ticking <1 block per second. Found this and pulled your changes, now at ~50-100 per second. This should definitely be a top priority as most questions on the discord are about this right now (3 days - 2 weeks to index is rough).

Quick note: commit seems to happen every 5k blocks & takes about 3 minutes on my machine (I believe that's what was causing the json-rpc errors before this patch). This may confuse users when they see the block count hang for some time - I killed & restarted indexing myself a few times thinking that it had frozen.

dude513 · 2023-02-12T07:59:44Z

After 16 hours the index is at block 420001 and moving very slowly, no errors/crashes yet.

I am on p2p-indexing branch git cloned from andrewtoth/ord. @andrewtoth

andrewtoth · 2023-02-12T15:35:08Z

This has got a lot of successful testing from users in #1648.

dude513 · 2023-02-12T17:02:49Z

This has got a lot of successful testing from users in #1648.

With rest=1 and on a speedup-improvements branch I got 99% of all blocks in a few minutes.
I guess last 9000 blocks will be the most satisfying. :)

andrewtoth · 2023-02-14T18:16:24Z

One user had set maxuploadtarget to something too low in their bitcoin.conf, and so this disallowed the p2p interface from serving everything it needs to. We should advise users to add whitebind=127.0.0.1:8333 to their config files.

veryordinally · 2023-02-15T12:21:32Z

@casey We should definitely consider this, did an initial review and test.

veryordinally

Reviewed conceptually and high-level flow and logic - looks good to me and ready for more detailed review.

casey · 2023-02-15T19:00:23Z

I'm sorry for taking so long to collect my thoughts on this! Bitcoin's RPC and P2P interfaces are very different. The P2P interfaces is an untrusted interface used between full nodes on the bitcoin network. The RPC interfaces is an authenticated interface between a client and a trusted node. Using the P2P interface for applications where the node must be trusted is dicey. You can use it to directly connect to a trusted node, but you can also use it to connect to a totally untrusted node. I think that expressing this subtle difference to our users would be hard, and that lots of people, because running a full node is challenging, would try to connect to some random node, and have their security degraded as a result.

Also #1364 fixes the same issue as this would, namely, speeding up initial sync, and is a very clean and low-impact chain that doesn't introduce another connection to manage. So, I would like to see that implemented first, before deciding if also using P2P or REST is a good idea.

Also, keeping ord simple to configure is important (The degens are dying out there 😆) and every new outgoing connection introduces code, configuration, and maintainence complexity.

Sorry for not figuring this out sooner, it's only been in the last few days that I've been able to sit down and review PRs in detail, and previously I didn't give it enough attention.

Let's make sure we nail down the following in an issue before anyone writes a big PR:

What is the problem we're trying to solve or the benefit we're trying to eliminate?
What is the best design?
What is the testing strategy?

Sorry to close! We're still figuring out this whole massive open source project thing ourselves!

andrewtoth · 2023-02-15T20:02:43Z

@casey disappointing conclusion. I disagree with your reasoning.

This requires no other configuration options. It works out of the box. It's been tested by many users already without problems. The only configuration issue was someone who had maxuploadtarget already set, which is obviously an advanced configuration option.
This patch would require using the same node as the RPC connection. It cannot be used to connect to a different node because only the port is configurable, not the host.
It is much more efficient for all the reasons stated in the initial issue #1502 . Blocks are fetched directly from disk without serialization steps, and can be requested in batches. It puts less stress on bitcoind and is faster.
Without this change, #1364 would be making RPC requests on multiple threads. This can overload the RPC workqueue and cause disconnections.
You can see the issues with fetching blocks via RPC in the updater code. There is a lot of code that has retries, and users are complaining that it aborts after retrying forever.

kalibox · 2023-02-18T19:49:37Z

Does anyone know how much space ord index files take and where they're located?

andrewtoth force-pushed the p2p-indexing branch from 6f014e8 to 9ee4c87 Compare February 6, 2023 05:01

andrewtoth force-pushed the p2p-indexing branch from 9ee4c87 to e690f4c Compare February 7, 2023 21:23

andrewtoth marked this pull request as ready for review February 7, 2023 21:23

andrewtoth force-pushed the p2p-indexing branch from e690f4c to 90de650 Compare February 7, 2023 21:31

andrewtoth commented Feb 7, 2023

View reviewed changes

andrewtoth force-pushed the p2p-indexing branch from 90de650 to 963fc75 Compare February 7, 2023 21:35

andrewtoth force-pushed the p2p-indexing branch 4 times, most recently from 19527e3 to 0dff7ed Compare February 7, 2023 22:34

andrewtoth mentioned this pull request Feb 7, 2023

Possible timeout to bitcoin core on inscribe #1566

Open

This was referenced Feb 8, 2023

RPC error: transport error: HTTP response too short: length 0, needed 12 #1377

Closed

Ord index terminating every few thousand blocks #813

Closed

andrewtoth force-pushed the p2p-indexing branch from 0dff7ed to 0220355 Compare February 9, 2023 16:00

andrewtoth added 3 commits February 9, 2023 11:06

Use p2p to fetch blocks and headers

5f869a7

Add mock p2p server

b93464e

Fix tests for p2p

9a831a8

andrewtoth force-pushed the p2p-indexing branch from 0220355 to 9a831a8 Compare February 9, 2023 16:07

This was referenced Feb 10, 2023

Use REST interface for calling get_raw_transaction #1636

Closed

ord index is very slow #1648

Closed

dude513 mentioned this pull request Feb 12, 2023

index.redb download link #1701

Open

raphjaph mentioned this pull request Feb 13, 2023

"ord --index-sats index" is incredibly slow #1722

Closed

veryordinally reviewed Feb 15, 2023

View reviewed changes

casey closed this Feb 15, 2023

andrewtoth deleted the p2p-indexing branch February 15, 2023 19:55

andrewtoth restored the p2p-indexing branch February 16, 2023 01:39

This was referenced Oct 30, 2023

Performance Tracking Issue #2613

Open

P2P indexing #2614

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use p2p to fetch blocks and headers #1516

Use p2p to fetch blocks and headers #1516

andrewtoth commented Feb 6, 2023 •

edited

Loading

andrewtoth commented Feb 6, 2023 •

edited

Loading

casey commented Feb 6, 2023

andrewtoth commented Feb 6, 2023 •

edited

Loading

andrewtoth Feb 7, 2023 •

edited

Loading

andrewtoth commented Feb 7, 2023

casey commented Feb 8, 2023

jpistone commented Feb 9, 2023

dude513 commented Feb 12, 2023

andrewtoth commented Feb 12, 2023

dude513 commented Feb 12, 2023

andrewtoth commented Feb 14, 2023

veryordinally commented Feb 15, 2023

veryordinally left a comment

casey commented Feb 15, 2023

andrewtoth commented Feb 15, 2023

kalibox commented Feb 18, 2023

Use p2p to fetch blocks and headers #1516

Use p2p to fetch blocks and headers #1516

Conversation

andrewtoth commented Feb 6, 2023 • edited Loading

andrewtoth commented Feb 6, 2023 • edited Loading

casey commented Feb 6, 2023

andrewtoth commented Feb 6, 2023 • edited Loading

andrewtoth Feb 7, 2023 • edited Loading

Choose a reason for hiding this comment

andrewtoth commented Feb 7, 2023

casey commented Feb 8, 2023

jpistone commented Feb 9, 2023

dude513 commented Feb 12, 2023

andrewtoth commented Feb 12, 2023

dude513 commented Feb 12, 2023

andrewtoth commented Feb 14, 2023

veryordinally commented Feb 15, 2023

veryordinally left a comment

Choose a reason for hiding this comment

casey commented Feb 15, 2023

andrewtoth commented Feb 15, 2023

kalibox commented Feb 18, 2023

andrewtoth commented Feb 6, 2023 •

edited

Loading

andrewtoth commented Feb 6, 2023 •

edited

Loading

andrewtoth commented Feb 6, 2023 •

edited

Loading

andrewtoth Feb 7, 2023 •

edited

Loading