query interface #9

pgte · 2017-07-05T15:47:22Z

While trying to adapt a datastore to a Leveldown interface, I ran into some impedance mismatch. Mind you, I'm new to the datastore ecosystem, so I may be very wrong.

The first part of it is the fact that a query returns a pull stream. While I love pull-streams, transforming them into a Leveldown iterator interface is not trivial as far as I know. Here, you may argue that the pull-stream interface is superior, but my guess is that very few developers are familiar with it. Also, there are other alternatives that are more standard, ranging from Node streams to ES6 iterators.
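To make the shape of that adaptation concrete, here is a minimal sketch that wraps a pull-stream source in a Leveldown-style iterator. It assumes the usual pull-stream source signature `read(abort, cb)` and the callback form of a Leveldown iterator (`next(cb)` and `end(cb)`). The names `Entry`, `iteratorFromSource` and `values` are hypothetical, and the sketch ignores snapshots, seeking and overlapping `next` calls, which is where the real difficulty lies.

```typescript
// Hypothetical entry shape, for illustration only.
type Entry = { key: string; value: string };

// pull-stream source protocol: read(abort, cb), where cb(end, data)
// receives end === true when exhausted or an Error on failure.
type SourceCallback = (end: true | Error | null, data?: Entry) => void;
type Source = (abort: true | Error | null, cb: SourceCallback) => void;

// Callback-style iterator in the spirit of Leveldown's next/end.
interface LevelStyleIterator {
  next(cb: (err: Error | null, key?: string, value?: string) => void): void;
  end(cb: (err: Error | null) => void): void;
}

function iteratorFromSource(read: Source): LevelStyleIterator {
  return {
    next(cb) {
      read(null, (end, data) => {
        if (end === true) return cb(null); // exhausted: key and value stay undefined
        if (end) return cb(end);           // the source reported an error
        cb(null, data!.key, data!.value);
      });
    },
    end(cb) {
      // Abort the source so it can release its resources.
      read(true, () => cb(null));
    },
  };
}

// Tiny in-memory source standing in for a datastore query result.
function values(items: Entry[]): Source {
  let i = 0;
  return (abort, cb) => {
    if (abort) return cb(abort);
    if (i >= items.length) return cb(true);
    cb(null, items[i++]);
  };
}
```

The adapter only forwards one read per `next` call, so it stays lazy; anything beyond that (for example key ranges) still depends on what the underlying query can express.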

The second part (and to me, the one representing more impedance) is the query options. The query options, with the exception of prefix, imply providing a function, which is not easily (or not at all) translatable to a database query. This, I guess, forces implementations to do a full scan and filter data in memory, which may be terrible performance-wise.

One option I like would be to provide a declarative query interface similar to Leveldown's, which would then let us translate queries into back-end options in 99% of the cases.
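As a rough illustration of what "declarative" could mean here, the sketch below describes a query as plain bounds on keys. The field names mirror Leveldown's iterator options (`gt`, `gte`, `lt`, `lte`, `reverse`, `limit`); the `RangeQuery` type and the `inRange` and `evaluate` helpers are hypothetical and not part of interface-datastore. The in-memory evaluation only pins down the semantics; a real back end would hand the same object to its own range scan.

```typescript
// Hypothetical declarative query shape; field names follow Leveldown's
// iterator options.
interface RangeQuery {
  gt?: string;
  gte?: string;
  lt?: string;
  lte?: string;
  reverse?: boolean;
  limit?: number;
}

// True when `key` falls inside the bounds described by `q`.
function inRange(key: string, q: RangeQuery): boolean {
  if (q.gt !== undefined && !(key > q.gt)) return false;
  if (q.gte !== undefined && !(key >= q.gte)) return false;
  if (q.lt !== undefined && !(key < q.lt)) return false;
  if (q.lte !== undefined && !(key <= q.lte)) return false;
  return true;
}

// Reference evaluation over an in-memory key list (string comparison order).
function evaluate(keys: string[], q: RangeQuery): string[] {
  const matched = keys.slice().sort().filter((k) => inRange(k, q));
  if (q.reverse) matched.reverse();
  return q.limit === undefined ? matched : matched.slice(0, q.limit);
}
```

Because the query is only data, a LevelDB-backed store could pass it straight through as iterator options, while a SQL-backed store could turn the same bounds into a `WHERE` clause.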

dignifiedquire · 2017-07-05T16:15:02Z

@pgte I am confused. I already did the work of writing a generic level interface for datastore that does all of this here: http://github.com/ipfs/js-datastore-level. It accepts any leveldown-compatible implementation.

dignifiedquire · 2017-07-05T16:15:57Z

The conversion from iterator to pull-stream is done here: https://github.com/ipfs/js-datastore-level/blob/master/src/index.js#L90. It's a bit tricky but works quite well as far as I understand.

dignifiedquire · 2017-07-05T16:16:42Z

In terms of the options that we support, this is a 1:1 port of the interfaces Go provides, so if we want to change anything there we should consider those settings first.

dignifiedquire · 2017-07-05T16:18:28Z

> While trying to adapt a datastore into a Leveldown interface,

Oh, I am sorry, I misunderstood. You are trying to go the other way around; I haven't looked into that yet.

dignifiedquire · 2017-07-05T16:22:44Z

The main reason I ended up not using the leveldown interface is twofold:

1. It is missing some options that Go implements that I wanted to support and that we are using in the dht, especially prefix.
2. We already have one lazy iterative interface in the code base, which is pull-streams, and the datastores should fit in there as well as possible. Using pull-streams for this seemed the natural way to go, as I would otherwise have to adapt the iterator to a pull stream anyway in modules like the dht.

dignifiedquire · 2017-07-05T16:23:23Z

some background for datastore:

pgte · 2017-07-05T16:23:26Z

@dignifiedquire that's a great example. Here you mostly have to create a full iterator that iterates over the entire DB snapshot, while filtering it in memory:
https://github.com/ipfs/js-datastore-level/blob/master/src/index.js#L96-L100
It's not efficient, wouldn't you say?

dignifiedquire · 2017-07-05T16:25:57Z

It's not great, but leveldown doesn't expose filtering in the database in the way that I need anyway, so I'm not seeing how this could be improved.

dignifiedquire · 2017-07-05T16:28:40Z

Namely, it does not allow for doing any sort of key-based filtering directly, without pulling all entries out.

pgte · 2017-07-05T16:32:07Z

@dignifiedquire yeah, it allows for key partitioning and range queries. I understand that's very limited, but it caters to most use cases I've seen for a kv-store. You just have to decide wisely about the key partitioning / subleveling and perhaps implement materialised views.
I thought the datastore interface was meant for those cases.
What use cases is interface-datastore trying to solve?
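For instance, a prefix filter can be expressed as a key range, which is what makes range queries useful for key partitioning. The sketch below is only an illustration: `KeyRange`, `prefixToRange` and `keyInRange` are hypothetical names, and it assumes non-empty ASCII keys compared as plain strings, such as datastore-style paths.

```typescript
// Hypothetical half-open range over keys: gte <= key < lt.
interface KeyRange {
  gte?: string;
  lt?: string;
}

// Turn a prefix into the smallest range that contains every key starting
// with it, by bumping the last character of the prefix by one.
// Assumes ASCII keys; an empty prefix matches everything.
function prefixToRange(prefix: string): KeyRange {
  if (prefix.length === 0) return {};
  const last = prefix.charCodeAt(prefix.length - 1);
  const upper = prefix.slice(0, -1) + String.fromCharCode(last + 1);
  return { gte: prefix, lt: upper };
}

// Check a key against a range using plain string comparison.
function keyInRange(key: string, r: KeyRange): boolean {
  if (r.gte !== undefined && key < r.gte) return false;
  if (r.lt !== undefined && key >= r.lt) return false;
  return true;
}
```

With this mapping, a store that supports ordered range scans could serve a prefix query by seeking to `gte` and stopping at `lt`, without visiting keys outside the partition.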

dignifiedquire · 2017-07-05T20:38:07Z

Abstract storage layers including but not limited to file systems, key-value stores and SQL databases, with a way to combine all of those into path-like namespaces. Similar to the goals described here

In addition, one important goal is to support all operations that ipfs needs to achieve feature parity with go-ipfs, and to be able to read and write repos the same way go-ipfs does.

pgte · 2017-07-06T09:42:44Z

My opinion is that the query interface is perhaps too generic to enable any efficient implementation.
I propose that we enable some form of query option that allows range queries over keys.

Without this, for instance, I'm not able to translate a LevelDB query into a datastore query in a way that is efficient at runtime.

Gozala · 2020-05-21T18:15:29Z

> The second part (and to me, the one representing more impedance) is the query options. The query options, with the exception of prefix, imply providing a function, which is not easily (or not at all) translatable to a database query. This, I guess, forces implementations to do a full scan and filter data in memory, which may be terrible performance-wise.

This is also something I'm running into in an attempt to move js-ipfs into a shared worker (ipfs/js-ipfs#3022). The problem is that you cannot pass functions across threads, so you'd basically have to send all the data from the worker to the main thread and then filter it there. I think it would be better to represent the query as data and leave more complicated filtering as an exercise for the user. That way:

- Queries could be optimized for the cases that @pgte mentioned and for multithreaded use cases.
- This would work better with ipfs-http-client, so that the host can filter data without passing it on to the client.
- It generally fits better with systems that cross language boundaries.
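A small sketch of what "query as data" might look like: the query is a plain object, so it survives serialization, and the side that holds the data interprets it. The `Filter` and `DataQuery` shapes and the `matchesFilter` and `runQuery` helpers are assumptions for illustration, not an existing interface-datastore or js-ipfs API.

```typescript
// Hypothetical serializable filter: an operator name plus a string operand.
type FilterOp = "prefix" | "gte" | "lt";
interface Filter {
  op: FilterOp;
  value: string;
}

// Hypothetical query made of data only, so it can be sent between threads
// or over HTTP, unlike a filter function.
interface DataQuery {
  filters: Filter[];
  limit?: number;
}

// Interpret one filter against a key.
function matchesFilter(key: string, f: Filter): boolean {
  if (f.op === "prefix") return key.slice(0, f.value.length) === f.value;
  if (f.op === "gte") return key >= f.value;
  return key < f.value; // remaining case: "lt"
}

// Runs on the side that owns the data (worker or HTTP host), so only
// matching keys need to travel back to the caller.
function runQuery(keys: string[], q: DataQuery): string[] {
  const out = keys.filter((k) => q.filters.every((f) => matchesFilter(k, f)));
  return q.limit === undefined ? out : out.slice(0, q.limit);
}
```

The caller would serialize the query (for example with `JSON.stringify`), and the host would parse it and call `runQuery`; a store with native range support could inspect the same filters and skip the in-memory pass.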

daviddias added the status/ready Ready to be worked label Aug 25, 2018

Gozala mentioned this issue May 26, 2020

Generalizing data store interface #39

Closed
