multiread #13

dominictarr · 2013-02-24T10:36:32Z

An operation that is like a batch, but for gets, but without the keys being adjacent in order.

I mention this because at campjs @rvagg and I discussed where the boundry between userland and core should be, regarding issues like: #9 (comment)

We couldn't think of many features that you might want to do in the C binding, but we decided that we should evaluate them on a case-by-case basis for a bit, until it becomes clearer.

So this issue is mainly intended for discussion, although it may be useful.

hmm, so for the purposes of discussion, the api is

db.getAll(arrayOfKeys, cb)

Would an operation like this also benefit from crossing the c-js boundry only once,
passing an array each way?

The text was updated successfully, but these errors were encountered:

rvagg · 2013-02-24T10:47:59Z

Should benefit very much, I think @juliangruber had something similar in Leveled? We can do multi-threaded get()s at the moment so it's not all bad, but we have a limit to the number of threads.

It's interesting to think about this in relation to a range delete cause they could share a similar API.

db.getAll(array, cb)
db.delAll(array, cb)
db.delRange(options, cb) // where options is the same as for db.readStream()
db.getRange(options, cb) // where options is the same as for db.readStream()

Of course the getRange() on is a bit trickier cause of potential size problems, perhaps we should put an upper limit on it or at least enforce the use of a 'limit' property so at least they know what they are doing (they could put limit:99999 if they wanted but then at least it'd be explicit).

juliangruber · 2013-02-24T11:11:28Z

I had db#range(from, to, cb) and db#find(glob, cb), no multiget though. For those methods the performance gain was huge because iterators require so many js-c boundry crossings.

With a multiget the gain would increase with the numbers of keys given. I'm not sure yet about when this would pay off. Someone would have to do an implementation so we can see.

Oh, and I'm for overloading the db#get function for this.

dominictarr · 2013-02-24T22:51:25Z

getRange could just alias to iterator, with the limit set very large.
I'm opposed to adding safety features to leveldown, but am happy to have them in levelup.

see section about the "The Hole Hawg" in this article http://artlung.com/smorgasborg/C_R_Y_P_T_O_N_O_M_I_C_O_N.shtml

rvagg · 2014-05-12T02:57:51Z

ref #9

Marking as "help wanted" if someone wants to come up with an interesting solution for this that we may want to consider merging in. Again, unlikely to be something exposed at the LevelUP layer but might be nice to have down here.

matonga · 2019-01-02T15:59:14Z

Hi there,

I've implemented this functionality, it looks like this:

db = leveldown (...), db.open, etc...

db.map (keys, [options, ], function (error, values) {
    // whatever you want to do here
});

options are the same ones used by .get (... options, ...).

keys is an array of strings / buffers.
values is an array of strings / buffers / undefineds, returned in same order of supplied keys. Keys not found are left undefined in resulting values array, keys found are returned as buffers or strings (depending on supplied options or defaults).

In my use case I'm actually checking for pre-existing non-consecutive keys before insertion. I did some benchmarks with 1 million keys and found some interesting results:

If the keys are all pre-existing, .map returns values three times faster than using .get
If the keys don't exist, .map returns values about ten times faster. This is usually the case for me so I'm going to fork leveldown, but anyway... I have a question to see if I can submit a pull request:

I followed other methods implementation, so had to edit both abstract-leveldown and leveldown. Added these methods:

abstract-leveldown -> AbstractLevelDOWN.prototype.map (parameter validation, options defaults, call to this._map)
leveldown -> LevelDOWN.prototype._map (call to binding.map_do)
binding.cc -> MapWorker, NAPI_METHOD(map_do)

My question is if I should pull-request the two separate github projects (abstract-leveldown & leveldown) or should I modify the implementation to keep all changes inside levedown?

abstract-leveldown could polyfill ._map implementation when not found, by issuing ._get by hand for every supplied key instead (however this ~~would be~~ is a lot slower) (benchmarked it too).

Don't hesitate to ask me any question, english is not my native language and I may have screwed the explanation a bit 😄

vweevers · 2019-01-02T18:09:48Z

@matonga nice work!

With iterator.seek(), we've made it possible for "multiread" aka "batch get" to be implemented in userland too. Or indeed as an abstract-leveldown fallback, though at the moment leveldown is the only implementation that supports seek(). See Level/levelup#491 (comment):

I think, rather than a batch get, I prefer to officially expose iterators with a seek() method. This not only enables fetching multiple non-contiguous keys but also getting multiple ranges of keys. You seek, read some, seek again. Also you get backpressure and can do things like skip scans (for compound aka composite indexes).

Would be interesting to compare that against your native solution. I'm guessing yours will be faster because in theory it only has to cross the C++/JS boundary once. Is your code on GitHub?

vweevers · 2019-01-02T18:25:44Z

My question is if I should pull-request the two separate github projects (abstract-leveldown & leveldown) or should I modify the implementation to keep all changes inside levedown?

It's fine for a new feature to live in leveldown for a while, for people to play with. The same happened with iterator.seek(). The barrier for new abstract-leveldown features is fairly high: there must be a good use case (that can't be solved with existing features), a solid API and at least a few other implementations that can support it (in theory).

matonga · 2019-01-02T20:18:45Z

I'm guessing yours will be faster because in theory it only has to cross the C++/JS boundary once. Is your code on GitHub?

@vweevers that's right, I find it works faster, maybe because it crosses the C++/JS boundary once or maybe because there is less overhead (one Javascript call to map() vs. many calls to get() / seek()), or maybe it's a combination of both. I've just uploaded the code to GitHub:

matonga@b86b0a1

Any suggestions welcome. Guess I should write some test/map-test.js for a proper pull request?

The barrier for new abstract-leveldown features is fairly high: there must be a good use case (that can't be solved with existing features), a solid API and at least a few other implementations that can support it (in theory).

Just in case, I've moved all code to leveldown.

vweevers · 2019-01-02T21:21:00Z

matonga@b86b0a1

Any suggestions welcome. Guess I should write some test/map-test.js for a proper pull request?

Looks like a viable solution. We can discuss details in the PR :) It's up to you whether you want to write a test before opening a PR. It may help us to understand the intended behavior. It would be great if you can share your benchmarks as well (code and results). Thanks for taking this on!

See Level/leveldown#13 for a description See https://github.com/matonga/leveldown for a LevelDB implementation of multiread

vweevers · 2021-09-30T21:15:08Z

Level/community#101

matonga · 2021-10-01T20:47:19Z

@vweevers thank you! good to see a multiread implementation being merged into leveldown and related repos at last.

rvagg added the help wanted label May 12, 2014

rvagg mentioned this issue May 14, 2014

Get a range in one go #98

Closed

vweevers mentioned this issue Sep 21, 2017

Batch get option Level/levelup#491

Closed

vweevers added the enhancement New feature or request label Dec 31, 2018

matonga pushed a commit to matonga/leveldown that referenced this issue Jan 2, 2019

Proposal for multiread implementation (Level#13)

b86b0a1

matonga pushed a commit to matonga/leveldown-hyper that referenced this issue Jun 19, 2019

Proposal for multiread implementation into HyperLevelDB

97b6a25

See Level/leveldown#13 for a description See https://github.com/matonga/leveldown for a LevelDB implementation of multiread

MeirionHughes mentioned this issue May 6, 2021

Proposal: add map method to abstract-down Level/community#100

Closed

vweevers closed this as completed Sep 30, 2021

gogoout mentioned this issue Dec 15, 2021

Feature Request: multiGet/multiRead kriszyp/lmdb-js#113

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

multiread #13

multiread #13

dominictarr commented Feb 24, 2013

rvagg commented Feb 24, 2013

juliangruber commented Feb 24, 2013

dominictarr commented Feb 24, 2013

rvagg commented May 12, 2014

matonga commented Jan 2, 2019

vweevers commented Jan 2, 2019

vweevers commented Jan 2, 2019

matonga commented Jan 2, 2019 •

edited

Loading

vweevers commented Jan 2, 2019 •

edited

Loading

vweevers commented Sep 30, 2021

matonga commented Oct 1, 2021

multiread #13

multiread #13

Comments

dominictarr commented Feb 24, 2013

rvagg commented Feb 24, 2013

juliangruber commented Feb 24, 2013

dominictarr commented Feb 24, 2013

rvagg commented May 12, 2014

matonga commented Jan 2, 2019

vweevers commented Jan 2, 2019

vweevers commented Jan 2, 2019

matonga commented Jan 2, 2019 • edited Loading

vweevers commented Jan 2, 2019 • edited Loading

vweevers commented Sep 30, 2021

matonga commented Oct 1, 2021

matonga commented Jan 2, 2019 •

edited

Loading

vweevers commented Jan 2, 2019 •

edited

Loading