
overreading #5

Open
kumavis opened this issue Nov 16, 2016 · 9 comments

Comments

@kumavis

kumavis commented Nov 16, 2016

what is the expected behavior where reader.read(size, cb) is called where size > the remaining readable data?

@kumavis
Author

kumavis commented Nov 16, 2016

I would expect that it would give me the remainder and then drain/abort

@kumavis
Author

kumavis commented Nov 16, 2016

current behavior:

pull = require('pull-stream')
Reader = require('pull-reader')

pull(
  pull.values([new Buffer('hi')]),
  reader = Reader()
)

reader.read(4, function(err, result){
  console.log(err, result) // true, undefined
})

@dominictarr
Owner

looking at the code what I think it will do is just fall through to ending.
I think that might be the right thing, because if you asked for length it's because you needed that much right? having less than that is a protocol error. If you can take any length, use reader.read(null, cb)

@kumavis
Author

kumavis commented Nov 16, 2016

having less than that is a protocol error.

Yes, the main case was that I wanted to detect that there was not as much data as expected, and report an error upstream with a message saying so. What currently happens is that the stream terminates without any message.

So I think there are two options:

  • read should end with an error
  • read should return the remainder

I think the latter is more generally useful, and can allow the consumer to check if the remaining value was incomplete. This allows someone to use reader.read to "read up to size" and not lose information if there was a partial remainder.
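A minimal sketch of the second option's semantics (this is a hypothetical in-memory reader written for illustration, not pull-reader's actual implementation): `read(size, cb)` hands back whatever partial remainder is left when the source ends short, and only then reports the end.

```javascript
// Hypothetical sketch: a reader over a fixed set of chunks whose
// read(size, cb) flushes the remainder instead of dropping it.
function makeReader (chunks) {
  let buffered = Buffer.concat(chunks) // everything the source will ever emit
  let ended = false
  return {
    read (size, cb) {
      if (ended) return cb(true, undefined) // already drained
      if (buffered.length >= size) {
        const out = buffered.slice(0, size)
        buffered = buffered.slice(size)
        return cb(null, out)
      }
      // source is done but a partial remainder exists: flush it
      // rather than silently losing it
      ended = true
      const remainder = buffered
      buffered = Buffer.alloc(0)
      cb(null, remainder)
    }
  }
}

const reader = makeReader([Buffer.from('hi')])
reader.read(4, function (err, data) {
  console.log(err, data && data.toString()) // null 'hi' -- the remainder
})
reader.read(4, function (err, data) {
  console.log(err, data) // true undefined -- now it reports end
})
```

The consumer can then compare `data.length` against the requested size to decide whether the stream ended cleanly or mid-record.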

@daviddias

The current behaviour seems about right; I tried it by adding another read call:

pull = require('pull-stream')
Reader = require('pull-reader')

pull(
  pull.values([new Buffer('hi')]),
  reader = Reader()
)

reader.read(4, function(err, result){
  console.log(err, result) // true, undefined
})

reader.read(2, function(err, result){
  console.log(err, result) // null, <Buffer 68 69>
})

and it is very similar to standard Node.js streams, where the design is: if you want to read X bytes, it will not return you any bytes until it has X bytes buffered. If it returned the bytes it had (<X), then you would have to create a layer of caching yourself, because the stream can't emit the same data twice, and doing this manually would be cumbersome for wire protocols.

Would knowing the buffer size help solve your case?

@kumavis
Author

kumavis commented Nov 16, 2016

@diasdavid
I'm not really sure how the two reads affect things (do they both read from the start? does the first consume the data before the second?). Anyway, I'm not sure how a second read adds to the situation; it seems to just complicate the example.

It returned the bytes that it had (<X) then you would have to create a layer of caching yourself

My assumption is that pull-reader does currently cache until it has the size. Maybe I'm misunderstanding this step. The issue is that when the source sends an abort signal, pull-reader does not:

  • flush its cache
  • or signal that there was a cache that did not meet the size requirements.

because the stream can't emit the same data twice

I'm not sure how this is relevant?

Would knowing the buffer size help solve your case?

my main goal here is better handling of edge cases where things did not behave as expected

@daviddias

daviddias commented Nov 17, 2016

My assumption is that pull-reader does currently cache until it has the size

That is correct. That is also why you can't get a 'remainder' from a read operation: it is preferable that data only flows when the next step in the pipeline can act on it.

The issue is that when the source sends an abort signal, pull-reader does not:

If the pull.values were a connection, the error that the connection was dropped should be propagated through the pull pipeline, you are not seeing this happening?

my main goal here is better handling of edge cases where things did not behave as expected

We might have a similar need. Handling 'canceled operations' in JS is really hard because of the lack of context (like we have in Go); we've opened this issue https://github.com/ipfs/interface-ipfs-core/issues/58 and are currently exploring the best route.

@kumavis
Author

kumavis commented Nov 17, 2016

If the pull.values were a connection, the error that the connection was dropped should be propagated through the pull pipeline, you are not seeing this happening?

I see what you are saying about not being able to push data twice - it can't return the remainder and also return an error (until the next read is requested).

But imagine the case where there is no error, just the flow ends.

pull(
  pullFile('./data'),
  reader
)

function readNextBytes(cb){
  reader.read(1024, cb)
}

Here we attempt to read chunks of 1kb of data at a time, but we lose the remainder when the file ends. Here are some ideas:

  1. non-standard abort + remainder

// when `end` is `true`, any remaining data is passed on in `data`
// if reading from a 1025 byte file, reader would return:
// (null, 1024 bytes)
// (true, 1 byte)
reader.read(1024, function(end, data){
  if (end instanceof Error) return handleError(end)
  handleData(data)
  if (end) allDone()
})

  2. cache the abort (if it is not an error)

// when the abort signal is received, the next read returns the remaining data,
// and the read after that returns the abort
// an error still takes priority over the remainder
// if reading from a 1025 byte file, reader would return:
// (null, 1024 bytes)
// (null, 1 byte)
// (true, null)
reader.read(1024, function(end, data){
  if (end instanceof Error) return handleError(end)
  if (end) return allDone()
  if (data.length < 1024) return handleError(new Error('App Specific Error - corrupted save file'))
  handleData(data)
})

@dominictarr
Owner

@kumavis I think you are correct. read(len, cb) should call back an error as the first arg, not true.
That error can have a message about how many bytes were read.
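A hedged sketch of that fix (the names and the `bytesRead` property are illustrative, not pull-reader's actual internals): when the stream ends before `size` bytes are available, call back an `Error` that records how much data was buffered, instead of the bare `true`.

```javascript
// Hypothetical error constructor carrying the short-read byte count
function shortReadError (wanted, got) {
  const err = new Error(
    'stream ended before ' + wanted + ' bytes could be read (got ' + got + ')')
  err.bytesRead = got
  return err
}

// Minimal reader over a fixed buffer, erroring instead of returning `true`
function makeReader (buf) {
  let offset = 0
  return {
    read (size, cb) {
      const available = buf.length - offset
      if (available < size) return cb(shortReadError(size, available))
      const out = buf.slice(offset, offset + size)
      offset += size
      cb(null, out)
    }
  }
}

const reader = makeReader(Buffer.from('hi'))
reader.read(4, function (err, data) {
  console.log(err instanceof Error, err.bytesRead) // true 2 -- an Error, not bare `true`
})
```

This keeps the "asking for a length is a protocol contract" semantics from earlier in the thread, while still telling the caller how short the stream fell.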
