Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

byob reader support for Blob.stream() #47993

Closed
jimmywarting opened this issue May 13, 2023 · 13 comments · Fixed by #49713
Closed

byob reader support for Blob.stream() #47993

jimmywarting opened this issue May 13, 2023 · 13 comments · Fixed by #49713

Comments

@jimmywarting
Copy link

Version

all

Platform

No response

Subsystem

No response

What steps will reproduce the bug?

new Blob([1,2]).stream().getReader({mode: 'byob'})

How often does it reproduce? Is there a required condition?

every time

What is the expected behavior? Why is that the expected behavior?

get a byob reader back

What do you see instead?

Uncaught:
TypeError [ERR_INVALID_ARG_VALUE]: The argument 'stream' must be a byte stream. Received ReadableStream { locked: false, state: 'readable', supportsBYOB: false }
    at __node_internal_captureLargerStackTrace (node:internal/errors:490:5)
    at new NodeError (node:internal/errors:399:5)
    at setupReadableStreamBYOBReader (node:internal/webstreams/readablestream:2149:11)
    at new ReadableStreamBYOBReader (node:internal/webstreams/readablestream:912:5)
    at ReadableStream.getReader (node:internal/webstreams/readablestream:350:12) {
  code: 'ERR_INVALID_ARG_VALUE'
}

Additional information

Works in Firefox, Deno

@debadree25
Copy link
Member

Hey! so I was trying to look around to solve this issue and while writing tests for it I noticed a weird behavior of node in comparison to browsers:

take the following script:

async function fn() {
  const b = new Blob([1, 2, 3, 4, 5, 5, 6, 6]).stream().getReader();

  const chunks = [];

  while (true) {
    const { done, value } = await b.read();
    if (done) break;
    chunks.push(value);
    console.log('read value', value);
  }

  console.log(chunks);
}

await fn();

If executed in browsers (tested on chrome and firefox) the output looks like this:

Screenshot 2023-05-14 at 7 00 26 PM

Whereas when trying this with node v20.1.0

debadreechatterjee@Debadree-MacBook-Pro node % node test.mjs
read value Uint8Array(1) [ 49 ]
debadreechatterjee@Debadree-MacBook-Pro node %

it appears the promise hangs which leads to the process exiting, is this a bug or is my script something wrong

@debadree25
Copy link
Member

debadree25 commented May 14, 2023

Ok interestingly if we call blob.arrayBuffer before creating the stream this works fine (edit: this also seems flakey)

@KhafraDev
Copy link
Member

Sounds like a bug cc @jasnell

The fix for this issue should be as simple as setting type: 'bytes' here

return new lazyReadableStream({

@debadree25
Copy link
Member

Not really we have to change the way we respond too no? I was trying out something like this

I was trying something like this

diff --git a/lib/internal/blob.js b/lib/internal/blob.js
index ee8e1c7581..47face7a93 100644
--- a/lib/internal/blob.js
+++ b/lib/internal/blob.js
@@ -321,6 +321,7 @@ class Blob {
 
     const reader = this[kHandle].getReader();
     return new lazyReadableStream({
+      type: 'bytes',
       start(c) {
         // There really should only be one read at a time so using an
         // array here is purely defensive.
@@ -353,7 +354,20 @@ class Blob {
             return;
           }
           if (buffer !== undefined) {
-            c.enqueue(new Uint8Array(buffer));
+            if (c.byobRequest) {
+              const byobRequestView = c.byobRequest.view;
+              const byteLength = buffer.byteLength;
+              const viewByteLength = byobRequestView.byteLength;
+              const viewByteOffset = byobRequestView.byteOffset;
+              for (let i = 0; i < byteLength;) {
+                const chunk = new Uint8Array(buffer.slice(i, i + viewByteLength));
+                byobRequestView.set(chunk, viewByteOffset);
+                i += viewByteLength;
+                c.byobRequest.respond(viewByteLength);
+              }
+            } else {
+              c.enqueue(new Uint8Array(buffer));
+            }
           }
           pending.resolve();
         });

but this causes some error with detached array buffers which i dont yet understand 😅

@jimmywarting
Copy link
Author

When you enqueue a uint8array then the underlying ArrayBuffer gets detached.
And you can no longer use that ArrayBuffer for anything else.

ab = new Uint8Array([97, 98]).buffer

chunk1 = new Uint8Array(ab, 0, 1)
chunk2 = new Uint8Array(ab, 1, 1)

rs = new ReadableStream({
  type: 'bytes',
  start(ctrl) {
    console.log(chunk1.byteLength, ab.byteLength) // 1, 2
    ctrl.enqueue(chunk1)
    console.log(chunk1.byteLength, ab.byteLength) // 0, 0
    
    // if you now try to enqueue chunk2 then it will fail
    // ctrl.enqueue(chunk2)
    ctrl.close()
  }
})

into = new Uint8Array(2)
rs.getReader({mode: 'byob'}).read(into)

It's as if you have transfered the arraybuffer to another thread using postMessage(ab, [ ab ])
the arraybuffer byteLengths gets set to 0

@debadree25
Copy link
Member

Ah understood so what could be a work around for the byob mode? respondWithNewView maybe?

@jimmywarting
Copy link
Author

Haven't investigated the source code so much.
but i had one issue in earlier NodeJS version where it first read everything as a hole arrayBuffer
and then the ReadableStream would just slice them into smaller views with subarray so it was totally unexpected when i receive a byteOffset other then 0
and that each chunk.buffer where the same instance.
that was abnormal

so if you are for some reason reusing the same ArrayBuffer twice then that might be an issue.

anyhow. i'm not sure if you should enqueue a uint8array anyway. i suppose you are going to use controller.byobRequest.respond(bytesRead)

the stream spec have a good turtorial on how to use files + byte streams here
https://streams.spec.whatwg.org/#example-rbs-pull

const fs = require("fs").promises;
const DEFAULT_CHUNK_SIZE = 1024;

function makeReadableByteFileStream(filename) {
 let fileHandle;
 let position = 0;

  return new ReadableStream({
    type: "bytes",

    async start() {
      fileHandle = await fs.open(filename, "r");
    },

    async pull(controller) {
      // Even when the consumer is using the default reader, the auto-allocation
      // feature allocates a buffer and passes it to us via byobRequest.
      const v = controller.byobRequest.view;

      const { bytesRead } = await fileHandle.read(v, 0, v.byteLength, position);
      if (bytesRead === 0) {
        await fileHandle.close();
        controller.close();
        controller.byobRequest.respond(0);
      } else {
        position += bytesRead;
        controller.byobRequest.respond(bytesRead);
      }
    },

    cancel() {
      return fileHandle.close();
    },

    autoAllocateChunkSize: DEFAULT_CHUNK_SIZE
  });
}

@debadree25
Copy link
Member

Ah got it! i think responding multiple times by slicing the array buffer wont work, i think this would need changes on c++ side too, trying to investigate thank you for the inputs!!

@jimmywarting
Copy link
Author

jimmywarting commented May 14, 2023

I don't see why slicing the arraybuffer would not work. you are creating a copy after all. but doing so this way would probably also indicate that you probably are also are doing something wrong in the first place... it would be better to read into an existing arraybuffer view instead

@debadree25
Copy link
Member

Oh ok @KhafraDev was right it seems to be as simple as adding type: 'bytes' heh I went of on unnecessary tangent but it seems the promise hang issue is what needs to be looked into

@jimmywarting
Copy link
Author

Oh ok @KhafraDev was right it seems to be as simple as adding type: 'bytes' heh I went of on unnecessary tangent but it seems the promise hang issue is what needs to be looked into

i tough so too.
i don't know how the stream spec works exactly. but maybe if you ctrl.enqueue() something in a bytes stream
then the stream will do the controller.byobRequest.respond(x) automatically for you (internally). as well as setting the byobRequestView to what you enqueued.

@debadree25
Copy link
Member

Hey so I opened a PR to fix the hanging issue I pointed out in #47993 (comment) here #48232 but as for adding 'bytes' support there seems like we have another issue #48233

nodejs-github-bot pushed a commit that referenced this issue Jun 11, 2023
Refs: #47993 (comment)
PR-URL: #48232
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
RafaelGSS pushed a commit that referenced this issue Jul 3, 2023
Refs: #47993 (comment)
PR-URL: #48232
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Ceres6 pushed a commit to Ceres6/node that referenced this issue Aug 14, 2023
Refs: nodejs#47993 (comment)
PR-URL: nodejs#48232
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Ceres6 pushed a commit to Ceres6/node that referenced this issue Aug 14, 2023
Refs: nodejs#47993 (comment)
PR-URL: nodejs#48232
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
debadree25 added a commit to debadree25/node that referenced this issue Sep 19, 2023
@debadree25

This comment was marked as outdated.

nodejs-github-bot pushed a commit that referenced this issue Sep 22, 2023
Fixes: #47993
PR-URL: #49713
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
ruyadorno pushed a commit that referenced this issue Sep 28, 2023
Fixes: #47993
PR-URL: #49713
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
alexfernandez pushed a commit to alexfernandez/node that referenced this issue Nov 1, 2023
Fixes: nodejs#47993
PR-URL: nodejs#49713
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants