Stream Reader: event to detect first byte of each chunk added to internal buffer #1126
Comments
The streams API for non-byte streams treats chunks as atomic units, so there's no concept of a "first byte". Eventually I expect fetch will use byte streams instead, but even then, the "time to first byte" will just be the same as the "time to first chunk". I don't know about other browsers, but in Chrome we handle network input in chunks anyway, so there really isn't a concept of "first byte" distinct from "first chunk" (despite what the resource timing API may imply). What this means for measuring throughput is that
@ricea thanks for the quick response! Yes, the point of this issue is precisely the lack of an event signaling the start of each chunk's transfer over the network. Without such information, correct throughput measurement seems impossible. Let me provide a minimal self-contained code example to showcase the problem.

Node.js script to produce chunked-transfer data and serve the index.html below:

const http = require('http');
const fs = require('fs');
let index = '';
fs.readFile('index.html', (err, data) => {
if (err) {
throw err;
}
index = data.toString();
});
const hostname = '127.0.0.1';
const port = 3000;
async function produceData() {
return new Promise(resolve => {
setTimeout(resolve.bind(this, Buffer.alloc(1024)), 100);
});
}
const server = http.createServer(async (req, res) => {
res.statusCode = 200;
if (req.url !== '/data') {
res.setHeader('Content-Type', 'text/html');
res.end(index);
return;
}
res.setHeader('Content-Type', 'application/octet-stream');
res.setHeader('Transfer-Encoding', 'chunked');
for (let iter = 0; iter < 100; iter++) {
const chunk = await produceData();
res.write(chunk);
}
res.end();
});
server.listen(port, hostname, () => {
console.log(`Server running at http://${hostname}:${port}/`);
});

HTML (index.html):

<!DOCTYPE html>
<html lang="en">
<head>
<title>fetch api - stream reader - throughput check</title>
</head>
<body>
please open dev tools
<script>
fetch('./data')
.then(response => response.body)
.then(body => {
const reader = body.getReader();
let timeMark = Date.now();
let timeSum = 0;
let byteSum = 0;
let chunkCount = 0;
function pump() {
return reader.read().then(({ done, value }) => {
if (done) {
console.log(`got all chunks. Prize question: what is the actual network throughput? ${byteSum / (timeSum / 1000)} bytes per second does not seem right!`)
return;
}
console.log(`got ${++chunkCount}. chunk with ${value.byteLength} bytes, in ${Date.now() - timeMark} ms`);
byteSum += value.byteLength;
timeSum += (Date.now() - timeMark);
timeMark = Date.now();
return pump();
});
}
pump();
})
</script>
</body>
</html>

If you start the Node script above and navigate to http://127.0.0.1:3000/, the dev tools console logs each chunk's size and the time since the previous chunk.
So, did I misunderstand something, or is it simply impossible to measure correct network throughput with fetch? Side note: for more clarity I've changed the issue title and the desired event name to 'chunkTransferStarted'.
It's worse than that: it's impossible to measure correct network throughput in the browser at all. The browser cannot distinguish between slowness caused by the network and slowness caused by the origin server.
The reason for the slowness is not the issue. It can even be mixed: slow production at the source AND network limitations on the way to the client. But what we really need is to measure the actual throughput on the client during transfer, excluding the "idle" times between the chunks.
The browser has no way to distinguish between "idle" and "slow". They look the same.
As far as I know, this is indeed impossible. I work for a company that builds online video player solutions. In recent years, there has been huge interest in the industry in low-latency live streaming. In such streams, the origin server announces the availability of the next audio/video segment before that segment is fully complete. A low-latency player can already send the request for that segment and start downloading it while it is still being generated. However, this makes it difficult to do accurate network bandwidth estimations (for adaptive bitrate switching). The player is no longer continuously downloading at the full "line speed" of its network; instead, it receives "bursts" of data as the segment is being generated. A naive implementation that does not take these "bursts" into account would conclude that the bandwidth estimate is always less than or equal to the segment's bitrate. That is: if the player is downloading a 2 Mbps video segment over a 10 Mbps link, it would incorrectly estimate that the network bandwidth is 2 Mbps, and never attempt to switch up to a higher video quality (with a higher bitrate). This makes for a poor viewer experience. The state of the art is to try to detect which chunks were received without any delay between them (i.e. which are part of the same "burst"), and only estimate the bandwidth across those chunks. For example, ACTE does this. From their paper:
I don't know what your specific use case is, but perhaps it's close enough that you can borrow some ideas from low-latency video streaming and ACTE? 🙂
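For illustration, the burst-filtering idea can be sketched roughly as follows. This is a minimal sketch under my own assumptions, not the ACTE paper's algorithm: the 50 ms gap threshold and the { t, bytes } record shape are invented for the example.

```javascript
// Estimate bandwidth only across chunks that arrived back-to-back
// ("bursts"), ignoring idle gaps caused by the origin server.
// `samples` is an array of { t: arrivalTimeMs, bytes } records, one per
// chunk, sorted by arrival time. `gapMs` is an assumed threshold: a larger
// inter-arrival gap is treated as idle time and starts a new burst.
function burstBandwidthBps(samples, gapMs = 50) {
  let bytes = 0;
  let durationMs = 0;
  for (let i = 1; i < samples.length; i++) {
    const dt = samples[i].t - samples[i - 1].t;
    if (dt <= gapMs) {
      // Chunk i arrived within the same burst as chunk i-1: count its
      // bytes and the time it took to arrive.
      bytes += samples[i].bytes;
      durationMs += dt;
    }
  }
  if (durationMs === 0) return null; // no usable bursts observed
  return (bytes * 8 * 1000) / durationMs; // bits per second
}

// Two bursts of back-to-back 1250-byte chunks, separated by a 1-second
// idle gap: the estimate ignores the gap entirely.
const samples = [
  { t: 0, bytes: 1250 },
  { t: 10, bytes: 1250 },
  { t: 20, bytes: 1250 },
  { t: 1020, bytes: 1250 },
  { t: 1030, bytes: 1250 },
];
console.log(burstBandwidthBps(samples)); // 1000000 (1 Mbps)
```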
@ricea that was the motivation for raising this issue. On the client side it should be possible to detect whether transmission is ongoing or idle. My hope (and recommendation/suggestion) is that this missing piece, in the form of an event (see below), will be added to the Streams spec. @MattiasBuelens Thank you for confirming the current (not very satisfying) situation. Our use case is exactly the one you mentioned ;) The group I'm working for is the current maintainer of dash.js. I know the paper you mentioned very well; interesting work. However, this approach fails at throughput estimation in my simple code example above, since all chunks are equal in size. This is the case when an encoder produces chunks at equidistant times, which is very likely in ULL in my opinion.
but in fact they mean the fetch API / Streams API, as the HTTP/1.1 standard specifies the size of the chunk to be sent [1]. So why not fix this Streams spec issue to allow for simple and exact measurement in the future? An event announcing the start of chunk transmission reduces the problem back to the simple formula transferred_bits / transmission_duration. Interestingly, in Node.js such an event has existed for a while; it is called readable. Here is a Node.js example consumer that allows for exact throughput measurement without the need for any sophisticated calculations or predictions (make sure the sender from the example above, #1126 (comment), is running):

const http = require('http');
let timeMark = Date.now();
let chunkCount = 0;
http.get('http://localhost:3000/data', (res) => {
// https://nodejs.org/dist/latest-v14.x/docs/api/stream.html#stream_event_readable
res.on('readable', () => {
console.log(`readable`);
timeMark = Date.now();
res.read();
});
res.on('data', (chunk) => {
console.log(`got ${++chunkCount}. chunk with ${chunk.length} bytes, in ${Date.now() - timeMark} ms`);
});
res.on('end', () => {
console.log('got all chunks');
});
}).on('error', (e) => {
console.error(`Got error: ${e.message}`);
});

The result is what we expect and desire:
Making the equivalent of Node's readable event part of the Streams spec would enable this.

[1] https://datatracker.ietf.org/doc/html/rfc7230#section-4.1
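If per-chunk transfer start/end times were available, the computation really would reduce to transferred_bits / transmission_duration. A minimal sketch under that assumption; the { startMs, endMs, bytes } record shape is hypothetical, since no browser API exposes it today:

```javascript
// Hypothetical records: transfer start/end time and size of each HTTP
// chunk, as a 'chunkTransferStarted'-style event could provide them.
function throughputBps(records) {
  let bits = 0;
  let busyMs = 0; // time actually spent receiving; idle gaps excluded
  for (const { startMs, endMs, bytes } of records) {
    bits += bytes * 8;
    busyMs += endMs - startMs;
  }
  if (busyMs === 0) return null;
  return (bits * 1000) / busyMs; // bits per second
}

// Two 1250-byte chunks, each received in 100 ms, with 900 ms of idle
// time in between: the idle time does not distort the estimate.
const records = [
  { startMs: 0, endMs: 100, bytes: 1250 },
  { startMs: 1000, endMs: 1100, bytes: 1250 },
];
console.log(throughputBps(records)); // 100000 (100 kbps)
```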
I think you misunderstood my point, which is that it is not possible at all. No API change can make it possible. The information is simply not available to the client.

This Node.js code doesn't measure anything meaningful, and certainly not the speed of the network. It's basically just a benchmark of how fast Node can emit the "data" event.
After some hours of reading the specs, it looks like there is no way to get an indication of the point in time when the first byte of a transferred chunk was added to the internal buffer of a stream reader.

The use case for this is, for example, the measurement of throughput in bursty chunked transfers with idle times between the HTTP chunks when using the fetch API.

Example from the spec how it IS:

Example how it could be (or in some other comparable way) to enable the above use case:

Or is there some other way (preferably already supported by browsers) to achieve this desired measurement with fetch?