How to upload a file in chunks? #1192
Then I thought that maybe this would work:

```javascript
(async () => {
  const googleStorageBucket = googleStorage.bucket('contrawork');
  const file = googleStorageBucket.file('images/test.png');
  const uri = (await file.createResumableUpload())[0];

  // Upload the first chunk at offset 0.
  await googleStorageBucket.upload('./xaa', {
    gzip: false,
    offset: 0,
    predefinedAcl: 'publicRead',
    resumable: true,
    uri,
    validation: false,
  });

  // Write both chunks to a temporary file, then resume
  // the same upload session from offset 500000.
  const temporaryFileName = tmpNameSync();
  fs.writeFileSync(temporaryFileName, Buffer.concat([
    fs.readFileSync('./xaa'),
    fs.readFileSync('./xab'),
  ]));

  await googleStorageBucket.upload(temporaryFileName, {
    gzip: false,
    offset: 500000,
    predefinedAcl: 'publicRead',
    resumable: true,
    uri,
    validation: false,
  });
})();
```

But that just uploads exactly the same file.
There is no error or anything; it seems that the PUT requests are simply ignored. The first request/response:
```json
{
  "kind": "storage#object",
  "id": "contrawork/images/test.png/1589408553615624",
  "selfLink": "https://www.googleapis.com/storage/v1/b/contrawork/o/images%2Ftest.png",
  "mediaLink": "https://storage.googleapis.com/download/storage/v1/b/contrawork/o/images%2Ftest.png?generation=1589408553615624&alt=media",
  "name": "images/test.png",
  "bucket": "contrawork",
  "generation": "1589408553615624",
  "metageneration": "1",
  "storageClass": "NEARLINE",
  "size": "500000",
  "md5Hash": "qCGVMqig4yOiO2yoeRnvNg==",
  "crc32c": "KQhJug==",
  "etag": "CIiC96HwsekCEAE=",
  "timeCreated": "2020-05-13T22:22:33.615Z",
  "updated": "2020-05-13T22:22:33.615Z",
  "timeStorageClassUpdated": "2020-05-13T22:22:33.615Z"
}
```
The second:

```json
{
  "kind": "storage#object",
  "id": "contrawork/images/test.png/1589408553615624",
  "selfLink": "https://www.googleapis.com/storage/v1/b/contrawork/o/images%2Ftest.png",
  "mediaLink": "https://storage.googleapis.com/download/storage/v1/b/contrawork/o/images%2Ftest.png?generation=1589408553615624&alt=media",
  "name": "images/test.png",
  "bucket": "contrawork",
  "generation": "1589408553615624",
  "metageneration": "1",
  "storageClass": "NEARLINE",
  "size": "500000",
  "md5Hash": "qCGVMqig4yOiO2yoeRnvNg==",
  "crc32c": "KQhJug==",
  "etag": "CIiC96HwsekCEAE=",
  "timeCreated": "2020-05-13T22:22:33.615Z",
  "updated": "2020-05-13T22:22:33.615Z",
  "timeStorageClassUpdated": "2020-05-13T22:22:33.615Z"
}
```
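For context on why the second PUT appears to be ignored: the raw resumable-upload protocol identifies each chunk by a `Content-Range` header, and chunks must arrive sequentially. A minimal sketch (the `contentRangeHeader` helper is my own, for illustration, and the 900,000-byte total is an assumption for a "0.9 MB" file) of what the two PUTs would have to send:

```javascript
// Build the Content-Range header value for one chunk of a resumable upload.
// `start` and `end` are inclusive byte offsets; `total` is the full object size.
const contentRangeHeader = (start, end, total) =>
  `bytes ${start}-${end}/${total}`;

const total = 900000; // assumed full size of test.png

const first = contentRangeHeader(0, 499999, total);       // for chunk ./xaa
const second = contentRangeHeader(500000, 899999, total); // for chunk ./xab

console.log(first);  // bytes 0-499999/900000
console.log(second); // bytes 500000-899999/900000

// Note: every non-final chunk must be a multiple of 262,144 bytes,
// and 500,000 is not one:
console.log(500000 % 262144); // 237856, i.e. not aligned
```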
Thanks for the detailed write-up. As it stands, we aren't currently able to handle this. There are a few potential solutions I can think of. My thoughts:
@frankyn is there a better way to handle this scenario? Also, please correct any mistakes I may have made in my breakdown above.
The main issue with using resumable uploads in this scenario is that they're sequential, meaning you can't set the offset to an arbitrary position from different servers. Your best bet will be to use @stephenplusplus's first suggestion using
The nice part of this design is that your chunks can all be uploaded to GCS from different services without having to deal with ordering in terms of offsets. You'd most likely need to order them by object name. GCS will handle concatenating the objects in atomic operations and produce the resulting object.
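A sketch of that name-ordering idea: pad the chunk index so lexicographic order matches upload order, then hand the sorted names to the compose operation. The `chunk-` naming scheme is my own invention for illustration; the final call shown in comments uses the Node.js client's `bucket.combine`, which wraps the GCS compose API.

```javascript
// Zero-pad chunk indices so lexicographic order equals numeric order,
// letting any server name its chunk independently.
const chunkObjectName = (base, index) =>
  `${base}.chunk-${String(index).padStart(5, '0')}`;

const sources = [0, 1]
  .map((i) => chunkObjectName('images/test.png', i))
  .sort();

console.log(sources);
// [ 'images/test.png.chunk-00000', 'images/test.png.chunk-00001' ]

// With the actual client this would look like (not run here; needs credentials):
// const {Storage} = require('@google-cloud/storage');
// const bucket = new Storage().bucket('contrawork');
// await bucket.combine(sources, bucket.file('images/test.png'));
```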
Doesn't this incur a significant additional cost?
This would be easy to support if the (apparently arbitrary) 262,144-byte minimum chunk restriction did not exist. With that restriction in place, it is impossible to build this functionality. The use case is a distributed file-upload system.
Because there's nothing we can do from our library, I'm going to close the issue. If anyone would like to chime in with more information or ideas, please feel free.
A lot has been going on and I got behind on these issues. @stephenplusplus, is it possible to make the chunk size modifiable? IIUC, the 256 KB multiple is a performance guideline, but chunks can be smaller. If @gajus is willing to implement the necessary distributed file handling without a compose, it could be helpful in this case.
For a bit of context, this is what I was developing: https://github.com/gajus/express-tus. The limitation on the minimum per-chunk upload size made it impossible to use with Google Storage in a distributed system. The only way to make it work is to first upload the file to our local storage and then upload it to Google Storage, which is far from perfect (because now the user needs to wait roughly 2x the upload time).
Sorry @frankyn, I missed this tag. The 256 KB limit isn't from this library; it comes from the upstream API.
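To stay within that upstream constraint, every chunk except the last has to be a multiple of 262,144 bytes (256 KiB). A small helper (my own, purely for illustration) that rounds a desired chunk size down to the nearest valid size:

```javascript
const CHUNK_GRANULARITY = 262144; // 256 KiB, imposed by the upstream API

// Largest valid chunk size not exceeding `desired` (never below one granule).
const validChunkSize = (desired) =>
  Math.max(
    CHUNK_GRANULARITY,
    Math.floor(desired / CHUNK_GRANULARITY) * CHUNK_GRANULARITY,
  );

console.log(validChunkSize(500000)); // 262144 -> 500,000 is not itself valid
console.log(validChunkSize(524288)); // 524288 (exactly 2 * 256 KiB)
console.log(validChunkSize(1000));   // 262144 (clamped up to the minimum)
```

This is why the 500,000-byte chunks in the original report cannot work as non-final chunks of a resumable session.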
Hi @danielduhh, |
I am losing my mind here: I think I've tried absolutely everything and I still cannot figure out the proper way to do this.
Suppose I have a 0.9MB file.
I then split that file into 2 chunks, `xaa` and `xab`:
Assuming I only have access to the resulting chunks (and they might be on different servers, i.e. the uploads must be two distinct operations), what is the correct way to upload test.png?
Here is what I tried:
However, this uploads only the first chunk. The second file is never appended.
images_test.png.zip