To zip or not to zip #2107

Closed
mercecrosas opened this issue Apr 28, 2015 · 8 comments

Comments

@mercecrosas
Member

This is the question: some datasets would benefit from keeping a zip file (or a similarly packaged or compressed file) that holds all the files associated with the dataset. One example is a set of thousands of image files generated by an instrument, which need to be viewed together in a 3D viewer but have little meaning individually. In this case, perhaps we can generate a detailed manifest file describing what is in the zip file (as part of the metadata for that file, for example).
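
As a rough illustration of the manifest idea, the sketch below reads a zip archive's table of contents without extracting anything. The function names and the manifest fields (`path`, `size_bytes`, `crc32`) are assumptions chosen for this example; they are not an existing Dataverse feature or metadata schema.

```python
import io
import zipfile


def build_manifest(zip_bytes: bytes) -> list[dict]:
    """Return one entry per file in the archive, without extracting it."""
    manifest = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip directory entries, keep only files
            manifest.append({
                "path": info.filename,             # path inside the archive
                "size_bytes": info.file_size,      # uncompressed size
                "crc32": format(info.CRC, "08x"),  # checksum stored in the zip
            })
    return manifest


def make_example_zip() -> bytes:
    """Build a tiny in-memory archive standing in for instrument output."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("scan/slice_0001.tif", b"fake image data 1")
        archive.writestr("scan/slice_0002.tif", b"fake image data 22")
    return buffer.getvalue()


if __name__ == "__main__":
    for entry in build_manifest(make_example_zip()):
        print(entry)
```

A manifest like this could be stored alongside the packaged file so users can see what it contains before downloading it.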

If we allow this, we also need to make clear that in many cases unzipping the file is the better option: for preservation, for giving users more detailed information, and for taking advantage of additional features associated with individual files (TwoRavens, WorldMap, etc.).

@mercecrosas mercecrosas added this to the In Review - Short Term milestone Apr 28, 2015
@eaquigley
Contributor

Some initial thoughts on this:

If users should usually have their zips unzipped, do we want this to be a setting one selects when uploading files, or would it make more sense as a setting at the dataverse level? I could see it being very annoying to ask users each time whether they want the zipped file unzipped or kept zipped.

@posixeleni
Contributor

We have another use case: a user has a hierarchical file structure in the zip file, which is flattened when the zip file is unpacked, and they want to preserve it.

So how about another workaround, similar to what we do for Stata and SPSS files: we store the original zip version of the file but also unpack it (automatically, so the user doesn't even have to think about it)? Of course, this workflow would make more sense when there are not thousands of files to unpack.
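
To make the flattening concern concrete, here is a minimal sketch that compares the entry names of an archive with and without its directory structure. It only models flattening as "keep the base name of each entry", which is an assumption for illustration and not a description of how Dataverse unpacks files; `list_entries` and `make_nested_zip` are hypothetical helpers.

```python
import io
import posixpath
import zipfile


def list_entries(zip_bytes: bytes, flatten: bool) -> list[str]:
    """List file names in the archive, optionally dropping their directories."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        names = [i.filename for i in archive.infolist() if not i.is_dir()]
    if flatten:
        # Zip entry names always use "/" as the separator, so posixpath is safe here.
        return [posixpath.basename(name) for name in names]
    return names


def make_nested_zip() -> bytes:
    """Build an in-memory archive with the same file name in two folders."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("run1/data.csv", "a,b\n1,2\n")
        archive.writestr("run2/data.csv", "a,b\n3,4\n")
    return buffer.getvalue()


if __name__ == "__main__":
    nested = make_nested_zip()
    print(list_entries(nested, flatten=False))
    print(list_entries(nested, flatten=True))
```

With the hierarchy removed, the two entries end up with the same name and can no longer be told apart, which is one reason a user might want to keep the original zip.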

@babrahamse

I am a relatively new user of the Dataverse (I'm handling MIT's data uploading and metadata) and I would just like to reiterate the request to make the auto-unzipping feature optional. We have several datasets that are already in ZIP format and we want to keep them that way. Having to double-zip everything, while a functional kludge, really adds a lot of time to the uploading operation.
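
For readers unfamiliar with the double-zip workaround mentioned here, the sketch below wraps an existing zip inside an outer zip. It assumes that only the outer layer is unpacked on upload, as the comments in this thread suggest, so the inner archive survives intact; `wrap_zip` and `make_inner_zip` are hypothetical names for this example.

```python
import io
import zipfile


def wrap_zip(inner_zip_bytes: bytes, inner_name: str) -> bytes:
    """Place an existing zip, unchanged, inside a new outer zip."""
    buffer = io.BytesIO()
    # ZIP_STORED avoids spending time recompressing already-compressed data.
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_STORED) as outer:
        outer.writestr(inner_name, inner_zip_bytes)
    return buffer.getvalue()


def make_inner_zip() -> bytes:
    """Build a small in-memory archive standing in for a real dataset zip."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("images/frame_001.png", b"placeholder bytes")
    return buffer.getvalue()


if __name__ == "__main__":
    inner = make_inner_zip()
    outer = wrap_zip(inner, "images.zip")
    with zipfile.ZipFile(io.BytesIO(outer)) as archive:
        print(archive.namelist())
```

Even without recompression, every archive still has to be read and rewritten once, which is consistent with the extra upload time described above.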

@mheppler
Contributor

mheppler commented Sep 1, 2016

Related to #2249 #3247

@amberleahey

Hi folks! I know this post is old but I wanted to chime in and ask if there are plans to add the option to upload a .zip and NOT unpack? This would allow authors to choose to upload and retain zip if they wanted, maybe it would be default, but at least present the option. I think the inability to do so at the moment means users are double zipping and creating tar zip packages to get around it. We noticed this in our instance anyway. Any thoughts on reviving this convo?

@djbrooke
Contributor

Hey @amberleahey - we'll address a lot of the need for the double zip workaround by providing better support for maintaining (implemented in 4.11) and editing (to be implemented in #5565) file hierarchies.

For those cases where the need for keeping something zipped is not based on structure, I don't currently have a good answer. There's some potential in package files (implemented as part of large file support), but we haven't worked much in that area since it was originally implemented. Thoughts welcome!

@pdurbin
Member

pdurbin commented Feb 27, 2019

> the option to upload a .zip and NOT unpack

@amberleahey #3439 is what you want, I think. There's even a recent (closed and unmerged) pull request at #5396 if you have a developer who wants to take a look.

@djbrooke
Contributor

Closing this in favor of #3439.


9 participants