CSVs take up too much space #498
After talking with @nanounanue we decided to repurpose this: CSVs should be compressed by default. The current situation makes us want to use HDFs all the time even when we won't gain anything from them. Compressing CSVs would make it work for more use cases and make HDF-S3 support unnecessary.
👍 👍 I'm not positive that compressed CSV will make this 100% unnecessary (but yeah, it certainly might!). Regardless, it shouldn't be a ton of work to read and write CSVs with GZIP compression by default, and I absolutely agree that that should be done. We're just talking about something like:
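(The original snippet wasn't preserved here; a minimal sketch of the idea using only the standard library's `gzip` and `csv` modules, with a made-up filename:)

```python
import csv
import gzip

# Writing: gzip.open in text mode ("wt") wraps the compressed stream,
# so csv.writer can be used exactly as it would on a plain file.
with gzip.open("matrix.csv.gz", "wt", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["entity_id", "feature_1"])
    writer.writerow([1, 0.5])

# Reading is symmetric: open in "rt" mode and hand the stream to csv.reader.
with gzip.open("matrix.csv.gz", "rt", newline="") as f:
    rows = list(csv.reader(f))

print(rows)  # [['entity_id', 'feature_1'], ['1', '0.5']]
```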
…Which, with S3Fs for example, just becomes:
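(Again, the original block is missing from this copy; something along these lines, with a hypothetical bucket and key — s3fs file objects are binary, so the gzip stream is layered on top:)

```python
import gzip
import s3fs  # third-party: pip install s3fs

# Credentials are resolved from the environment (AWS config, IAM role, etc.).
fs = s3fs.S3FileSystem()

# Write a gzipped CSV directly to S3 by wrapping the s3fs file object.
with fs.open("my-bucket/matrix.csv.gz", "wb") as raw:
    with gzip.GzipFile(fileobj=raw, mode="wb") as f:
        f.write(b"entity_id,feature_1\n1,0.5\n")
```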
(Simply taken from the S3Fs docs.)
- Make the CSVMatrixStore use compression and rename files to csv.gz
HDF matrix data must be present on the local file system. However, as an experimenter, I might like to be able to specify an S3 file path, with the expectation that this data will be downloaded from and uploaded to S3 as necessary, on my behalf.
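That fetch-on-demand behavior could be sketched as a thin helper (the name `ensure_local` and the scratch-directory scheme are hypothetical, not triage's actual API):

```python
import os
import tempfile
from urllib.parse import urlparse

def ensure_local(path):
    """Return a local filesystem path for `path`, downloading it from S3
    first when the path uses the s3:// scheme. Hypothetical helper."""
    if urlparse(path).scheme != "s3":
        return path  # already on the local filesystem
    import s3fs  # third-party; credentials come from the environment
    fs = s3fs.S3FileSystem()
    local = os.path.join(tempfile.mkdtemp(), os.path.basename(path))
    fs.get(path, local)  # download into a scratch directory
    return local
```

Uploading after a write would be the mirror image (`fs.put(local, path)` when the original path was remote).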