Reduce Memory Usage of Matrix Building [Resolves #372]
Matrix building is still very memory-intensive, for no particularly good reason: we're not using the matrices at that point, just transferring them from database to disk with an in-Python join to get around column limits. While we're still using pandas to build the matrices themselves, this is hard to get around: any type of pandas join will always use multiple times the memory needed.

Bringing the memory usage down to what the data actually needs would be an improvement, but it is better still to make the memory usage controllable by never keeping the matrix in memory. Using Ohio's PipeTextIO makes this technically feasible, but to make it work we also need to remove HDF support. HDF support was added merely for its compression capabilities, and with recent changes to compress CSVs, it is no longer needed.

- Remove HDFMatrixStore and HDF support from the experiment and CLI
- Modify MatrixStore.save to take in a bytestream instead of assuming it has a dataframe available to convert
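The idea of saving from a bytestream instead of a dataframe can be sketched with the standard library alone. This is a minimal illustration, not the code in this commit: it does not use Ohio's PipeTextIO, and `csv_chunks`, `IterStream`, and `save_matrix` are hypothetical names chosen for the example (the real `MatrixStore.save` signature is not shown here). The point is only that rows are encoded one at a time and copied to a compressed CSV in fixed-size chunks, so peak memory depends on the chunk size rather than the matrix size.

```python
import csv
import gzip
import io
import itertools
import shutil


def csv_chunks(header, rows):
    """Yield CSV-encoded bytes one row at a time (hypothetical helper).

    `rows` can be any iterable, e.g. a server-side database cursor,
    so the full matrix is never materialized in memory.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for row in itertools.chain([header], rows):
        writer.writerow(row)
        yield buf.getvalue().encode("utf-8")
        # Reset the scratch buffer so it only ever holds one row.
        buf.seek(0)
        buf.truncate(0)


class IterStream(io.RawIOBase):
    """Expose an iterator of bytes chunks as a readable binary stream."""

    def __init__(self, chunks):
        self._chunks = iter(chunks)
        self._leftover = b""

    def readable(self):
        return True

    def readinto(self, buffer):
        # Pull the next non-empty chunk; returning 0 signals end of stream.
        while not self._leftover:
            try:
                self._leftover = next(self._chunks)
            except StopIteration:
                return 0
        n = min(len(buffer), len(self._leftover))
        buffer[:n] = self._leftover[:n]
        self._leftover = self._leftover[n:]
        return n


def save_matrix(bytestream, path, chunk_size=64 * 1024):
    """Copy a readable binary stream into a gzip-compressed CSV on disk.

    Stand-in for a save method that accepts a bytestream; at most
    `chunk_size` bytes are read from the source per copy step.
    """
    with gzip.open(path, "wb") as out:
        shutil.copyfileobj(bytestream, out, chunk_size)
```

A caller would wrap its row source and hand the stream to the save function, for example `save_matrix(IterStream(csv_chunks(header, cursor)), "matrix.csv.gz")`, where `cursor` is assumed to be any iterable of row tuples.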
Showing 14 changed files with 361 additions and 543 deletions.