GoogleCloudStorageFileSystem#delete recursive does not page #1022

Open
mswintermeyer opened this issue Jun 28, 2023 · 0 comments

GoogleCloudStorageFileSystem#delete assumes that the full list of files it is deleting fits in memory. When deleting a very large directory recursively, it does not delete one page at a time; instead it loads every entry into a List with a single call to `listFileInfoForPrefix(fileInfo.getPath(), DELETE_RENAME_LIST_OPTIONS)`. There appears to be a `listFileInfoForPrefixPage` method that it could use instead, calling it iteratively until all files are deleted.
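
To make the suggestion concrete, here is a minimal, self-contained sketch of a page-at-a-time delete loop. It does not use the connector's real classes: `Page`, `PagedLister`, `InMemoryStore`, and `deleteRecursively` are hypothetical names invented for illustration, with `PagedLister.listPage` standing in for something like `listFileInfoForPrefixPage`. The in-memory store assumes the page token is simply the last name returned, which may not match how the real token behaves.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import java.util.function.Consumer;

public class PagedDeleteSketch {

  /** Hypothetical page of results; nextPageToken is null when no pages remain. */
  static final class Page {
    final List<String> items;
    final String nextPageToken;

    Page(List<String> items, String nextPageToken) {
      this.items = items;
      this.nextPageToken = nextPageToken;
    }
  }

  /** Hypothetical stand-in for a paged listing call such as listFileInfoForPrefixPage. */
  interface PagedLister {
    Page listPage(String prefix, String pageToken);
  }

  /** Deletes everything under prefix one page at a time, so only one page is held in memory. */
  static long deleteRecursively(
      String prefix, PagedLister lister, Consumer<List<String>> batchDeleter) {
    long deleted = 0;
    String token = null;
    do {
      Page page = lister.listPage(prefix, token);
      if (!page.items.isEmpty()) {
        batchDeleter.accept(page.items);
        deleted += page.items.size();
      }
      token = page.nextPageToken;
    } while (token != null);
    return deleted;
  }

  /** In-memory object store for the demo (assumption: token = last name returned). */
  static final class InMemoryStore implements PagedLister {
    final TreeSet<String> objects = new TreeSet<>();
    final int pageSize;
    int maxPageSeen = 0;

    InMemoryStore(int pageSize) {
      this.pageSize = pageSize;
    }

    @Override
    public Page listPage(String prefix, String pageToken) {
      boolean firstPage = pageToken == null;
      String start = firstPage ? prefix : pageToken;
      List<String> items = new ArrayList<>();
      String last = null;
      boolean more = false;
      // Names are sorted, so all names sharing the prefix are contiguous.
      for (String name : objects.tailSet(start, firstPage)) {
        if (!name.startsWith(prefix)) {
          break;
        }
        if (items.size() == pageSize) {
          more = true;
          break;
        }
        items.add(name);
        last = name;
      }
      maxPageSeen = Math.max(maxPageSeen, items.size());
      return new Page(items, more ? last : null);
    }

    void deleteAll(List<String> names) {
      objects.removeAll(names);
    }
  }

  public static void main(String[] args) {
    InMemoryStore store = new InMemoryStore(100);
    for (int i = 0; i < 1050; i++) {
      store.objects.add(String.format("dir/file-%05d", i));
    }
    store.objects.add("other/keep");
    long deleted = deleteRecursively("dir/", store, store::deleteAll);
    System.out.println(deleted + " deleted, " + store.objects.size() + " remaining");
  }
}
```

The point of the loop is that memory use is bounded by the page size rather than by the directory size. A real change would also need to decide how to handle failures partway through and whether directory placeholder objects must be deleted last, which this sketch ignores.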

By contrast, Hadoop's S3A deletion code uses an iterator to delete directories recursively: https://github.com/apache/hadoop/blob/4bd873b816dbd889f410428d6e618586d4ff1780/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/DeleteOperation.java#L244-L246
