Move implicit directory repair from list to delete operations #156
Great contribution, thank you!
…Status` methods

Note: this is essentially the same change as in [] that triggered omg/12873 in the past, but it has a feature flag that turns it off by default, and tests that assert the number of GCS requests when parallelism is enabled.

In the worst case, the `getFileStatus` method can make up to 3 sequential requests to GCS to get the status of an implicit directory. After moving implicit directory repair from list to delete/rename operations, this worst case could become more frequent than before, because there is a higher chance of encountering an implicit, non-repaired directory: #156

This CL adds an option to execute these GCS requests in parallel, which could reduce latency by up to 3 times.

Change on 2019/05/13 by idv <idv@google.com>
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=248044068
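To illustrate the parallel lookup described in the commit message above, here is a minimal sketch, not the connector's actual code. The class name `ParallelStatusLookup`, the helper `firstFound`, and the three stand-in lookups (exact object, directory placeholder, prefix listing) are all hypothetical; the commit message says only that up to 3 sequential requests may be needed. The sketch submits every lookup at once and then takes the first non-empty result in priority order, so the total wait is roughly that of the slowest single request rather than the sum of all three.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Hypothetical sketch: names and lookups are illustrative assumptions,
// not part of the GCS connector API.
class ParallelStatusLookup {

  /**
   * Starts all lookups concurrently, then returns the first non-empty
   * result in the order the lookups were given (highest priority first).
   */
  static <T> Optional<T> firstFound(
      List<Supplier<Optional<T>>> lookups, ExecutorService pool) {
    List<CompletableFuture<Optional<T>>> futures = new ArrayList<>();
    for (Supplier<Optional<T>> lookup : lookups) {
      // Every request is submitted before any result is awaited.
      futures.add(CompletableFuture.supplyAsync(lookup, pool));
    }
    for (CompletableFuture<Optional<T>> future : futures) {
      Optional<T> result = future.join();
      if (result.isPresent()) {
        return result;
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newFixedThreadPool(3);
    try {
      // Stand-ins for the (assumed) three requests; real code would call GCS.
      List<Supplier<Optional<String>>> lookups = List.of(
          () -> Optional.empty(),                    // exact object: not found
          () -> Optional.empty(),                    // directory placeholder: not found
          () -> Optional.of("implicit directory"));  // prefix listing: has children
      System.out.println(firstFound(lookups, pool).orElse("not found"));
    } finally {
      pool.shutdown();
    }
  }
}
```

The trade-off is that all requests are always issued, even when the first one alone would have answered the question, which is presumably why the commit keeps the behavior behind a feature flag and asserts the number of GCS requests in tests.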
For details please refer to issue #155.
The automated tests for Hadoop 2 and 3 and the integration tests have all passed cleanly.