[Remote Store] Support to emit multiple streams for a file content each responsible for processing a specific part of the file #7000

Merged

Conversation

raghuvanshraj
Contributor

@raghuvanshraj raghuvanshraj commented Apr 5, 2023

Credits: @vikasvb90 and @itiyamas for the design and core implementations of the feature.

Description

  • Offset-based InputStream extensions that emit a stream for a specific part of a file: each stream starts reading at a given position and never reads more than a specified number of bytes. OffsetRangeFileInputStream does this for File objects, and OffsetRangeIndexInputStream does it for Lucene's IndexInput.
  • RemoteTransferContainer provides utilities to open streams over specific parts of a file, using whichever of the stream types above matches the source. It also manages post-upload tasks by implementing an UploadFinalizer.
  • ResettableCheckedInputStream lets an individual part be rewound through mark and reset if its upload fails.
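
To make the idea in the list above concrete, here is a minimal sketch of an offset-bounded stream over a file that also supports mark and reset. It is an illustration only, not the code added by this PR: the names BoundedRangeStreamSketch, BoundedRangeInputStream, and readPart are hypothetical, and the sketch assumes a plain RandomAccessFile as the source rather than Lucene's IndexInput.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch; these are NOT the classes introduced by the PR. */
final class BoundedRangeStreamSketch {

    /** Reads at most {@code length} bytes of a file, starting at {@code offset}. */
    static final class BoundedRangeInputStream extends InputStream {
        private final RandomAccessFile file;
        private final long end;          // exclusive upper bound of the part
        private long position;           // absolute position of the next byte
        private long markedPosition;     // position to return to on reset()

        BoundedRangeInputStream(Path path, long offset, long length) throws IOException {
            this.file = new RandomAccessFile(path.toFile(), "r");
            // Clamp the part to the end of the file so the last part may be shorter.
            this.end = Math.min(offset + length, file.length());
            this.position = offset;
            this.markedPosition = offset;
            file.seek(offset);
        }

        @Override
        public int read() throws IOException {
            if (position >= end) return -1;
            int b = file.read();
            if (b >= 0) position++;
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            if (len == 0) return 0;
            if (position >= end) return -1;
            // Never hand out bytes beyond this part's upper bound.
            int toRead = (int) Math.min(len, end - position);
            int n = file.read(buf, off, toRead);
            if (n > 0) position += n;
            return n;
        }

        @Override
        public boolean markSupported() {
            return true;
        }

        @Override
        public synchronized void mark(int readLimit) {
            markedPosition = position;   // readLimit is ignored: the file is seekable
        }

        @Override
        public synchronized void reset() throws IOException {
            file.seek(markedPosition);   // rewind so a failed part can be re-read
            position = markedPosition;
        }

        @Override
        public void close() throws IOException {
            file.close();
        }
    }

    /** Reads one part fully and returns it as ASCII text (demo helper). */
    static String readPart(Path path, long offset, long length) throws IOException {
        try (InputStream in = new BoundedRangeInputStream(path, offset, length)) {
            return new String(in.readAllBytes(), StandardCharsets.US_ASCII);
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("parts", ".bin");
        Files.write(path, "0123456789".getBytes(StandardCharsets.US_ASCII));
        long partSize = 4;
        // One independent stream per part, as a multi-part upload would use.
        for (long offset = 0; offset < 10; offset += partSize) {
            System.out.println(readPart(path, offset, partSize));
        }
        Files.delete(path);
    }
}
```

Run against the ten-byte demo file, the loop prints `0123`, `4567`, and `89`: each stream sees only its own part, and the final part is shorter because the range is clamped to the file length. Because every part has its own stream and its own marked position, a failed part can be reset and retried without touching the others.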

Issues Resolved

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@raghuvanshraj raghuvanshraj marked this pull request as ready for review April 5, 2023 11:07
@raghuvanshraj raghuvanshraj changed the title Support to emit multiple streams for a file content each responsible for processing a specific part of the file [Remote Store] Support to emit multiple streams for a file content each responsible for processing a specific part of the file Apr 5, 2023
@raghuvanshraj
Contributor Author

Tagging @elfisher @muralikpbhat @reta @mch2 @dreamer-89 @andrross @Bukhtawar @sachinpkale @itiyamas @dblock @shwetathareja @saratvemulapalli @ashking94 for review. Please tag others who can review this as well.

@raghuvanshraj raghuvanshraj force-pushed the multi-part-upload-core-1-1 branch 2 times, most recently from 25951f9 to 68fb8d7 Compare April 6, 2023 07:41
@github-actions
Contributor

github-actions bot commented Apr 6, 2023

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.snapshots.DedicatedClusterSnapshotRestoreIT.testIndexDeletionDuringSnapshotCreationInQueue

server/build.gradle (outdated, resolved)
Collaborator

@Bukhtawar Bukhtawar left a comment

Some containers look unwieldy, and because usages are missing (in some cases even from tests, as with WriteContext), it is hard to understand how they were originally designed to be consumed.

…for processing a specific part of the file

Signed-off-by: Raghuvansh Raj <raghraaj@amazon.com>
@github-actions
Contributor

github-actions bot commented Jun 7, 2023

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.remotestore.RemoteStoreRefreshListenerIT.testRemoteRefreshRetryOnFailure
      1 org.opensearch.action.admin.cluster.node.tasks.ResourceAwareTasksTests.testBasicTaskResourceTracking

Signed-off-by: Raghuvansh Raj <raghraaj@amazon.com>
@Bukhtawar Bukhtawar merged commit 0c1a29a into opensearch-project:main Jun 9, 2023
@gbbafna gbbafna added the backport 2.x Backport to 2.x branch label Jun 9, 2023
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jun 9, 2023
…ch responsible for processing a specific part of the file (#7000)

* Support to emit multiple streams for a file content each responsible for processing a specific part of the file

Signed-off-by: Raghuvansh Raj <raghraaj@amazon.com>
(cherry picked from commit 0c1a29a)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
kotwanikunal pushed a commit that referenced this pull request Jun 12, 2023
…ch responsible for processing a specific part of the file (#7000) (#7983)

* Support to emit multiple streams for a file content each responsible for processing a specific part of the file


(cherry picked from commit 0c1a29a)

Signed-off-by: Raghuvansh Raj <raghraaj@amazon.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
gaiksaya pushed a commit to gaiksaya/OpenSearch that referenced this pull request Jun 26, 2023
…ch responsible for processing a specific part of the file (opensearch-project#7000) (opensearch-project#7983)

* Support to emit multiple streams for a file content each responsible for processing a specific part of the file


(cherry picked from commit 0c1a29a)

Signed-off-by: Raghuvansh Raj <raghraaj@amazon.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
imRishN pushed a commit to imRishN/OpenSearch that referenced this pull request Jun 27, 2023
…ch responsible for processing a specific part of the file (opensearch-project#7000)

* Support to emit multiple streams for a file content each responsible for processing a specific part of the file

Signed-off-by: Raghuvansh Raj <raghraaj@amazon.com>
Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
…ch responsible for processing a specific part of the file (opensearch-project#7000)

* Support to emit multiple streams for a file content each responsible for processing a specific part of the file

Signed-off-by: Raghuvansh Raj <raghraaj@amazon.com>
Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Labels
backport 2.x Backport to 2.x branch skip-changelog
6 participants