Add HDFS searchable snapshot integration #66185

Merged
jbaiera merged 8 commits into elastic:master from hdfs-repository-read-implementation
Dec 14, 2020

Conversation

jbaiera
Member

@jbaiera jbaiera commented Dec 10, 2020

Adds a bounded read implementation to the HDFS blob store, as well as integration tests in the searchable snapshot project that ensure functionality on HDFS with both Kerberos and simple authentication.
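
To illustrate what a bounded read means here: the caller asks for `length` bytes starting at `position`, and the returned stream must not yield anything past that range. The following is a minimal sketch under stated assumptions, not the code added in this PR. It uses a plain `java.io.InputStream` with `skip` in place of a seek on the HDFS stream, and the names `BoundedReadSketch`, `LimitedInputStream`, `readBlob`, and `readRange` are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the class from this PR.
final class BoundedReadSketch {

    /** Caps the number of bytes that can be read from the wrapped stream. */
    static final class LimitedInputStream extends FilterInputStream {
        private long remaining;

        LimitedInputStream(InputStream in, long limit) {
            super(in);
            this.remaining = limit;
        }

        @Override
        public int read() throws IOException {
            if (remaining <= 0) {
                return -1; // bound reached: report end of stream
            }
            int b = in.read();
            if (b >= 0) {
                remaining--;
            }
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            if (len == 0) {
                return 0;
            }
            if (remaining <= 0) {
                return -1;
            }
            // Never request more than the bytes left in the bounded range.
            int n = in.read(buf, off, (int) Math.min(len, remaining));
            if (n > 0) {
                remaining -= n;
            }
            return n;
        }

        @Override
        public long skip(long n) throws IOException {
            long skipped = in.skip(Math.min(n, remaining));
            remaining -= skipped;
            return skipped;
        }

        @Override
        public int available() throws IOException {
            return (int) Math.min(in.available(), remaining);
        }

        @Override
        public boolean markSupported() {
            return false; // mark/reset would bypass the byte accounting
        }
    }

    /** Positions the stream at {@code position}, then bounds it to {@code length} bytes. */
    static InputStream readBlob(InputStream blob, long position, long length) throws IOException {
        if (position < 0 || length < 0) {
            throw new IllegalArgumentException("position and length must be non-negative");
        }
        long toSkip = position;
        while (toSkip > 0) {
            long skipped = blob.skip(toSkip);
            if (skipped <= 0) {
                // skip() may return 0 before the end; probe with a single read.
                if (blob.read() < 0) {
                    break; // position is past the end of the blob
                }
                skipped = 1;
            }
            toSkip -= skipped;
        }
        return new LimitedInputStream(blob, length);
    }

    /** Convenience wrapper for trying the sketch on an in-memory blob. */
    static String readRange(byte[] data, long position, long length) {
        try (InputStream in = readBlob(new ByteArrayInputStream(data), position, length)) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For example, `BoundedReadSketch.readRange("0123456789".getBytes(StandardCharsets.UTF_8), 3, 4)` reads only the four bytes starting at offset 3. A real HDFS-backed version would presumably seek on the underlying stream rather than skip, which is the `.seek` call mentioned in the review below.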

@jbaiera jbaiera added the >feature and :Distributed/Snapshot/Restore (Anything directly related to the `_snapshot/*` APIs) labels Dec 10, 2020
@elasticmachine elasticmachine added the Team:Distributed (Meta label for distributed team) label Dec 10, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@jbaiera
Member Author

jbaiera commented Dec 10, 2020

I'm currently debugging an intermittent SocketPermission exception, but the code should be in good enough shape for review.

@jbaiera
Member Author

jbaiera commented Dec 10, 2020

@elasticmachine update branch

@jbaiera
Member Author

jbaiera commented Dec 10, 2020

Update: I've been rerunning the tests that were failing intermittently with SocketPermission errors through the afternoon. After pushing 6d86e22, the failures have not resurfaced. We should be good on that front now!

Member

@original-brownbear original-brownbear left a comment

Looks good @jbaiera, just one question from a quick read through this.

@@ -50,7 +50,7 @@
 static {
     // We can do FS ops with only a few elevated permissions:
     SIMPLE_AUTH_PERMISSIONS = new Permission[]{
-        new SocketPermission("*", "connect"),
+        new SocketPermission("*", "connect,resolve"),
Member

Why do we need resolve now? It seems the only new thing we do is a .seek call.

Member Author

Likely not. I added this to try to appease the permission errors. Given that the problem was something else, it's unlikely that we'll need this.
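
As background for the question above: according to the `java.net.SocketPermission` Javadoc, the `resolve` action is implied whenever any of the other actions (`connect`, `listen`, `accept`) is present, so spelling it out next to `connect` should not grant anything extra. The check below is a small sketch to confirm that on a given JDK. The class and method names are hypothetical and are not part of this PR.

```java
import java.net.SocketPermission;

// Hypothetical check, not part of this PR.
final class ResolveImpliedSketch {

    /** True if a connect-only permission already covers a resolve-only one. */
    static boolean connectImpliesResolve() {
        SocketPermission connectOnly = new SocketPermission("*", "connect");
        SocketPermission resolveOnly = new SocketPermission("*", "resolve");
        return connectOnly.implies(resolveOnly);
    }

    /** Returns the canonical action string the JDK reports for the given actions. */
    static String canonicalActions(String actions) {
        return new SocketPermission("*", actions).getActions();
    }

    public static void main(String[] args) {
        System.out.println("connect implies resolve: " + connectImpliesResolve());
        System.out.println("canonical actions for \"connect\": " + canonicalActions("connect"));
    }
}
```

If the check reports that `connect` already implies `resolve`, the two versions of the permission line in the diff are equivalent, which is consistent with the conclusion that the extra action is not needed.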

Member

@jasontedor jasontedor left a comment

Other than the resolve permission question that @original-brownbear has, LGTM.

Member

@original-brownbear original-brownbear left a comment

LGTM (sorry for the delay, somehow missed clicking submit here)

@jbaiera jbaiera merged commit 9bb6a3a into elastic:master Dec 14, 2020
@jbaiera jbaiera deleted the hdfs-repository-read-implementation branch December 14, 2020 21:04
jbaiera added a commit to jbaiera/elasticsearch that referenced this pull request Dec 14, 2020
Adds a bounded read implementation on the HDFS blob store as well as integration tests to 
the searchable snapshot project that ensures functionality on both kerberos and simple 
authentication HDFS.
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Dec 14, 2020
* elastic/master: (33 commits)
  Add searchable snapshot cache folder to NodeEnvironment (elastic#66297)
  [DOCS] Add dynamic runtime fields to docs (elastic#66194)
  Add HDFS searchable snapshot integration (elastic#66185)
  Support canceling cross-clusters search requests (elastic#66206)
  Mute testCacheSurviveRestart (elastic#66289)
  Fix cat tasks api params in spec and handler (elastic#66272)
  Snapshot of a searchable snapshot should be empty (elastic#66162)
  [ML] DFA _explain API should not fail when none field is included (elastic#66281)
  Add action to decommission legacy monitoring cluster alerts (elastic#64373)
  move rollup_index param out of RollupActionConfig (elastic#66139)
  Improve FieldFetcher retrieval of fields (elastic#66160)
  Remove unsed fields in `RestAnalyzeAction` (elastic#66215)
  Simplify searchable snapshot CacheKey (elastic#66263)
  Autoscaling remove feature flags (elastic#65973)
  Improve searchable snapshot mount time (elastic#66198)
  [ML] Report cause when datafeed extraction encounters error (elastic#66167)
  Remove suggest reference in some API specs (elastic#66180)
  Fix warning when installing a plugin for different ESversion (elastic#66146)
  [ML] make `xpack.ml.max_ml_node_size` and `xpack.ml.use_auto_machine_memory_percent` dynamically settable (elastic#66132)
  [DOCS] Add `require_alias` to Bulk API (elastic#66259)
  ...
jbaiera added a commit that referenced this pull request Dec 14, 2020
Adds a bounded read implementation on the HDFS blob store as well as integration tests to 
the searchable snapshot project that ensures functionality on both kerberos and simple 
authentication HDFS.
Labels
:Distributed/Snapshot/Restore (Anything directly related to the `_snapshot/*` APIs), >feature, Team:Distributed (Meta label for distributed team), v7.11.0, v8.0.0-alpha1
5 participants