Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update AWS SDK to 1.11.340 in repository-s3 #30723

Merged
merged 11 commits into from
Sep 12, 2018
Merged

Conversation

tlrx
Copy link
Member

@tlrx tlrx commented May 18, 2018

Updates the version of the AWS SDK to 1.11.340 (previously 1.11.225) in the repository-s3 plugin.

I was a bit worried about how the removal of the region setting (#22758) would play with the latest versions of the AWS SDK, because in code samples the region is always set when creating the client.

With this pull request, when no endpoint is defined then the client is initialized with the S3 default endpoint
and region( as provided by the AWS SDK) and with the enableForceGlobalBucketAccess option. With this option, the first HTTP request (doesBucketExist()) will be executed against the default S3 endpoint which returns a HTTP redirection response with the bucket's region set in the response headers. This region is then used by the AWS SDK for the other requests. It also keeps an internal cache that contains the resolved regions to use for a given bucket.

When the endpoint is defined in the settings:

  • if this endpoint is an AWS service endpoint then the region to use is determinated by the AWS SDK public AwsHostNameUtils.parseRegion() method (this is the method used internally by the SDK to resolve region when the endpoint is overriden on the AmazonWebServiceClient).
  • if it's a custom endpoint, then the region remains null

The method used to check if the endpoint is an AWS service endpoint is similar to what is done internally by the SDK.

The main goal of this pull request was to check how "easy" it is now to update AWS SDK using integration tests, to expose a new storage class (#30474) and other bug fixes, and also to facilitate user contributions like #25552.

@tlrx tlrx added >enhancement review :Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs v7.0.0 v6.4.0 labels May 18, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

@tlrx
Copy link
Member Author

tlrx commented May 18, 2018

@elasticmachine test this please

@tlrx tlrx requested a review from rjernst May 22, 2018 07:04
@tlrx tlrx changed the title Update AWS SDK to 1.11.331 in repository-s3 Update AWS SDK to 1.11.333 in repository-s3 May 22, 2018
@tlrx
Copy link
Member Author

tlrx commented May 22, 2018

@elasticmachine test this please

@tlrx tlrx mentioned this pull request May 22, 2018
@tlrx tlrx force-pushed the aws-sdk-update branch 2 times, most recently from 0cff666 to ebd60bf Compare May 24, 2018 12:13
@tlrx tlrx changed the title Update AWS SDK to 1.11.333 in repository-s3 Update AWS SDK to 1.11.335 in repository-s3 May 24, 2018
@tlrx tlrx changed the title Update AWS SDK to 1.11.335 in repository-s3 Update AWS SDK to 1.11.337 in repository-s3 May 28, 2018
@tlrx tlrx force-pushed the aws-sdk-update branch 2 times, most recently from cbfbb2b to e00fe76 Compare May 28, 2018 10:03
@tlrx tlrx changed the title Update AWS SDK to 1.11.337 in repository-s3 Update AWS SDK to 1.11.340 in repository-s3 Jun 5, 2018
Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes me uncomfortable to have this derive-region-from-endpoint logic in our code. It looks like it works as far as I understand it, but this is not documented behaviour so there's a risk that it isn't right now in some obscure cases, and a larger risk that it becomes incorrect in the future. Amazon's SDKs prefer deriving things like endpoints from the region, not the other way round, and I think we should try and move towards that model.

Additionally, when using other S3-compatible services (e.g. Minio), the region can be set to an arbitrary value, and this is required if requests are to be correctly signed. At the moment it works as long as the Minio administrator leaves the region set to the default value, an empty string, but I think we cannot deal with other values here.

I think the right thing to do is to expose the region parameter again, and possibly also enableForceGlobalBucketAccess, so that users have more direct control over the SDK. We may need this region-from-endpoint logic for BWC reasons.

final String serviceEndpoint = Strings.hasText(clientSettings.endpoint) ? clientSettings.endpoint : Constants.S3_HOSTNAME;

String signingRegion = null;
if (isAwsStandardEndpoint(serviceEndpoint)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes me uncomfortable - see review summary.

@dadoonet
Copy link
Member

It makes me uncomfortable to have this derive-region-from-endpoint logic in our code. It looks like it works as far as I understand it, but this is not documented behaviour so there's a risk that it isn't right now in some obscure cases, and a larger risk that it becomes incorrect in the future. Amazon's SDKs prefer deriving things like endpoints from the region, not the other way round, and I think we should try and move towards that model.

I do agree and brought that sometime ago with #27924 and #27925 (diff)

IMO we should indeed try to follow what is recommended by the platform.

I think the right thing to do is to expose the region parameter again

👍 on my end. But it's important that @rjernst also shares his thoughts.

@rjernst
Copy link
Member

rjernst commented Jun 20, 2018

I have no real opinion on this, as I haven't worked on the repository plugins in a while. I will say that the code was unnecessarily complex and lenient when we supported both region and endpoint. There were no tests, and you could set a mix of wrong region/endpoint that would not fail until runtime (when a repository was created). IIRC, at least one was always necessary, when in fact the defaults for s3 will auto find the correct region from the bucket name (it can do this since s3 buckets have a global namespace across all regions). If both are to be supported, I strongly urge there to be strict checking, and the defaults to still work (ie not having to set anything for 99% of the cases other than bucket name).

@tlrx
Copy link
Member Author

tlrx commented Aug 20, 2018

Thanks a lot for all your comments and I apologize again for the time it takes me to get this in.

According to the SDK documentation and code samples, the expected way to create an AmazonS3 client is to use a AmazonS3ClientBuilder and to configure the region to use. The new version of the SDK (named V2 and not yet released) also expects the region to be set when building a client.

The SDK still supports the possibility to not set the region and in this case it uses the default Amazon S3 endpoint s3.amazonaws.com with the default region us-east-1. When the client hits the default endpoint for the first time it receives in a response header the name of the region to use. The client keeps in a cache the bucket to region associative map.

This is the current behavior of the repository-s3 plugin since the region setting has been removed. The deprecated constructor new AmazonS3Client(credentials, configuration) initializes itself with the default endpoint Constants.S3_HOSTNAME which resolves the the default us-east-1 for signing request.

I updated the pull request so that it does not use the deprecated ctor anymore but still keep the current behavior without relying on a dirty endpoint-to-region logic like I proposed in a previous change. When no endpoint is defined the AmazonS3ClientBuilder is configured with the default S3 endpoint similar to what the AmazonS3Client does in its private init() method. So the SDK uses the same internal logic to resolve the region to use like it would do with the deprecated AmazonS3Client.

On a short term (but outside the scope of this pull request), we will have to reintroduce the region setting. It is not required by the SDK yet but I think it will become the normal way to configure a client soon.

With the AmazonS3ClientBuilder, the region is configured with the withRegion(String)and that's it. And when both endpoint and region are defined we'll have to use withEndpointConfiguration(new EndpointConfiguration(endpoint, region)). It will also work with Minio.

If both are to be supported, I strongly urge there to be strict checking,

I don't think we'll be able to do that if we also support Minio where region/endpoint can be anything.

and the defaults to still work (ie not having to set anything for 99% of the cases other than bucket name).

We should be able to support that as long as the SDK supports it itself.

@ywelsch @DaveCTurner @rjernst Gentlemen, can you please have another look to this?

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I think a comment about why we're setting the endpoint rather than letting the SDK work it out would be useful. See inline comment.


final String endpoint = Strings.hasLength(clientSettings.endpoint) ? clientSettings.endpoint : Constants.S3_HOSTNAME;
logger.debug(() -> new ParameterizedMessage("using endpoint [{}]", endpoint));
builder.withEndpointConfiguration(new AwsClientBuilder.EndpointConfiguration(endpoint, null));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It took me some time to see what was happening here.

If the endpoint configuration isn't set on the builder then the default behaviour is to try and work out what region we are in and use an appropriate endpoint - see AwsClientBuilder#setRegion. In contrast, directly-constructed clients use s3.amazonaws.com unless otherwise instructed. We currently use a directly-constructed client, and need to keep the existing behaviour to avoid a breaking change, so to move to using the builder we must set it explicitly to keep the existing behaviour.

AIUI we do this because directly constructing the client is deprecated (was already deprecated in 1.1.223 too) so this change removes that usage of a deprecated API.

A short comment to this effect would be useful, I think, to save future readers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I presume, by the way, that passing null for the region here is ok. I haven't tried it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggested comment, I'll steal it from you.

I presume, by the way, that passing null for the region here is ok. I haven't tried it.

Yes. Similarly to the deprecated setEndpoint() method that leaves the region to null until it is later resolved from the endpoint to the default us-east-1 region.

builder.withClientConfiguration(buildConfiguration(clientSettings));

final String endpoint = Strings.hasLength(clientSettings.endpoint) ? clientSettings.endpoint : Constants.S3_HOSTNAME;
logger.debug(() -> new ParameterizedMessage("using endpoint [{}]", endpoint));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: this could just be logger.debug("using endpoint [{}]", endpoint);.

@rjernst
Copy link
Member

rjernst commented Aug 20, 2018

I don't think we'll be able to do that if we also support Minio where region/endpoint can be anything.

Unless we are to have explicit tests for Minio, I don't think we should claim support or relax validation. And even then, if support for minio is desired, then was endpoints can still be validated against the correct region (or better yet, only allow endpoint, or both). For example, when the endpoint is a known s3 endpoint, the region should be verified it matches the endpoint?

Also, a problem we used to have is new regions could not be used until we released an updated version with an updated list in our hardcoded list of regions. I think that having the default behavior still use the default endpoint will still work to avoid this problem, but I want to make sure it is explicitly considered in any changes, as that was very annoying for users since it could be months before we have a new release with added regions (and I don't think we should be updating these regions in bug fix versions).

@lcawl lcawl added v6.4.1 and removed v6.4.0 labels Aug 23, 2018
@tlrx tlrx merged commit 7e195c2 into elastic:master Sep 12, 2018
@tlrx
Copy link
Member Author

tlrx commented Sep 12, 2018

Thanks @DaveCTurner and everybody involves in this pull request.

Thanks @rjernst for sharing your concerns, I agree with you on all points. When we'll reintroduce the region setting we'll make sure to put you into the loop and add the AWS S3 endpoint/region validation you asked for with appropriate tests.

@tlrx tlrx removed the v6.4.1 label Sep 12, 2018
@tlrx tlrx deleted the aws-sdk-update branch September 12, 2018 13:55
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Sep 12, 2018
* master:
  [HLRC][ML] Add ML put datafeed API to HLRC (elastic#33603)
  Update AWS SDK to 1.11.406  in repository-s3 (elastic#30723)
jasontedor added a commit to martijnvg/elasticsearch that referenced this pull request Sep 12, 2018
* master: (43 commits)
  [HLRC][ML] Add ML put datafeed API to HLRC (elastic#33603)
  Update AWS SDK to 1.11.406  in repository-s3 (elastic#30723)
  Expose CCR stats to monitoring (elastic#33617)
  [Docs] Update match-query.asciidoc (elastic#33610)
  TEST: Adjust rollback condition when shard is empty
  [CCR] Improve shard follow task's retryable error handling (elastic#33371)
  Forbid negative `weight` in Function Score Query (elastic#33390)
  Clarify context suggestions filtering and boosting (elastic#33601)
  Disable CCR REST endpoints if CCR disabled (elastic#33619)
  Lower version on full cluster restart settings test
  Upgrade remote cluster settings (elastic#33537)
  NETWORKING: http.publish_host Should Contain CNAME (elastic#32806)
  Add test coverage for global checkpoint listeners
  Reset replica engine to global checkpoint on promotion (elastic#33473)
  HLRC: ML Delete Forecast API (elastic#33526)
  Remove debug logging in full cluster restart tests (elastic#33612)
  Expose CCR to the transport client (elastic#33608)
  Mute testIndexDeletionWhenNodeRejoins
  SQL: Make Literal a NamedExpression (elastic#33583)
  [DOCS] Adds missing built-in user information (elastic#33585)
  ...
jasontedor added a commit to martijnvg/elasticsearch that referenced this pull request Sep 12, 2018
* master: (128 commits)
  [HLRC][ML] Add ML put datafeed API to HLRC (elastic#33603)
  Update AWS SDK to 1.11.406  in repository-s3 (elastic#30723)
  Expose CCR stats to monitoring (elastic#33617)
  [Docs] Update match-query.asciidoc (elastic#33610)
  TEST: Adjust rollback condition when shard is empty
  [CCR] Improve shard follow task's retryable error handling (elastic#33371)
  Forbid negative `weight` in Function Score Query (elastic#33390)
  Clarify context suggestions filtering and boosting (elastic#33601)
  Disable CCR REST endpoints if CCR disabled (elastic#33619)
  Lower version on full cluster restart settings test
  Upgrade remote cluster settings (elastic#33537)
  NETWORKING: http.publish_host Should Contain CNAME (elastic#32806)
  Add test coverage for global checkpoint listeners
  Reset replica engine to global checkpoint on promotion (elastic#33473)
  HLRC: ML Delete Forecast API (elastic#33526)
  Remove debug logging in full cluster restart tests (elastic#33612)
  Expose CCR to the transport client (elastic#33608)
  Mute testIndexDeletionWhenNodeRejoins
  SQL: Make Literal a NamedExpression (elastic#33583)
  [DOCS] Adds missing built-in user information (elastic#33585)
  ...
@tlrx tlrx added the v6.5.0 label Sep 12, 2018
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Sep 6, 2019
The `repository-s3` plugin has supported a storage class of `onezone_ia` since
the SDK upgrade in elastic#30723, but we do not test or document this fact. This
commit adds this storage class to the docs and adds a test to ensure that the
documented storage classes are all accepted by S3 too.

Fixes elastic#30474
DaveCTurner added a commit that referenced this pull request Sep 9, 2019
The `repository-s3` plugin has supported a storage class of `onezone_ia` since
the SDK upgrade in #30723, but we do not test or document this fact. This
commit adds this storage class to the docs and adds a test to ensure that the
documented storage classes are all accepted by S3 too.

Fixes #30474
DaveCTurner added a commit that referenced this pull request Sep 9, 2019
The `repository-s3` plugin has supported a storage class of `onezone_ia` since
the SDK upgrade in #30723, but we do not test or document this fact. This
commit adds this storage class to the docs and adds a test to ensure that the
documented storage classes are all accepted by S3 too.

Fixes #30474
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >enhancement v6.5.0 v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants