
GCS repo plugin update-able secure settings #30688

Conversation

albertzaharovits
Contributor

GoogleCloudStoragePlugin implements ReInitializablePlugin. Such plugins implement the reinit handler, which receives a Settings object with the inner SecureSettings opened. The plugin has to recreate the clients, and the code should not internally cache client references. Instead, client references are served from a cache, where they are built on demand using the latest settings.

This is the sibling of: #29319 #29134 and #28517

Overall progress is tracked by: #29135
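The pattern described above can be sketched in plain Java. All names here (CachingClientService, StubClient, the version string) are invented stand-ins for illustration, not the real plugin code in GoogleCloudStorageService:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of the described pattern: callers never cache clients;
// clients are built on demand from the latest settings and served from a
// cache that reinit() clears. Names are illustrative, not the plugin API.
public class CachingClientService {

    public static final class StubClient {
        final String settingsVersion;
        StubClient(String settingsVersion) { this.settingsVersion = settingsVersion; }
    }

    // volatile reference to an immutable map: readers never block
    private volatile Map<String, StubClient> clientsCache = Collections.emptyMap();
    private volatile String settingsVersion = "v1";

    public StubClient client(String clientName) {
        StubClient client = clientsCache.get(clientName);
        if (client != null) {
            return client;
        }
        synchronized (this) {
            client = clientsCache.get(clientName);
            if (client != null) {
                return client;
            }
            client = new StubClient(settingsVersion); // built from the *latest* settings
            final Map<String, StubClient> newCache = new HashMap<>(clientsCache);
            newCache.put(clientName, client);
            clientsCache = Collections.unmodifiableMap(newCache);
            return client;
        }
    }

    // reinit: swap in the new settings and drop all cached clients together
    public synchronized void reinit(String newSettingsVersion) {
        settingsVersion = newSettingsVersion;
        clientsCache = Collections.emptyMap();
    }

    public static void main(String[] args) {
        CachingClientService service = new CachingClientService();
        StubClient a = service.client("default");
        if (a != service.client("default")) throw new AssertionError("expected cached instance");
        service.reinit("v2");
        StubClient b = service.client("default");
        if (a == b) throw new AssertionError("expected a freshly built client after reinit");
        if (!"v2".equals(b.settingsVersion)) throw new AssertionError("client must use latest settings");
    }
}
```

The cache swap and the settings swap both happen under the same monitor, which is the pairing property debated later in this review.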

@albertzaharovits albertzaharovits added >feature :Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs v7.0.0 v6.4.0 labels May 17, 2018
@albertzaharovits albertzaharovits self-assigned this May 17, 2018
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

Member

@tlrx tlrx left a comment

Thanks @albertzaharovits, it looks easier than I thought. I left some comments and suggestions.

}
}

private <R, E extends IOException> R storageAccess(CheckedFunction<Storage, R, E> storageFunction) throws IOException {
Member

I understand why you did that, but I find that the two storageAccess() methods just add unnecessary noise. I think we could instead have a simple private safeClient() method that returns the Storage client to use, and later do things like:

SocketAccess.doPrivilegedIOException(() -> safeClient().get(bucketName));

I find this easier to read and to understand where the stack is cut for access control.

Contributor Author

@tlrx I kinda think the extra noise is worth the trouble.

For one thing, trying to locally save the reference (that you get as an argument to the lambda) is an obvious code smell. We convey that you only borrow the reference as an argument to the lambda; you don't get to store it locally. In the S3 repo plugin the simple approach was vetoed, but there the client was closeable, so the IDE hinted at a resource leak.
Secondly, with the noisy approach we cut the access control stack after the client has been built. That is because, if the client is not cached, it has to be created, which, generally speaking, does not require elevated privileges. In the simple approach the elevated privileges encompass the client creation too. This is, however, only a theoretical benefit, since currently we sadly elevate privileges when creating the client (to allow for the Google Application Default Credentials).

These were my thoughts, but sometimes I tend to complicate things.
Do you wish to sudo?

Member

@tlrx tlrx May 23, 2018

I don't want to sudo, I just want to express my concerns about this. GoogleCloudStorageService provides Storage clients to the plugin, but they have to be used in a specific way and not be stored locally, and we try to circumvent this fact by adding storageAccess() methods like this in the BlobStore implementation. I'm tempted to think that we should not expose Storage clients at all, and instead have GoogleCloudStorageService (or something else) borrow a temporary client to execute the needed operation and just return the result, so that we keep Storage internal. WDYT?

Contributor Author

Ok. So we could pack the GoogleCloudStorageService together with the clientName in another entity, which would be responsible for handling the operations without exposing the Storage. Yet I dislike this, because I feel this encapsulation of the client is already done at the BlobStore level, away from the BlobContainer; BlobContainer implements blob operations, abstracting away the client. This approach, if I got it right, will nest the client even further down, possibly stripping BlobStore or BlobContainer of their duties. In this matter, I think AzureBlobContainer is illustrative; AzureBlobStore and AzureBlobContainer could be collapsed into a single class. This is in part because the Azure client is relatively high level and the encapsulation we are talking about is handled internally.

Writing this down, I am gravitating towards your suggestion. There still has to be someone that knows how to use the client (i.e. not cache it locally), and adding another layer (an entity, or a method receiving a lambda) is unjustified; BlobStore should handle it.

Unless you have other thoughts, I will go ahead and have a simple private method inside the BlobStore that returns the Storage instance. Code inside the BlobStore is responsible for not caching the instance.
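The agreed-on shape might look roughly like this. This is a sketch with made-up stub types (SafeClientBlobStore, StubStorage, StubService); the real types would be GoogleCloudStorageBlobStore, com.google.cloud.storage.Storage, and GoogleCloudStorageService:

```java
// Sketch of the "private safeClient()" shape discussed above. StubStorage and
// StubService stand in for the real GCS client and service; nothing here is
// the actual plugin code.
public class SafeClientBlobStore {

    public static final class StubStorage {
        public boolean bucketExists(String bucketName) {
            return "my-bucket".equals(bucketName); // canned answer for the sketch
        }
    }

    public interface StubService {
        StubStorage client(String clientName);
    }

    private final StubService service;
    private final String clientName;

    public SafeClientBlobStore(StubService service, String clientName) {
        this.service = service;
        this.clientName = clientName;
    }

    // Re-fetched on every use; callers must not store the returned reference,
    // otherwise a reinit would leave them holding a client built from stale settings.
    private StubStorage safeClient() {
        return service.client(clientName);
    }

    public boolean doesBucketExist(String bucketName) {
        // In the real code this call site would be wrapped, e.g.
        // SocketAccess.doPrivilegedIOException(() -> safeClient().get(bucketName))
        return safeClient().bucketExists(bucketName);
    }

    public static void main(String[] args) {
        SafeClientBlobStore store =
            new SafeClientBlobStore(name -> new StubStorage(), "default");
        if (!store.doesBucketExist("my-bucket")) throw new AssertionError();
        if (store.doesBucketExist("other-bucket")) throw new AssertionError();
    }
}
```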

Member

There is history around the azure plugin so I won't take this as an example.

Unless you have other thoughts, I will go ahead and have a simple private method inside the BlobStore that returns the Storage instance. Code inside the BlobStore is responsible for not caching the instance.

👍

Let's do that and not block this PR. This is something we can still revisit later on.

@@ -95,7 +120,7 @@ public void close() {
*/
boolean doesBucketExist(String bucketName) {
try {
final Bucket bucket = SocketAccess.doPrivilegedIOException(() -> storage.get(bucketName));
final Bucket bucket = storageAccess(storage -> storage.get(bucketName));
Member

Following my previous comment, this could be:

SocketAccess.doPrivilegedIOException(() -> safeClient().get(bucketName));

@@ -277,13 +300,14 @@ void deleteBlobs(Collection<String> blobNames) throws IOException {
deleteBlob(blobNames.iterator().next());
return;
}
final List<BlobId> blobIdsToDelete = blobNames.stream().map(blobName -> BlobId.of(bucket, blobName)).collect(Collectors.toList());
final List<Boolean> deletedStatuses = SocketAccess.doPrivilegedIOException(() -> storage.delete(blobIdsToDelete));
final List<BlobId> blobIdsToDelete = blobNames.stream().map(blobName -> BlobId.of(bucketName, blobName)).collect(
Member

Nit: can you make this fit on the same line as before?

Contributor Author

Will do. For the record, this was not my IDE going wild; this was prompted by the 'bucket' -> 'bucketName' rename.

@@ -299,20 +323,18 @@ void deleteBlobs(Collection<String> blobNames) throws IOException {
* @param targetBlob new name of the blob in the same bucket
Member

Nit: Can you fix the Javadoc?

}
});
// There's no atomic "move" in GCS so we need to copy and delete
storageAccessConsumer(storage -> storage.copy(request).getResult());
Member

This is where a safeClient() would be helpful, so that you have less chance that the underlying storage instance changed between the copy and delete calls

Contributor Author

Since Google forces us to use two calls for the move op, they should not be surprised if we do it over different clients...
Anyhow, using the present approach we could extend the lambda to:

storageAccess(storage -> {
  storage.copy(request).getResult();
  return storage.delete(sourceBlobId);
});

and keep it over a single client
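With a safeClient()-style method, the same single-client guarantee can be had by capturing the reference once in a local variable for the duration of the move. A sketch with invented stub names (MoveSketch, StubStorage), not the real GCS API:

```java
// Illustrative only: capture the client once so copy and delete hit the same
// instance, even if a concurrent reinit swaps the cached client in between.
public class MoveSketch {

    public static final class StubStorage {
        int copies = 0;
        int deletes = 0;
        void copy(String source, String target) { copies++; }
        boolean delete(String blob) { deletes++; return true; }
    }

    private StubStorage current = new StubStorage();

    private StubStorage safeClient() { return current; }

    public boolean move(String sourceBlob, String targetBlob) {
        final StubStorage client = safeClient(); // captured once for both calls
        client.copy(sourceBlob, targetBlob);     // no atomic "move": copy, then delete
        return client.delete(sourceBlob);
    }

    public static void main(String[] args) {
        MoveSketch sketch = new MoveSketch();
        StubStorage before = sketch.safeClient();
        if (!sketch.move("a", "b")) throw new AssertionError();
        if (before.copies != 1 || before.deletes != 1) throw new AssertionError("both ops on one client");
    }
}
```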

this.storageService = createStorageService(settings);
// eagerly load client settings so that secure settings are readable (not closed)
final Map<String, GoogleCloudStorageClientSettings> clientsSettings = GoogleCloudStorageClientSettings.load(settings);
this.storageService.updateClientsSettings(clientsSettings);
Member

Maybe we can fold this into the GoogleCloudStorageService class so that we only have

this.storageService = createStorageService(settings);

here?

Contributor Author

Is this an acceptable simplification?

-        final Map<String, GoogleCloudStorageClientSettings> clientsSettings = GoogleCloudStorageClientSettings.load(settings);
-        this.storageService.updateClientsSettings(clientsSettings);
+        reinit(settings);
     }

I really wish to avoid passing around Settings objects that are expected to contain open SecureSettings members. GoogleCloudStorageClientSettings is the conveyor of secure settings, and GoogleCloudStorageClientSettings#load is only called in the Plugin implementation.

Member

Ok

// secure settings should be readable
final Map<String, GoogleCloudStorageClientSettings> clientsSettings = GoogleCloudStorageClientSettings.load(settings);
this.storageService.updateClientsSettings(clientsSettings);
return true;
Member

I don't see any documentation for the reinit() method and what it is supposed to return :(

Do you think we could simplify this too and have something like:

return storageService.refreshAndClearCache(settings); ?

Contributor Author

I don't see any documentation for the reinit() method and what it is supposed to return :(

Well... reinit was vague from the start. I will document/rename/remove return type in a follow up PR, before merging to master, for all plugins. I will surely add you as a reviewer.

The same goes for storageService#updateClientsSettings; the name deserves better attention, but it should receive clientsSettings, not the raw settings, assuming the SecureSettings are open.

So, the renamings I will address in a follow-up PR, for all plugins (EC2, S3, Azure, GCS).

Member

Ok, thanks.

* @param clientsSettings the new settings used for building clients for subsequent requests
* @return previous settings which have been substituted
*/
public synchronized Map<String, GoogleCloudStorageClientSettings>
Member

We could have a refreshAndClearCache(final Settings settings) method, or one with a similar name that reflects the fact that the settings are refreshed and the cached clients are deleted.

We could also use GoogleCloudStorageClientSettings.load() here, which already returns an immutable map.

public synchronized Map<String, GoogleCloudStorageClientSettings> refreshAndClearCache(final Settings settings) {
        final Map<String, GoogleCloudStorageClientSettings> prevSettings = this.clientsSettings;
        this.clientsSettings = GoogleCloudStorageClientSettings.load(settings);
        ...
    }

What do you think? Or are not all the settings passed when the plugin is reinitialized?

Contributor Author

@albertzaharovits albertzaharovits May 21, 2018

We could have a refreshAndClearCache(final Settings settings) method or with a similar name that reflects the fact that settings are refresh and cached clients deleted.

Agreed.

Again, I prefer to read secure settings in a GoogleCloudStorageClientSettings instance asap and pass that around, not the raw Settings object.

Member

Makes sense, thanks. Do you think it could deserve a comment?

Contributor Author

I have added comments inside GoogleCloudStoragePlugin#reinit.

        // Secure settings should be readable inside this method. Duplicate client
        // settings in a format (`GoogleCloudStorageClientSettings`) that does not
        // require the `SecureSettings` to be open. Pass that around (the
        // `GoogleCloudStorageClientSettings` instance) instead of the `Settings`
        // instance.

return storage;
}
storage = SocketAccess.doPrivilegedIOException(() -> createClient(clientName));
clientsCache = MapBuilder.newMapBuilder(clientsCache).put(clientName, storage).immutableMap();
Member

Why not using HashMap() directly?

Member

Or maybe use a ConcurrentHashMap to simplify the code? You could use computeIfAbsent and the like?

Contributor Author

Why not using HashMap() directly?

I suppose MapBuilder was created for the chaining syntax, plus making sure the result is immutable. It grew on me, don't ruin it for me now... 😛

@jaymode
I think ConcurrentHashMap is a behemoth we don't need.
In the common case the map contains a single entry and there is really no contention here; the "other" thread only clears the map. The volatile immutable map is the evolution of the volatile client reference.
If there is an occasion where a volatile reference to an immutable map trumps a concurrent map, I think this must be it.
At some point (when I wished the reinit to block until all clients had been changed), I also pushed the concurrent map #28517 (comment).

Do you wish to sudo?

Member

I also pushed the concurrent map #28517 (comment)

My quick glance makes this seem somewhat different, as there was a read-write lock external to that CHM?

The volatile immutable map is fine; this was just a suggestion to simplify the code.

Contributor Author

My quick glance makes this seem somewhat different, as there was a read-write lock external to that CHM?

Yes, unfortunately the comment tied the read-write lock to the CHM; I remember they were not tied, though... my bad for bringing this up.

I have been thinking and I reckon this synchronized block is necessary

        synchronized (this) {
            storage = clientsCache.get(clientName);
            if (storage != null) {
                return storage;
            }
            storage = SocketAccess.doPrivilegedIOException(() -> createClient(clientName));
            clientsCache = MapBuilder.newMapBuilder(clientsCache).put(clientName, storage).immutableMap();
            return storage;
        }

because we need to keep the settings and the client reference updates paired. For example, we might end up with one thread reading the cache (clientsCache.get(clientName)), another thread changing the settings, and then the first thread using stale settings to build a client; the settings would then not reflect the client.
Not saying a CHM will never fit, but it will not simplify this part.

@albertzaharovits
Contributor Author

@tlrx and @jaymode this is now ready for your attention again.

Member

@tlrx tlrx left a comment

LGTM, left a minor comment

try {
return storageService.client(clientName);
} catch (final Exception e) {
throw new IOException(e);
Member

Nit: maybe add a message to the IOException?

Member

@jaymode jaymode left a comment

LGTM too

*
* @param blobName name of the blob
* @return an InputStream
* @return the InputStream used to read blob's content
Member

nit: s/read blob's/read the blob's

@albertzaharovits albertzaharovits merged commit 000c585 into elastic:reload-secure-store-action May 28, 2018
@albertzaharovits albertzaharovits deleted the update-secure-settings-gcs branch May 28, 2018 09:20