Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Exclude Patterns for Snapshot- and Repository Names in Get Snapshots API #77308

Conversation

original-brownbear
Copy link
Member

It's in the title. Adds support for exclude patterns combined with wildcard requests
similar to what we support for index names.

…Snapshots API

It's in the title. Adds support for exclude patterns combined with wildcard requests
similar to what we support for index names.
@original-brownbear original-brownbear added >enhancement :Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs v8.0.0 v7.16.0 labels Sep 6, 2021
@elasticmachine elasticmachine added the Team:Distributed Meta label for distributed team label Sep 6, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

Copy link
Member

@tlrx tlrx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - I left minor comments, feel free to address

"-" + otherPrefixSnapshot1,
"-" + otherPrefixSnapshot2
);
assertThat(allInOtherWithoutOtherExplicit, is(allInOther));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we test the result when a snapshot is both included and excluded?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++ added explicit check for this case

for (String repositoryOrPattern : repoNames) {
if (Regex.isSimpleMatchPattern(repositoryOrPattern) == false) {
if (seenWildcard && repositoryOrPattern.length() > 1 && repositoryOrPattern.startsWith("-")) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose we could iterate 2 times over the repositories, the first time to match repositories against include patterns ad the second to filter out matching repos with exclude patterns? Anyway the current code works too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we perhaps do this similar to the slm-change, where we first derive the include/exclude patterns and then run through the repos? And then extract the logic to a shared method to use such that we are sure that the include/exclude logic works the same.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dried up and shortened the logic quite a bit now across repos and snapshots but I couldn't use the exact same code because of the special case handling for stuff like _current and 404 behavior. Still should be more efficient and easier to read now I hope.

@arteam arteam self-requested a review September 7, 2021 08:30
final List<String> namesRepo2 = createNSnapshots(repoName2, randomIntBetween(1, 5));
final List<String> namesOtherRepo = createNSnapshots(otherRepo, randomIntBetween(1, 5));

final Collection<String> allSnapshotNames = new HashSet<>(namesRepo1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider concatenating repos with the Streams API:

Stream.concat(namesRepo1.stream(), namesRepo2.stream()).collect(Collectors.toSet())

Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few comments to consider.

for (String repositoryOrPattern : repoNames) {
if (Regex.isSimpleMatchPattern(repositoryOrPattern) == false) {
if (seenWildcard && repositoryOrPattern.length() > 1 && repositoryOrPattern.startsWith("-")) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we perhaps do this similar to the slm-change, where we first derive the include/exclude patterns and then run through the repos? And then extract the logic to a shared method to use such that we are sure that the include/exclude logic works the same.

if (isAllSnapshots(snapshots)) {
toResolve.addAll(allSnapshotIds.values());
} else {
for (String snapshotOrPattern : snapshots) {
if (GetSnapshotsRequest.CURRENT_SNAPSHOT.equalsIgnoreCase(snapshotOrPattern)) {
if (seenWildcard && snapshotOrPattern.length() > 1 && snapshotOrPattern.startsWith("-")) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similarly to the repo-name matching, can we derive the include/exclude pattern lists first and share most of the logic between the two (and slm-policy)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++ makes sense, the logic isn't exactly the same because of the concrete snapshot name -> not found -> throw case but I'll dry it up. I'll wait for the other PR to land to continue here then :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought, no we can't use the same logic as there's a difference between how the resolution works for policies and repositories/snapshots. For repositories and snapshots, if we have a concrete/non-wildcard pattern in the mix, we throw a 404ish exception if we can't find the request repo/snapshot. We don't do any such thing for SLM policies which never cause exceptions to be thrown (which makes sense IMO). I'll try to simplify the logic a little regardless :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would think we could extract the include/exclude patterns in one method shared between multiple purposes? And then run through them afterwards handling the exceptions etc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thing is, we have different behavior between the repos API (where we loop over the patterns and then nest the repos under them) while we can do the more efficient way of going over all the snapshot names and then matching against all patterns in one go (because we order the repos according to the order of the patterns instead of by their name or so).
It doesn't really seem like there's a simple way of drying this up but then again note that we're going to move the snapshot name logic away from running on heap like this in a follow-up, so I'm not sure there's too much point in abstracting things away only for that abstraction to become redundant in a week?

patternOtherRepo,
GetSnapshotsRequest.SortBy.REPOSITORY,
order,
"-" + otherPrefixSnapshot1,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it seems odd that for the repo pattern we pass in a comma-separated string, whereas for the names we pass in a list of name patterns. I would prefer to see this harmonized.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

harmonized to using arrays/var-arg for both :)

@@ -208,6 +209,67 @@ public void testPaginationRequiresVerboseListing() throws Exception {
);
}

public void testExcludePatterns() throws Exception {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a test validating that using -xyz finds such a snapshot (and repo) if no wildcard is specified first, i.e., at that we can still find snapshots/repos starting with -?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! -> added a couple of tests around this scenario now.

@original-brownbear
Copy link
Member Author

Thanks Henning, all points addressed now (though I couldn't dry things up as requested I'm afraid).

@original-brownbear
Copy link
Member Author

Jenkins run elasticsearch-ci/packaging-tests-windows-sample

1 similar comment
@original-brownbear
Copy link
Member Author

Jenkins run elasticsearch-ci/packaging-tests-windows-sample

@original-brownbear
Copy link
Member Author

@elasticmachine update branch

Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few smaller comments. Still hoping for a bit of code/sharing on include/exclude patterns 🙂

+
To get information about all snapshot repositories registered in the
cluster, omit this parameter or use `*` or `_all`.

`<snapshot>`::
(Required, string)
Comma-separated list of snapshot names to retrieve. Also accepts wildcards (`*`).
Comma-separated list of snapshot names to retrieve. Also accepts wildcards (`*`) as well as combining wildcards and exclude patterns
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we align the text between repo and name match? Otherwise I end up searching for the difference between the text to understand if there is some difference...
Perhaps just state

Wildcard (`*`) expressions are supported including combining wildcards with exclude patterns starting with `-`

like above?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, that was pointless, made it the same for both as suggested now.

if (isAllSnapshots(snapshots)) {
toResolve.addAll(allSnapshotIds.values());
} else {
for (String snapshotOrPattern : snapshots) {
if (GetSnapshotsRequest.CURRENT_SNAPSHOT.equalsIgnoreCase(snapshotOrPattern)) {
if (seenWildcard && snapshotOrPattern.length() > 1 && snapshotOrPattern.startsWith("-")) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would think we could extract the include/exclude patterns in one method shared between multiple purposes? And then run through them afterwards handling the exceptions etc.

if (Regex.isSimpleMatchPattern(repositoryOrPattern) == false) {
repositoriesToGet.add(repositoryOrPattern);
}
final List<String> includePatterns = new ArrayList<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the new exclude functionality also affects the list repositories API, we should also update the documentation for that API then.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++ udpated

)
)
);
}
Copy link
Contributor

@henningandersen henningandersen Sep 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also verify that non-existing*,-weird-snapshot-1 does not return -weird-snapshot-1? And that *,--weird-snapshot-1 also does not return it? And that -* returns all - prefixed snapshots.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++ added those cases :)

@original-brownbear
Copy link
Member Author

Thanks @henningandersen! addresed all points but sharing the pattern resolution code, to me at least it seems with the way we do the ordering and not-ordering when resolving things, it's going to be pain to share code at this point in time I think, but lets see what you think :)

@original-brownbear
Copy link
Member Author

Alright @henningandersen this should be good for another round. Implemented _current as wildcard as well as fixed the pattern,name,excludePatternForName case we discussed last night and added tests for both :)

Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

+
To get information about all snapshot repositories registered in the
cluster, omit this parameter or use `*` or `_all`.

`<snapshot>`::
(Required, string)
Comma-separated list of snapshot names to retrieve. Also accepts wildcards (`*`).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we still want to include this bit "Comma-separated list of snapshot names to retrieve. "?

@original-brownbear
Copy link
Member Author

Thanks everyone!

@original-brownbear original-brownbear merged commit f1a4551 into elastic:master Sep 10, 2021
@original-brownbear original-brownbear deleted the support-negative-snapshot-patterns branch September 10, 2021 13:00
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Oct 17, 2021
…Snapshots API (elastic#77308)

It's in the title. Adds support for exclude patterns combined with wildcard requests
similar to what we support for index names.
original-brownbear added a commit that referenced this pull request Oct 17, 2021
…Snapshots API (#77308) (#79315)

It's in the title. Adds support for exclude patterns combined with wildcard requests
similar to what we support for index names.
@original-brownbear original-brownbear restored the support-negative-snapshot-patterns branch April 18, 2023 20:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >enhancement Team:Distributed Meta label for distributed team v7.16.0 v8.0.0-alpha2
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants