Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delegate toXContent logic from ClusterState to its member classes #48218 #52743

Merged
merged 23 commits into from
Mar 25, 2020

Conversation

zacharymorn
Copy link
Contributor

This is a continuation of PR #48924, but to approach the problem in the recommended direction, as that PR has too many commits to cleanly show the changes.

MetaData.toXContent is already parameterized as it checks for API context for certain serializations. Hence I’ve organized the 4 commits in that PR as follows:

  1. Add API and Gateway serialization tests for complex ClusterState and MetaData objects without modifying production code
  2. Copy over ClusterState.toXContent serialization test metadata section to MetaData.toXContent API serialization tests to highlight the difference and setup baseline
  3. Parameterize MetaData.toXContent further and have its serialization under API context to be as close as possible to ClusterState.
    1. Note that as the updates to existing serialization test cases show, there are inconsistencies of serializations under API context.
    2. We can pick and choose which serializaton scheme we actually want, and which one should no longer be supported under API context.
  4. Replace ClusterState.toXContent metadata section with parameterized MetaData.toXContent

… Gateway context.

Note that as the updates to existing serialization test cases show, there are inconsistencies of serializations under API context.
We can pick and choose which serializaton scheme we actually want, and which one should no longer be supported under API context.

Map<String, String> mapParams = new HashMap<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add this as a static field to this class and call it API_PARAMS?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure will do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this. It's heading in the right direction. I've left some comments (in particular, I'm surprised that we need to change deserialization code).

}
else {
indexMetaData.getSettings()
.toXContent(builder, new MapParams(Collections.singletonMap("flat_settings", params.param("flat_settings"))));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we just pass on params instead of just extracting flat_settings?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As before the rest of params was not passed along to IndexMetaData.toXContent, I suspect this may change the serialization and thus introduce a breaking change. I can try and submit another commit with that to see what happen?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually the original ClusterState#toXContent uses the whole params

. I've updated it.

@@ -246,13 +245,19 @@ public void testSimpleJsonFromAndTo() throws IOException {
assertThat(indexMetaData.mapping().source().string(), equalTo(MAPPING_SOURCE1));
assertThat(indexMetaData.getAliases().size(), equalTo(3));
assertThat(indexMetaData.getAliases().get("alias1").alias(), equalTo("alias1"));
assertThat(indexMetaData.getAliases().get("alias1").filter().string(), equalTo(ALIAS_FILTER1));
// By default, MetaData.Builder.toXContent(metaData) will use ToXContent.EMPTY_PARAMS, which generates the same serialization
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should remove that static method (MetaData.Builder.toXContent) which seems to be only used by this test.

Instead, we should test both API and non-API serialization in this class, and non-API deserialization.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've moved the new serialization tests from MetaDataTests to ToAndFromJsonMetaDataTests.

I've inlined / removed 1 MetaData.Builder#toXContent which is only used by test. There's another one though that takes more parameters and used in production code, so I left that one alone.

@@ -1318,7 +1360,21 @@ public static IndexMetaData fromXContent(XContentParser parser) throws IOExcepti
throw new IllegalArgumentException("Unexpected token: " + token);
}
}
} else if ("warmers".equals(currentFieldName)) {
} else if (KEY_PRIMARY_TERMS.equals(currentFieldName)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did you add deserialization code here?

Is it because we are testing API deserialization somewhere? If so, I would rather have the tests changed to test non-API deserialization.

Same comment of other places below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The primary reason for adding these is, the original ClusterState.toXContent emits array serialization for aliases and object serialization for primary_terms, which are different from what MetaData.toXContent emits, which is object serialization for aliases and array serialization for primary_terms, under API context. The delta can been seen here 072e05d#diff-4385d650a8cef0592db3454ad42e4bf3L1172-R1157

In general, as the duplicated code diverges, there are conflicting serializations under API context.

So the additional deserialization here is to be able to read back in the different serialization emitted by the new code (namely, array serialization for aliases and object serialization for primary_terms)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should avoid being able to read back the API-serialized XContent. There is no requirement to do so, and it introduces leniency in the parsing here, with the possibility of silently hiding bugs where the wrong context is chosen for serialization.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've given this a shot and removed deserialization for both primary_terms and aliases. These ended up breaking a lot of tests as they serialized under default context (API), and deserialized for further checking. I thus removed the breaking tests (as little as I can, but more can be removed as what's left are just setup code) and isolated the removal in this commit 0c11613.

If we don't want to remove these many tests for such small changes, the other possible approach is to not emit those different serialization for aliases and primary_terms for ClusterState.toXContent. This obviously would break code relying on it.

Please let me know which way you prefer, or if you have other ideas around this.

@ywelsch ywelsch self-requested a review March 11, 2020 12:55
@ywelsch ywelsch added :Distributed/Distributed A catch all label for anything in the Distributed Area. If you aren't sure, use this one. >non-issue labels Mar 11, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (:Distributed/Distributed)

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left more comments. Thanks for iterating with me on this.

builder.endArray();
}
builder.endObject();
API_PARAMS.put("flat_settings", params.param("flat_settings"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This mutates the global variable on every call? Certainly not what we want?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed API_PARAMS as suggested so this is no longer relevant.

builder.endObject();
// meta data
if (metrics.contains(Metric.METADATA)) {
metaData.toXContent(builder, new ToXContent.MapParams(API_PARAMS));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of passing down a limited set of parameters here, I wonder if we should pass all of them, but additionally inject the API context.

Given that Metadata is already using API context as default, I wonder if we need to do anything here in this class. Perhaps it's sufficient to just pass down the params as is?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. The delegation refactoring ultimately do away the need for selecting params in this layer. I've removed it.

@@ -1318,7 +1360,21 @@ public static IndexMetaData fromXContent(XContentParser parser) throws IOExcepti
throw new IllegalArgumentException("Unexpected token: " + token);
}
}
} else if ("warmers".equals(currentFieldName)) {
} else if (KEY_PRIMARY_TERMS.equals(currentFieldName)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should avoid being able to read back the API-serialized XContent. There is no requirement to do so, and it introduces leniency in the parsing here, with the possibility of silently hiding bugs where the wrong context is chosen for serialization.

builder.endObject();
return Strings.toString(builder);
}

public static void toXContent(MetaData metaData, XContentBuilder builder, ToXContent.Params params) throws IOException {
XContentContext context = XContentContext.valueOf(params.param(CONTEXT_MODE_PARAM, "API"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use CONTEXT_MODE_API constant here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

builder.field("cluster_uuid", metaData.clusterUUID);
// This is required for some serialization under API context, as some tests in MetaDataTests assert its presenc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand this comment? Is it outdated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes sorry. This was added in between the commits, as was meant to say that ClusterState.toXContent will now start to emit cluster_uuid_committed in its serialization, as I retried to remove cluster_uuid_committed before to keep ClusterState.toXContent serialization unchanged, but had to put back due to the test asserting it.

@@ -1347,7 +1345,8 @@ public static MetaData fromXContent(XContentParser parser, boolean preserveUnkno
currentFieldName = parser.currentName();
}

if (!"meta-data".equals(parser.currentName())) {
if (!"meta-data".equals(parser.currentName()) &&
!"metadata".equals(parser.currentName())) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As commented above, let's remove the lenient parsing here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. This also caused some test cases to be broken and removed as they relied on deserialization of API-serialized XContent 0c11613

builder.putAlias(AliasMetaData.builder(parser.text()).build());
}
}
else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move these elses up (lot's of other occurrences in this file)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed these changes as they are also for reading back API-serialized XContent. I did correct else placement in other places though.

@zacharymorn
Copy link
Contributor Author

I've left more comments. Thanks for iterating with me on this.

Sure no problem. I do see these changes to be prone to errors and bugs, and so very appreciate your patience and time for reviewing them carefully!

Once these changes are in good shape, I'll merge in master and resolve conflicts as well.

@ywelsch
Copy link
Contributor

ywelsch commented Mar 24, 2020

@elasticmachine ok to test

@ywelsch ywelsch self-requested a review March 24, 2020 13:49
@ywelsch
Copy link
Contributor

ywelsch commented Mar 24, 2020

@elasticmachine test this please

@@ -1451,7 +1452,13 @@ public static Settings addHumanReadableSettings(Settings settings) {
return builder.build();
}

private static final ToXContent.Params FORMAT_PARAMS = new MapParams(Collections.singletonMap("binary", "true"));
private static final ToXContent.Params FORMAT_PARAMS;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't realize we could use this to save the tests, good to know!

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ywelsch ywelsch merged commit dc0d35e into elastic:master Mar 25, 2020
ywelsch added a commit to ywelsch/elasticsearch that referenced this pull request Mar 25, 2020
…stic#48218 (elastic#52743)

Delegates the toXContent serialization logic from the ClusterState class to its member classes,
making the code in MetaData, IndexMetaData, ... aware of the serialization context (API vs
GATEWAY). This makes it clearer what the differences are between API and GATEWAY
serialization. Long-term, both should be unified.

Co-authored-by: Yannick Welsch <yannick@welsch.lu>
ywelsch added a commit that referenced this pull request Mar 26, 2020
…4192)

Backport of #52743

Co-authored-by: Yannick Welsch <yannick@welsch.lu>
ywelsch added a commit that referenced this pull request Mar 26, 2020
Realigns master with 7.x
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed/Distributed A catch all label for anything in the Distributed Area. If you aren't sure, use this one. >non-issue v7.8.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants