ClusterState's toXContent (imperfectly) duplicates logic from other classes' toXContent #48218

gwbrown · 2019-10-17T23:07:39Z

ClusterState.toXContent method serializes IndexMetaData itself, rather than delegating to IndexMetaData.toXContent. This has resulted in some additions to IndexMetaData, such as RolloverInfo, not being included in the serialized cluster state returned by GET _cluster/state. There is also duplicated logic for MetaData.toXContent (which does call `IndexMetaData.toXContent), although it appears that this might be so that the information can be presented in a different form.

The code in ClusterState which serializes IndexMetaData objects is found here:

elasticsearch/server/src/main/java/org/elasticsearch/cluster/ClusterState.java

Lines 459 to 505 in 6abacd2

    
           for (IndexMetaData indexMetaData : metaData()) { 
        
               builder.startObject(indexMetaData.getIndex().getName()); 
        
               builder.field("state", indexMetaData.getState().toString().toLowerCase(Locale.ENGLISH)); 
        
               builder.startObject("settings"); 
        
               Settings settings = indexMetaData.getSettings(); 
        
               settings.toXContent(builder, params); 
        
               builder.endObject(); 
        
               builder.startObject("mappings"); 
        
               for (ObjectObjectCursor<String, MappingMetaData> cursor : indexMetaData.getMappings()) { 
        
                   Map<String, Object> mapping = XContentHelper 
        
                           .convertToMap(new BytesArray(cursor.value.source().uncompressed()), false).v2(); 
        
                   if (mapping.size() == 1 && mapping.containsKey(cursor.key)) { 
        
                       // the type name is the root value, reduce it 
        
                       mapping = (Map<String, Object>) mapping.get(cursor.key); 
        
                   } 
        
                   builder.field(cursor.key); 
        
                   builder.map(mapping); 
        
               } 
        
               builder.endObject(); 
        
               builder.startArray("aliases"); 
        
               for (ObjectCursor<String> cursor : indexMetaData.getAliases().keys()) { 
        
                   builder.value(cursor.value); 
        
               } 
        
               builder.endArray(); 
        
               builder.startObject(IndexMetaData.KEY_PRIMARY_TERMS); 
        
               for (int shard = 0; shard < indexMetaData.getNumberOfShards(); shard++) { 
        
                   builder.field(Integer.toString(shard), indexMetaData.primaryTerm(shard)); 
        
               } 
        
               builder.endObject(); 
        
               builder.startObject(IndexMetaData.KEY_IN_SYNC_ALLOCATIONS); 
        
               for (IntObjectCursor<Set<String>> cursor : indexMetaData.getInSyncAllocationIds()) { 
        
                   builder.startArray(String.valueOf(cursor.key)); 
        
                   for (String allocationId : cursor.value) { 
        
                       builder.value(allocationId); 
        
                   } 
        
                   builder.endArray(); 
        
               } 
        
               builder.endObject(); 
        
               // index metadata 
        
               builder.endObject();

This duplicates, partially, IndexMetaData's toXContent:

elasticsearch/server/src/main/java/org/elasticsearch/cluster/metadata/IndexMetaData.java

Lines 1188 to 1248 in 6abacd2

    
           public static void toXContent(IndexMetaData indexMetaData, XContentBuilder builder, ToXContent.Params params) throws IOException { 
        
               builder.startObject(indexMetaData.getIndex().getName()); 
        
               builder.field(KEY_VERSION, indexMetaData.getVersion()); 
        
               builder.field(KEY_MAPPING_VERSION, indexMetaData.getMappingVersion()); 
        
               builder.field(KEY_SETTINGS_VERSION, indexMetaData.getSettingsVersion()); 
        
               builder.field(KEY_ALIASES_VERSION, indexMetaData.getAliasesVersion()); 
        
               builder.field(KEY_ROUTING_NUM_SHARDS, indexMetaData.getRoutingNumShards()); 
        
               builder.field(KEY_STATE, indexMetaData.getState().toString().toLowerCase(Locale.ENGLISH)); 
        
               boolean binary = params.paramAsBoolean("binary", false); 
        
               builder.startObject(KEY_SETTINGS); 
        
               indexMetaData.getSettings().toXContent(builder, new MapParams(Collections.singletonMap("flat_settings", "true"))); 
        
               builder.endObject(); 
        
               builder.startArray(KEY_MAPPINGS); 
        
               for (ObjectObjectCursor<String, MappingMetaData> cursor : indexMetaData.getMappings()) { 
        
                   if (binary) { 
        
                       builder.value(cursor.value.source().compressed()); 
        
                   } else { 
        
                       builder.map(XContentHelper.convertToMap(new BytesArray(cursor.value.source().uncompressed()), true).v2()); 
        
                   } 
        
               } 
        
               builder.endArray(); 
        
               for (ObjectObjectCursor<String, DiffableStringMap> cursor : indexMetaData.customData) { 
        
                   builder.field(cursor.key); 
        
                   builder.map(cursor.value); 
        
               } 
        
               builder.startObject(KEY_ALIASES); 
        
               for (ObjectCursor<AliasMetaData> cursor : indexMetaData.getAliases().values()) { 
        
                   AliasMetaData.Builder.toXContent(cursor.value, builder, params); 
        
               } 
        
               builder.endObject(); 
        
               builder.startArray(KEY_PRIMARY_TERMS); 
        
               for (int i = 0; i < indexMetaData.getNumberOfShards(); i++) { 
        
                   builder.value(indexMetaData.primaryTerm(i)); 
        
               } 
        
               builder.endArray(); 
        
               builder.startObject(KEY_IN_SYNC_ALLOCATIONS); 
        
               for (IntObjectCursor<Set<String>> cursor : indexMetaData.inSyncAllocationIds) { 
        
                   builder.startArray(String.valueOf(cursor.key)); 
        
                   for (String allocationId : cursor.value) { 
        
                       builder.value(allocationId); 
        
                   } 
        
                   builder.endArray(); 
        
               } 
        
               builder.endObject(); 
        
               builder.startObject(KEY_ROLLOVER_INFOS); 
        
               for (ObjectCursor<RolloverInfo> cursor : indexMetaData.getRolloverInfos().values()) { 
        
                   cursor.value.toXContent(builder, params); 
        
               } 
        
               builder.endObject(); 
        
               builder.endObject(); 
        
           }

Here's the code in ClusterState which serializes part of MetaData:

elasticsearch/server/src/main/java/org/elasticsearch/cluster/ClusterState.java

Lines 420 to 451 in 6abacd2

    
           if (metrics.contains(Metric.METADATA)) { 
        
               builder.startObject("metadata"); 
        
               builder.field("cluster_uuid", metaData().clusterUUID()); 
        
               builder.startObject("cluster_coordination"); 
        
               coordinationMetaData().toXContent(builder, params); 
        
               builder.endObject(); 
        
               builder.startObject("templates"); 
        
               for (ObjectCursor<IndexTemplateMetaData> cursor : metaData().templates().values()) { 
        
                   IndexTemplateMetaData templateMetaData = cursor.value; 
        
                   builder.startObject(templateMetaData.name()); 
        
                   builder.field("index_patterns", templateMetaData.patterns()); 
        
                   builder.field("order", templateMetaData.order()); 
        
                   builder.startObject("settings"); 
        
                   Settings settings = templateMetaData.settings(); 
        
                   settings.toXContent(builder, params); 
        
                   builder.endObject(); 
        
                   builder.startObject("mappings"); 
        
                   for (ObjectObjectCursor<String, CompressedXContent> cursor1 : templateMetaData.mappings()) { 
        
                       Map<String, Object> mapping = XContentHelper.convertToMap(new BytesArray(cursor1.value.uncompressed()), false).v2(); 
        
                       if (mapping.size() == 1 && mapping.containsKey(cursor1.key)) { 
        
                           // the type name is the root value, reduce it 
        
                           mapping = (Map<String, Object>) mapping.get(cursor1.key); 
        
                       } 
        
                       builder.field(cursor1.key); 
        
                       builder.map(mapping); 
        
                   } 
        
                   builder.endObject();

It seems like it would be good to delegate this to the classes' own toXContent methods so that we don't duplicate logic and end up with data missing from the GET _cluster/state API.

The text was updated successfully, but these errors were encountered:

elasticmachine · 2019-10-17T23:07:41Z

Pinging @elastic/es-core-infra (:Core/Infra/Core)

zacharymorn · 2019-11-07T02:27:38Z

Hi @gwbrown, your description is clear and I can pick up this one.

I took a look at ClusterState.toXContent, and see that the idea of delegation may be applicable to other sections of the code, even the corresponding Class.toXContent is not there yet. For example:

elasticsearch/server/src/main/java/org/elasticsearch/cluster/ClusterState.java

Lines 384 to 408 in 72e2d16

    
           if (metrics.contains(Metric.BLOCKS)) { 
        
               builder.startObject("blocks"); 
        
               if (!blocks().global().isEmpty()) { 
        
                   builder.startObject("global"); 
        
                   for (ClusterBlock block : blocks().global()) { 
        
                       block.toXContent(builder, params); 
        
                   } 
        
                   builder.endObject(); 
        
               } 
        
               if (!blocks().indices().isEmpty()) { 
        
                   builder.startObject("indices"); 
        
                   for (ObjectObjectCursor<String, Set<ClusterBlock>> entry : blocks().indices()) { 
        
                       builder.startObject(entry.key); 
        
                       for (ClusterBlock block : entry.value) { 
        
                           block.toXContent(builder, params); 
        
                       } 
        
                       builder.endObject(); 
        
                   } 
        
                   builder.endObject(); 
        
               } 
        
               builder.endObject(); 
        
           }

There’s already ClusterBlock.toXContent, shall we also add ClusterBlocks.toXContent?

elasticsearch/server/src/main/java/org/elasticsearch/cluster/ClusterState.java

Lines 411 to 417 in 72e2d16

    
           if (metrics.contains(Metric.NODES)) { 
        
               builder.startObject("nodes"); 
        
               for (DiscoveryNode node : nodes) { 
        
                   node.toXContent(builder, params); 
        
               } 
        
               builder.endObject(); 
        
           }

There’s already DicoveryNode.toXContent, shall we also also DicoveryNodes.toXContent?

elasticsearch/server/src/main/java/org/elasticsearch/cluster/ClusterState.java

Lines 520 to 538 in 72e2d16

    
           if (metrics.contains(Metric.ROUTING_TABLE)) { 
        
               builder.startObject("routing_table"); 
        
               builder.startObject("indices"); 
        
               for (IndexRoutingTable indexRoutingTable : routingTable()) { 
        
                   builder.startObject(indexRoutingTable.getIndex().getName()); 
        
                   builder.startObject("shards"); 
        
                   for (IndexShardRoutingTable indexShardRoutingTable : indexRoutingTable) { 
        
                       builder.startArray(Integer.toString(indexShardRoutingTable.shardId().id())); 
        
                       for (ShardRouting shardRouting : indexShardRoutingTable) { 
        
                           shardRouting.toXContent(builder, params); 
        
                       } 
        
                       builder.endArray(); 
        
                   } 
        
                   builder.endObject(); 
        
                   builder.endObject(); 
        
               } 
        
               builder.endObject(); 
        
               builder.endObject(); 
        
           }

There’s already ShardRouting.toXContent, shall we also add RoutingTable.toXContent, IndexRoutingTable.toXContent & IndexShardRoutingTable.toXContent ?

elasticsearch/server/src/main/java/org/elasticsearch/cluster/ClusterState.java

Lines 541 to 560 in 72e2d16

    
           if (metrics.contains(Metric.ROUTING_NODES)) { 
        
               builder.startObject("routing_nodes"); 
        
               builder.startArray("unassigned"); 
        
               for (ShardRouting shardRouting : getRoutingNodes().unassigned()) { 
        
                   shardRouting.toXContent(builder, params); 
        
               } 
        
               builder.endArray(); 
        
               builder.startObject("nodes"); 
        
               for (RoutingNode routingNode : getRoutingNodes()) { 
        
                   builder.startArray(routingNode.nodeId() == null ? "null" : routingNode.nodeId()); 
        
                   for (ShardRouting shardRouting : routingNode) { 
        
                       shardRouting.toXContent(builder, params); 
        
                   } 
        
                   builder.endArray(); 
        
               } 
        
               builder.endObject(); 
        
               builder.endObject(); 
        
           }

There’s already ShardRouting.toXContent, shall we also add RoutingNodes.toXContent & RoutingNode.toXContent ?

Applying the same approach there would make that section of code more consistent, but may increase the effort to a large extent. What do you think ?

gwbrown · 2019-11-07T14:38:15Z

It would be a good idea to switch all of them eventually, at least for the ones that already have their own toXContent, unless we find a good reason to do otherwise (omitting sensitive data or something), but we can do one PR for each to split up the work, rather than having to do it all as a single chunk. Thanks for finding the other places where we duplicate this logic!

For the ones that don't already have a toXContent, it might be a good idea to add one and use it in ClusterState, just so that each class owns its own serialization logic, but I think that's a lower priority than the ones that have duplicated logic.

zacharymorn · 2019-11-07T16:03:09Z

Ok make sense. I'll raise a PR in a few days.

zacharymorn · 2019-11-10T08:31:11Z

@gwbrown I opened a PR and took a first stab at delegating logic from ClusterState.toXContent to IndexMetaData.toXContent. To see what has been changed from both code and serialization, I purposefully separated the commits and the delta can be seen directly here 4850acc .

Not surprisingly, there are quite some differences. Is it ok for us to assume that IndexMetaData.toXContent logic is what we should go with and disregard that of ClusterState.toXContent, or this may break clients depending on the existing serialization logic?

zacharymorn · 2019-11-12T07:34:49Z

I pushed up more commits to show the changes in code & serialization in stages:

bfc1103 Subtitute relevant logic in ClusterState.toXContent with IndexTemplateMetaData.toXContent
9ce5915 Subtitute relevant logic in ClusterState.toXContent with MetaData.toXContent

gwbrown · 2019-11-12T19:28:47Z

Thanks @zacharymorn, I'll take a look at the PR soon.

(#52743) Delegates the toXContent serialization logic from the ClusterState class to its member classes, making the code in MetaData, IndexMetaData, ... aware of the serialization context (API vs GATEWAY). This makes it clearer what the differences are between API and GATEWAY serialization. Long-term, both should be unified. Co-authored-by: Yannick Welsch <yannick@welsch.lu>

…stic#48218 (elastic#52743) Delegates the toXContent serialization logic from the ClusterState class to its member classes, making the code in MetaData, IndexMetaData, ... aware of the serialization context (API vs GATEWAY). This makes it clearer what the differences are between API and GATEWAY serialization. Long-term, both should be unified. Co-authored-by: Yannick Welsch <yannick@welsch.lu>

williamrandolph · 2021-01-07T16:07:05Z

This was fixed by #52743.

gwbrown added >bug :Core/Infra/Core Core issues without another label labels Oct 17, 2019

rjernst added the help wanted adoptme label Oct 21, 2019

zacharymorn mentioned this issue Nov 10, 2019

Delegate toXContent logic from ClusterState to its member classes #48218 #48924

Closed

rjernst added the Team:Core/Infra Meta label for core/infra team label May 4, 2020

rjernst added the needs:triage Requires assignment of a team area label label Dec 3, 2020

williamrandolph closed this as completed Jan 7, 2021

williamrandolph removed the needs:triage Requires assignment of a team area label label Jan 7, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ClusterState's toXContent (imperfectly) duplicates logic from other classes' toXContent #48218

ClusterState's toXContent (imperfectly) duplicates logic from other classes' toXContent #48218

gwbrown commented Oct 17, 2019

elasticmachine commented Oct 17, 2019

zacharymorn commented Nov 7, 2019

gwbrown commented Nov 7, 2019

zacharymorn commented Nov 7, 2019

zacharymorn commented Nov 10, 2019

zacharymorn commented Nov 12, 2019

gwbrown commented Nov 12, 2019

williamrandolph commented Jan 7, 2021

ClusterState's toXContent (imperfectly) duplicates logic from other classes' toXContent #48218

ClusterState's toXContent (imperfectly) duplicates logic from other classes' toXContent #48218

Comments

gwbrown commented Oct 17, 2019

elasticmachine commented Oct 17, 2019

zacharymorn commented Nov 7, 2019

gwbrown commented Nov 7, 2019

zacharymorn commented Nov 7, 2019

zacharymorn commented Nov 10, 2019

zacharymorn commented Nov 12, 2019

gwbrown commented Nov 12, 2019

williamrandolph commented Jan 7, 2021