Fix duplicate descriptions and short names #6371

Merged 1 commit into master on Apr 16, 2020

florimondmanca
Contributor

@florimondmanca florimondmanca commented Apr 16, 2020

What does this PR do?

Fix duplicate descriptions and short names in confluent_platform/metadata.csv.

Motivation

Refs #5803, which added validate metadata --check-duplicates.

confluent-platform contributed a lot of errors, so I started with it (see below), but there are still 968 warnings that need fixing before that flag can be enforced.

$ ddev validate metadata confluent_platform --check-duplicates
confluent_platform:119 `The fraction of time the I/O thread spent doing I/O` is a duplicate description
confluent_platform:120 `The fraction of time the I/O thread spent waiting` is a duplicate description
confluent_platform:127 `Fetch requests may be throttled to meet quotas configured on the origin cluster. If these are non-zero, it indicates that the origin brokers are slowing the consumer down and the quotas configuration should be reviewed. For more information on quotas see Enforcing Client Quotas` is a duplicate description
confluent_platform:128 `The current number of active connections.` is a duplicate description
confluent_platform:129 `The number of network operations (reads or writes) on all connections per second` is a duplicate description
confluent_platform:130 `The number of requests sent per second` is a duplicate description
confluent_platform:131 `The number of responses received per second` is a duplicate description
confluent_platform:132 `The average request latency in ms` is a duplicate description
confluent_platform:134 `consumer fetch size avg` is a duplicate short_name
confluent_platform:135 `consumer fetch size max` is a duplicate short_name
confluent_platform:135 `The average number of bytes fetched per request.` is a duplicate description
confluent_platform:139 `consumer fetch rate` is a duplicate short_name
confluent_platform:139 `The number of fetch requests per second.` is a duplicate description
confluent_platform:142 `consumer fetch throttle time avg` is a duplicate short_name
confluent_platform:143 `consumer fetch throttle time max` is a duplicate short_name
confluent_platform:217 `Total number of active TCP connections.` is a duplicate description
confluent_platform:218 `The average rate per second of opened TCP connections.` is a duplicate description
confluent_platform:219 `The average rate per second of closed TCP connections.` is a duplicate description
confluent_platform:221 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:222 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:223 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:224 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:225 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:226 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:227 `consumer create err rate` is a duplicate short_name
confluent_platform:227 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:228 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:229 `consumer delete err rate` is a duplicate short_name
confluent_platform:229 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:230 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:231 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:232 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:233 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:234 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:235 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:236 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:237 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:238 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:239 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:240 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:241 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:242 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:243 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:244 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:245 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:246 `partition get err rate` is a duplicate short_name
confluent_platform:246 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:247 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:248 `partition produce avro err rate` is a duplicate short_name
confluent_platform:248 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:249 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:250 `partition produce binary err rate` is a duplicate short_name
confluent_platform:250 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:251 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:252 `partition produce json err rate` is a duplicate short_name
confluent_platform:252 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:253 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:254 `partitions list err rate` is a duplicate short_name
confluent_platform:254 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:255 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:256 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:257 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:258 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:259 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:260 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:261 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:262 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:263 `brokers list err rate` is a duplicate short_name
confluent_platform:263 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:264 `consumer assign err rate` is a duplicate short_name
confluent_platform:264 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:265 `consumer assignment err rate` is a duplicate short_name
confluent_platform:265 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:266 `consumer commit offsets err rate` is a duplicate short_name
confluent_platform:266 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:267 `consumer commit err rate` is a duplicate short_name
confluent_platform:267 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:268 `consumer committed offsets err rate` is a duplicate short_name
confluent_platform:268 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:269 `consumer create err rate` is a duplicate short_name
confluent_platform:269 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:270 `consumer create err rate` is a duplicate short_name
confluent_platform:270 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:271 `consumer delete err rate` is a duplicate short_name
confluent_platform:271 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:272 `consumer delete err rate` is a duplicate short_name
confluent_platform:272 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:273 `consumer records read avro err rate` is a duplicate short_name
confluent_platform:273 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:274 `consumer records read binary err rate` is a duplicate short_name
confluent_platform:274 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:275 `consumer records read json err rate` is a duplicate short_name
confluent_platform:275 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:276 `consumer seek to beginning err rate` is a duplicate short_name
confluent_platform:276 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:277 `consumer seek to end err rate` is a duplicate short_name
confluent_platform:277 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:278 `consumer seek to offset err rate` is a duplicate short_name
confluent_platform:278 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:279 `consumer subscribe err rate` is a duplicate short_name
confluent_platform:279 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:280 `consumer subscription err rate` is a duplicate short_name
confluent_platform:280 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:281 `consumer topic read avro err rate` is a duplicate short_name
confluent_platform:281 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:282 `consumer topic read binary err rate` is a duplicate short_name
confluent_platform:282 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:283 `consumer topic read json err rate` is a duplicate short_name
confluent_platform:283 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:284 `consumer unsubscribe err rate` is a duplicate short_name
confluent_platform:284 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:285 `partition consume avro err rate` is a duplicate short_name
confluent_platform:285 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:286 `partition consume binary err rate` is a duplicate short_name
confluent_platform:286 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:287 `partition consume json err rate` is a duplicate short_name
confluent_platform:287 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:288 `partition get err rate` is a duplicate short_name
confluent_platform:288 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:289 `partition get err rate` is a duplicate short_name
confluent_platform:289 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:290 `partition produce avro err rate` is a duplicate short_name
confluent_platform:290 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:291 `partition produce avro err rate` is a duplicate short_name
confluent_platform:291 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:292 `partition produce binary err rate` is a duplicate short_name
confluent_platform:292 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:293 `partition produce binary err rate` is a duplicate short_name
confluent_platform:293 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:294 `partition produce json err rate` is a duplicate short_name
confluent_platform:294 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:295 `partition produce json err rate` is a duplicate short_name
confluent_platform:295 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:296 `partitions list err rate` is a duplicate short_name
confluent_platform:296 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:297 `partitions list err rate` is a duplicate short_name
confluent_platform:297 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:298 `err rate` is a duplicate short_name
confluent_platform:298 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:299 `root get err rate` is a duplicate short_name
confluent_platform:299 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:300 `root post err rate` is a duplicate short_name
confluent_platform:300 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:301 `topic get err rate` is a duplicate short_name
confluent_platform:301 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:302 `topic produce avro err rate` is a duplicate short_name
confluent_platform:302 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:303 `topic produce binary err rate` is a duplicate short_name
confluent_platform:303 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:304 `topic produce json err rate` is a duplicate short_name
confluent_platform:304 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:305 `topics list err rate` is a duplicate short_name
confluent_platform:305 `The average number of requests per second that resulted in HTTP error responses` is a duplicate description
confluent_platform:343 `The average value of commit-latency.` is a duplicate description
Validated!
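
For readers unfamiliar with the check, here is a minimal sketch of how duplicate detection over a metadata CSV could work. It is not the ddev implementation: the function name `find_duplicates`, the sample rows, and the choice to flag only the second and later occurrences of a value are assumptions made for illustration. The only things taken from the output above are the two columns being checked (`short_name` and `description`) and the `line number` + value style of the report.

```python
import csv
import io


def find_duplicates(csv_text, columns=("short_name", "description")):
    """Return (line_number, column, value) for each repeated value.

    Assumption: the first occurrence of a value is accepted and only
    later occurrences are reported.
    """
    seen = {column: set() for column in columns}
    warnings = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # The header is line 1 of the file, so data rows start at line 2.
    for line_number, row in enumerate(reader, start=2):
        for column in columns:
            value = row.get(column) or ""
            if not value:
                continue  # ignore empty cells
            if value in seen[column]:
                warnings.append((line_number, column, value))
            else:
                seen[column].add(value)
    return warnings


# Hypothetical sample data, not taken from confluent_platform/metadata.csv.
sample = (
    "metric_name,description,short_name\n"
    "a.x,Desc one,name one\n"
    "a.y,Desc one,name two\n"
    "a.z,Desc three,name one\n"
)

for line_number, column, value in find_duplicates(sample):
    print(f"sample:{line_number} `{value}` is a duplicate {column}")
```

Keeping one `set` per column makes the check a single pass over the file, which matters little for one integration but helps when validating every `metadata.csv` in a repository.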

Additional Notes

Used this script to generate the metric names for HTTP errors:

metrics = """
confluent.kafka.rest.jersey.brokers.list.request_error_rate
confluent.kafka.rest.jersey.consumer.assign_v2.request_error_rate
confluent.kafka.rest.jersey.consumer.assignment_v2.request_error_rate
confluent.kafka.rest.jersey.consumer.commit_offsets_v2.request_error_rate
confluent.kafka.rest.jersey.consumer.commit.request_error_rate
confluent.kafka.rest.jersey.consumer.committed_offsets_v2.request_error_rate
confluent.kafka.rest.jersey.consumer.create_v2.request_error_rate
confluent.kafka.rest.jersey.consumer.create.request_error_rate
confluent.kafka.rest.jersey.consumer.delete_v2.request_error_rate
confluent.kafka.rest.jersey.consumer.delete.request_error_rate
confluent.kafka.rest.jersey.consumer.records.read_avro_v2.request_error_rate
confluent.kafka.rest.jersey.consumer.records.read_binary_v2.request_error_rate
confluent.kafka.rest.jersey.consumer.records.read_json_v2.request_error_rate
confluent.kafka.rest.jersey.consumer.seek_to_beginning_v2.request_error_rate
confluent.kafka.rest.jersey.consumer.seek_to_end_v2.request_error_rate
confluent.kafka.rest.jersey.consumer.seek_to_offset_v2.request_error_rate
confluent.kafka.rest.jersey.consumer.subscribe_v2.request_error_rate
confluent.kafka.rest.jersey.consumer.subscription_v2.request_error_rate
confluent.kafka.rest.jersey.consumer.topic.read_avro.request_error_rate
confluent.kafka.rest.jersey.consumer.topic.read_binary.request_error_rate
confluent.kafka.rest.jersey.consumer.topic.read_json.request_error_rate
confluent.kafka.rest.jersey.consumer.unsubscribe_v2.request_error_rate
confluent.kafka.rest.jersey.partition.consume_avro.request_error_rate
confluent.kafka.rest.jersey.partition.consume_binary.request_error_rate
confluent.kafka.rest.jersey.partition.consume_json.request_error_rate
confluent.kafka.rest.jersey.partition.get_v2.request_error_rate
confluent.kafka.rest.jersey.partition.get.request_error_rate
confluent.kafka.rest.jersey.partition.produce_avro_v2.request_error_rate
confluent.kafka.rest.jersey.partition.produce_avro.request_error_rate
confluent.kafka.rest.jersey.partition.produce_binary_v2.request_error_rate
confluent.kafka.rest.jersey.partition.produce_binary.request_error_rate
confluent.kafka.rest.jersey.partition.produce_json_v2.request_error_rate
confluent.kafka.rest.jersey.partition.produce_json.request_error_rate
confluent.kafka.rest.jersey.partitions.list_v2.request_error_rate
confluent.kafka.rest.jersey.partitions.list.request_error_rate
confluent.kafka.rest.jersey.request_error_rate
confluent.kafka.rest.jersey.root.get.request_error_rate
confluent.kafka.rest.jersey.root.post.request_error_rate
confluent.kafka.rest.jersey.topic.get.request_error_rate
confluent.kafka.rest.jersey.topic.produce_avro.request_error_rate
confluent.kafka.rest.jersey.topic.produce_binary.request_error_rate
confluent.kafka.rest.jersey.topic.produce_json.request_error_rate
confluent.kafka.rest.jersey.topics.list.request_error_rate
confluent.kafka.schema.registry.jersey.brokers.list.request_error_rate
confluent.kafka.schema.registry.jersey.consumer.assign_v2.request_error_rate
confluent.kafka.schema.registry.jersey.consumer.assignment_v2.request_error_rate
confluent.kafka.schema.registry.jersey.consumer.commit_offsets_v2.request_error_rate
confluent.kafka.schema.registry.jersey.consumer.commit.request_error_rate
confluent.kafka.schema.registry.jersey.consumer.committed_offsets_v2.request_error_rate
confluent.kafka.schema.registry.jersey.consumer.create_v2.request_error_rate
confluent.kafka.schema.registry.jersey.consumer.create.request_error_rate
confluent.kafka.schema.registry.jersey.consumer.delete_v2.request_error_rate
confluent.kafka.schema.registry.jersey.consumer.delete.request_error_rate
confluent.kafka.schema.registry.jersey.consumer.records.read_avro_v2.request_error_rate
confluent.kafka.schema.registry.jersey.consumer.records.read_binary_v2.request_error_rate
confluent.kafka.schema.registry.jersey.consumer.records.read_json_v2.request_error_rate
confluent.kafka.schema.registry.jersey.consumer.seek_to_beginning_v2.request_error_rate
confluent.kafka.schema.registry.jersey.consumer.seek_to_end_v2.request_error_rate
confluent.kafka.schema.registry.jersey.consumer.seek_to_offset_v2.request_error_rate
confluent.kafka.schema.registry.jersey.consumer.subscribe_v2.request_error_rate
confluent.kafka.schema.registry.jersey.consumer.subscription_v2.request_error_rate
confluent.kafka.schema.registry.jersey.consumer.topic.read_avro.request_error_rate
confluent.kafka.schema.registry.jersey.consumer.topic.read_binary.request_error_rate
confluent.kafka.schema.registry.jersey.consumer.topic.read_json.request_error_rate
confluent.kafka.schema.registry.jersey.consumer.unsubscribe_v2.request_error_rate
confluent.kafka.schema.registry.jersey.partition.consume_avro.request_error_rate
confluent.kafka.schema.registry.jersey.partition.consume_binary.request_error_rate
confluent.kafka.schema.registry.jersey.partition.consume_json.request_error_rate
confluent.kafka.schema.registry.jersey.partition.get_v2.request_error_rate
confluent.kafka.schema.registry.jersey.partition.get.request_error_rate
confluent.kafka.schema.registry.jersey.partition.produce_avro_v2.request_error_rate
confluent.kafka.schema.registry.jersey.partition.produce_avro.request_error_rate
confluent.kafka.schema.registry.jersey.partition.produce_binary_v2.request_error_rate
confluent.kafka.schema.registry.jersey.partition.produce_binary.request_error_rate
confluent.kafka.schema.registry.jersey.partition.produce_json_v2.request_error_rate
confluent.kafka.schema.registry.jersey.partition.produce_json.request_error_rate
confluent.kafka.schema.registry.jersey.partitions.list_v2.request_error_rate
confluent.kafka.schema.registry.jersey.partitions.list.request_error_rate
confluent.kafka.schema.registry.jersey.request_error_rate
confluent.kafka.schema.registry.jersey.root.get.request_error_rate
confluent.kafka.schema.registry.jersey.root.post.request_error_rate
confluent.kafka.schema.registry.jersey.topic.get.request_error_rate
confluent.kafka.schema.registry.jersey.topic.produce_avro.request_error_rate
confluent.kafka.schema.registry.jersey.topic.produce_binary.request_error_rate
confluent.kafka.schema.registry.jersey.topic.produce_json.request_error_rate
confluent.kafka.schema.registry.jersey.topics.list.request_error_rate
""".strip().splitlines()

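def verbosify(metric: str) -> str:
    """Return the words between 'jersey' and 'request_error_rate', space-separated."""
    parts = metric.split('.')
    jersey_index = parts.index('jersey')
    request_error_rate_index = parts.index('request_error_rate')
    # Keep only the endpoint segments, e.g. ['consumer', 'create_v2'].
    parts = parts[jersey_index + 1:request_error_rate_index]
    parts = [part.replace('_', ' ') for part in parts]
    return ' '.join(parts)

items = [verbosify(metric) for metric in metrics]

descriptions = []
for metric, item in zip(metrics, items):
    # kafka.rest metrics are described as HTTP requests, the others as operations.
    resource = "HTTP requests" if "kafka.rest" in metric else "operations"
    # The top-level metric has no endpoint segment, so no extra space is added.
    descriptions.append(f"The average rate of failed {item + (' ' if item else '')}{resource}")

for d in descriptions:
    print(d)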
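
As a quick sanity check of the approach, the sketch below applies the same transformation to four of the metric names listed above and confirms the resulting descriptions are distinct. The `describe` helper is a hypothetical wrapper that combines the two steps of the script; it is not part of the PR.

```python
def describe(metric: str) -> str:
    """Hypothetical helper: build one description the same way the script does."""
    parts = metric.split('.')
    # Words between 'jersey' and 'request_error_rate', underscores turned into spaces.
    start = parts.index('jersey') + 1
    end = parts.index('request_error_rate')
    item = ' '.join(part.replace('_', ' ') for part in parts[start:end])
    resource = "HTTP requests" if "kafka.rest" in metric else "operations"
    return f"The average rate of failed {item + (' ' if item else '')}{resource}"


sample_metrics = [
    "confluent.kafka.rest.jersey.consumer.create_v2.request_error_rate",
    "confluent.kafka.rest.jersey.consumer.create.request_error_rate",
    "confluent.kafka.rest.jersey.request_error_rate",
    "confluent.kafka.schema.registry.jersey.request_error_rate",
]

sample_descriptions = [describe(metric) for metric in sample_metrics]
for description in sample_descriptions:
    print(description)

# Every generated description should be distinct.
assert len(set(sample_descriptions)) == len(sample_descriptions)
```

The `_v2` suffix survives as the word `v2`, which is what keeps the versioned and unversioned endpoints from collapsing into the same description.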

Review checklist (to be filled by reviewers)

  • Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • PR title must be written as a CHANGELOG entry (see why)
  • File changes must correspond to the primary purpose of the PR as described in the title (small unrelated changes should have their own PR)
  • PR must have changelog/ and integration/ labels attached

@florimondmanca florimondmanca force-pushed the florimondmanca/cp-dedup-metadatacsv branch from 8167118 to aafdd7d Compare April 16, 2020 15:23
Member

@AlexandreYang AlexandreYang left a comment

LGTM Thx 🙇

@florimondmanca florimondmanca merged commit 25f06ec into master Apr 16, 2020
@florimondmanca florimondmanca deleted the florimondmanca/cp-dedup-metadatacsv branch April 16, 2020 15:44