Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Codegen: proper casing for multiword attribute/metric names #599

Closed
lmolkova opened this issue Dec 12, 2023 · 4 comments
Closed

Codegen: proper casing for multiword attribute/metric names #599

lmolkova opened this issue Dec 12, 2023 · 4 comments
Assignees

Comments

@lmolkova
Copy link
Contributor

lmolkova commented Dec 12, 2023

We have a bunch of product names that consist of 2+ words as namespaces without _ between them:

  • cloudevents
  • webengine
  • cosmosdb
  • opentracing
    ...

they look fine in lowercase (cloudevents.event_id), but are not friendly to code generation.
E.g. Java and .NET would use CloudEvents.CLOUDEVENTS_EVENT_ID or CloudEvents.EventId constants.

Also, since code generation removes underscores at least for some languages, we should not allow to define namespaces/names that differ in _ only: e.g. cloud_events cannot be defined to avoid collision with cloudevents in generated code.

@lmolkova
Copy link
Contributor Author

lmolkova commented Dec 12, 2023

Options:

  1. Do nothing - if the product name looks ugly as Oneword, break it down to two_words ( -> TwoWords).
    • cons: it won't be possible to match original project/product name
  2. Define overrides centrally in build-tools
    • cons: build-tools + codegen happens too late (after new semconv version is released) and we'd need to update build-tools every time when new oneword namespace is introduced
  3. Add something like case_sensitive_name property to attribute/metric definition and allow semconv authors/owners to decide how codegen (or maybe backend visualizations) should look. By default, it would be oneword -> Oneword and two_words -> TwoWords
    • cons: we're introducing multiple representations of attribute/metric name
  4. Mix of 2 and 3
    • keep overrides inside semconv repo, but separate from attribute definition (e.g. as a config option for build-tools)
    • have a sample codegen as a part of semconv repo CI check
    • use the result of this run as an example demonstrating how attribute name would look like in codegen

@jsuereth
Copy link
Contributor

Do nothing - if the product name looks ugly as Oneword, break it down to two_words ( -> TwoWords).
cons: it won't be possible to match original project/product name

What does this mean?

My vote, although I may not understand the problem well enough, would be to just make the multiword products look awkward (only capitalize first letter) and also to have some kind of verification run on semconv repo (as part of PR submit) that can find conflicts that would prevent codegen.

I think that's option 1?

@trask
Copy link
Member

trask commented Feb 13, 2024

I think the (lowercase) attribute value is the important part. I think it's less important that the (uppercase) constant name or the (mixed) class name are pretty.

e.g., I think this would be ok:

  • db.system attribute value = cosmosdb
  • code constant = SemanticAttributes.COSMOSDB
  • code class name = Cosmosdb

@joaopgrassi joaopgrassi removed the Stale label Feb 14, 2024
@lmolkova
Copy link
Contributor Author

Thanks for the input. So, let's go ahead with option 1 - do nothing for the time being.

I don't think we need to add codegen checks.
Let's say we have cloudevents.* attribute - jinja template in language repo decides how it should be capitalized. It's fully in their hands and we can't test it centrally.

If we renamed cloudevents to cloud_events, that would be a breaking change on the spec level that we can detect without codegen (open-telemetry/build-tools#271).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants