GenAI project and code organization in Python #2858

Open
lmolkova opened this issue Sep 9, 2024 · 1 comment

lmolkova commented Sep 9, 2024

We're trying to find a home for GenAI-related instrumentation libraries (related to the open-telemetry/community#2326 project).

TL;DR: we need to host OTel GenAI instrumentations somewhere in OTel. We're going to evolve them along with the semantic conventions, so we need to ship them more frequently than other contrib components, and with their own versions.

Where instrumentations live

Based on the discussions in Python SIG, we have the following options:

Option 1. In a separate repo

Pros:

  • Easy to have different release cadence / versioning
  • Gives more autonomy to GenAI community

Cons:

  • Harder to coordinate releases
  • Could lead to duplication for GenAI-oriented database instrumentations and other areas
  • Harder to share code between GenAI and contrib instrumentations, or to add GenAI instrumentation to a generic distro
  • Harder to enforce common policies and make sure that genai instrumentations are aligned with the rest of the ecosystem

Option 2. In the contrib, manual release per-component

Pros:

  • Resolves all the cons of Option 1
  • May improve general contrib tooling and related instrumentations (e.g. HTTP)

Cons:

  • Less autonomy
  • Manual releases have to be done by a maintainer with PyPI permissions, which does not scale

Option 3. In the contrib, but as a separate group of instrumentations

  • Needs some tooling changes to support a different release schedule and versioning
  • Needs some process changes to support per-component ownership with approval powers

I'd like to explore Option 3 since it can potentially satisfy all the needs.


lmolkova commented Sep 10, 2024

Option 3 proposal:

The scope is initially limited to any new instrumentation (gen-ai or not) and can be expanded to old instrumentations later.

Support per-package ownership with approval powers

Use component-owners and CODEOWNERS to assign reviewers AND to allow a hypothetical python-genai-approvers group to approve changes.

  • To decide: should we require approval from the general opentelemetry-python-contrib-approvers too? If so, we'll need to require two approvals for everything.

Support per-package release model

Per-package ownership/versioning/releases are supported by otel-dotnet-contrib and otel-js-contrib; we can learn from them.

Each new component will have an individual changelog

The central changelog may:

  • just mention that there are component-specific changelogs, or
  • go away entirely if we fully switch to per-component changelogs, as JS-contrib did. We should consider changelog automation tooling to help.

Each new component will have an individual version

  • The version will be either set manually when preparing a new release OR determined by release automation such as "Release Please".
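To illustrate the automated option: Release Please derives the next version from Conventional Commits messages. The sketch below is a simplified illustration of that idea, not Release Please's actual implementation; `bump_kind` and `next_version` are hypothetical helper names, and special handling for pre-1.0 versions is deliberately left out.

```python
import re

# Matches a Conventional Commits header such as "feat(scope)!: message".
_HEADER = re.compile(r"^(?P<type>\w+)(\([^)]*\))?(?P<bang>!)?: ")


def bump_kind(commit_messages):
    """Return 'major', 'minor', 'patch', or None for a list of commit messages."""
    kind = None
    for msg in commit_messages:
        match = _HEADER.match(msg)
        if not match:
            continue  # Not a conventional commit; ignored in this sketch.
        if match.group("bang") or "BREAKING CHANGE:" in msg:
            return "major"
        if match.group("type") == "feat":
            kind = "minor"
        elif match.group("type") == "fix" and kind is None:
            kind = "patch"
    return kind


def next_version(current, kind):
    """Apply a bump kind to a plain 'X.Y.Z' version string."""
    major, minor, patch = (int(part) for part in current.split("."))
    if kind == "major":
        return f"{major + 1}.0.0"
    if kind == "minor":
        return f"{major}.{minor + 1}.0"
    if kind == "patch":
        return f"{major}.{minor}.{patch + 1}"
    return current  # No releasable changes.
```

With this approach, a component's version only moves when commits touching it call for a release, which is what makes per-component versioning workable without manual bookkeeping.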

We should be able to release components individually or all-together

  • Release automation can decide which components to release by looking at
    • which components had their version modified (if versions are updated manually)
    • or which components have changes since their last release (if using release-please or similar automation)
  • To decide: when common contrib components are released, should everything that depends on them also be released?
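As a rough sketch of the manual-version variant, release automation could compare each component's current version with the last released one and, optionally, also pick up dependents. Everything here is hypothetical: the function name, the component names, and the `cascade` flag (which stands in for the open "to decide" question) are illustrative assumptions, not existing contrib tooling.

```python
def components_to_release(current, released, depends_on=None, cascade=False):
    """Select components whose version differs from the last released one.

    current / released: dict mapping component name -> version string.
    depends_on: dict mapping component name -> set of components it depends on.
    cascade: if True, also select anything depending on a selected component.
    """
    depends_on = depends_on or {}
    selected = {
        name for name, version in current.items()
        if released.get(name) != version
    }
    if cascade:
        # Repeat until no new dependents are picked up (handles chains).
        changed = True
        while changed:
            changed = False
            for name, deps in depends_on.items():
                if name not in selected and selected & set(deps):
                    selected.add(name)
                    changed = True
    return sorted(selected)


# Hypothetical component names, for illustration only.
current = {"util-common": "0.2.0", "genai-example": "1.0.0", "http-example": "0.5.0"}
released = {"util-common": "0.1.0", "genai-example": "1.0.0", "http-example": "0.5.0"}
deps = {"genai-example": {"util-common"}}

print(components_to_release(current, released))
print(components_to_release(current, released, deps, cascade=True))
```

Note that a dependent selected only through `cascade` still has an unchanged version, so the automation would also need to bump it before publishing.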

Would it be harder to implement Option 3 compared to a new repo?

If we create a new repo:

  • we will still need to learn how the existing tooling works and adapt it to the new repo
  • if we copy the existing tooling over as is:
    • we won't have the ability to release individual libs on different schedules
    • we won't have the ability to have different versions of libs

Given these limitations and the complexity of creating a new repo and revisiting tooling decisions, I think there is no immediate gain in having a separate repo.

I propose trying release-please for new components, including GenAI, and expanding it to existing instrumentations later.
