Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Revamp the repository modules structure #232

Closed
wants to merge 22 commits into from

Conversation

llucax
Copy link
Contributor

@llucax llucax commented Nov 2, 2023

This is based in #231, so please only look at the last commit. Also it might be useful to just see the branch code instead of the diff: https://github.com/llucax/frequenz-channels-python/tree/structure-rfc/src/frequenz/channels

These are the most notable changes:

  • The _base_classes modules is split into the _receiver and _sender modules.

  • Sender and receiver exceptions are moved from the _exceptions module to the new _sender and _receiver modules.

  • The frequenz.channel package now only exposes the new _receiver and _sender modules, and the _exceptions and _select modules.

  • All channels and receiver modules are moved to the frequenz.channels package and made public.

  • All public nested classes were moved to the top level of the corresponding module.

Advantages of this structure:

  • It completely removes circular dependencies.

  • It avoids importing unnecessary code. In Python importing means code execution, so even when it is done only at startup, it adds some overhead.

    Also by not importing unnecessary code, we can potentially add real optional dependencies. For example, if a project doesn't need to use a file watcher, they could avoid pulling the unnecessary awatch dependency. This is not done in this PR, but it could be done in the future.

  • By having the channels and receivers on their own module we can move public nested classes were moved to the top level of the corresponding module withough having to add superflous prefixes for support classes.

  • Removing nested classes avoids having to use hacky constructions, like requiring the use of from __future__ import annotations, types as strings (nested classes) and confusing the mkdocstrings tools when extracting and cross-linking docs.

  • The frequenz.channels package exposes all classes that are used once you have setted up your channels, so the importing should still be pretty terse in most cases and only frequenz.channels would need to be imported in modules only taking some receivers and iterating or selecting over them.

  • Makes docs easier to navigate/discover, as the Table of contents becomes much smaller for each module and the module list is shown in the left.

    Old:

    Old

    vs. New:

    image

There is no need to import `typing.Deque` in newer Python versions.

Signed-off-by: Leandro Lucarella <luca-frequenz@llucax.com>
We do this when the arguments could be easily mistake or hard to know
what they are for in the call site, for example when the type is a
`str` or `int`, and for most optional arguments.

Signed-off-by: Leandro Lucarella <luca-frequenz@llucax.com>
There is no reason to have them of `Any` type, anybody can explicitly
convert any object to `str` using `str(obj)` if they want to do so.

Signed-off-by: Leandro Lucarella <luca-frequenz@llucax.com>
`anext()` wasn't available in older Python versions.

Signed-off-by: Leandro Lucarella <luca-frequenz@llucax.com>
For the rest of the API we use full names instead of abbreviated names,
so better to do the same here for consistency and clarity.

Signed-off-by: Leandro Lucarella <luca-frequenz@llucax.com>
It might be useful to be able to test if a channel is closed or not.

Signed-off-by: Leandro Lucarella <luca-frequenz@llucax.com>
Signed-off-by: Leandro Lucarella <luca-frequenz@llucax.com>
Signed-off-by: Leandro Lucarella <luca-frequenz@llucax.com>
The `_limit` is already accesible via the `_deque`'s `maxsize`.

Signed-off-by: Leandro Lucarella <luca-frequenz@llucax.com>
`maxsize` comes from `deque` and it is not very clear, and doesn't
follow the snake_case convention.

Also exposes the `limit` for `Anycast` as a (read-only) property.

Signed-off-by: Leandro Lucarella <luca-frequenz@llucax.com>
This is only for debugging purposes, to show in the string
representation and logs, and it is added to match other channels.

Signed-off-by: Leandro Lucarella <luca-frequenz@llucax.com>
Since the `name` is only for debugging purposes, it shouldn't be
required. If no `name` is specified, a name will be created based on the
`id()` of the channel, so channels can be easily uniquely identified.

Also makes the `name` accessible via a read-only property.

Signed-off-by: Leandro Lucarella <luca-frequenz@llucax.com>
Signed-off-by: Leandro Lucarella <luca-frequenz@llucax.com>
Make the `latest` message accessible through a read-only property.

Signed-off-by: Leandro Lucarella <luca-frequenz@llucax.com>
UUIDs are expensive to create and not really necessary in this case as
we are just keeping a local list of unique objects, so using the
object's hash is enough.

We also use a `id(self)`-based name by default if a `name` was not
provided when creating a `new_receiver()`.

A simple test using `timeit` shows a 2 orders of magnitude improvement
in Python 3.11:

```console
$ python -m timeit -s "import uuid" "uuid.uuid4()"
100000 loops, best of 5: 2.67 usec per loop
$ python -m timeit "hash(__name__)"
10000000 loops, best of 5: 19.6 nsec per loop
$ python -m timeit "id(__name__)"
10000000 loops, best of 5: 20.1 nsec per loop
```

Signed-off-by: Leandro Lucarella <luca-frequenz@llucax.com>
This is only used to create the underlying `Broadcast` channel's names,
so instead of using 2 separate strings, just use a plain `name` as with
other channels. Also like with other channels, make the `name` optional
and default to a more readable `id(self)` representation (using `_`
separators).

Signed-off-by: Leandro Lucarella <luca-frequenz@llucax.com>
Use underscore separators to format the `id(self)`. Also only use the
default if `name` is `None`. Before an empty string would also be
changed to the default, but if an user passed an empty string, it is
better to leave it untouched.

Signed-off-by: Leandro Lucarella <luca-frequenz@llucax.com>
When debugging and logging objects it is very useful to get a
descriptive and clear string representation.

The new representation uses the class name and the user defined name
(if any) for `str` and a `repr` that tries to show how the class was
created but also important internal state.

Signed-off-by: Leandro Lucarella <luca-frequenz@llucax.com>

# ------------------------ >8 ------------------------
# Do not modify or remove the line above.
# Everything below it will be ignored.
#
# Conflicts:
#	src/frequenz/channels/_base_classes.py
#	src/frequenz/channels/_bidirectional.py
Now that we have nice string representations, it is no longer needed to
build one every time we log something.

Signed-off-by: Leandro Lucarella <luca-frequenz@llucax.com>
This is so it is closer to the other public attributes (properties) of
the class.

Also remove the mention to the default, as the default belongs to the
`__init__` argument docs.

Signed-off-by: Leandro Lucarella <luca-frequenz@llucax.com>
Signed-off-by: Leandro Lucarella <luca-frequenz@llucax.com>
@github-actions github-actions bot added part:docs Affects the documentation part:tests Affects the unit, integration and performance (benchmarks) tests part:channels Affects channels implementation part:core Affects the core types (`Sender`, `Receiver`, exceptions, etc.) labels Nov 2, 2023
@llucax
Copy link
Contributor Author

llucax commented Nov 2, 2023

BTW, I was thinking of merging (no pun intended) both Merge and MergeNamed into the merge module, but I didn't want to go for more changes in the RFC, we can do further fine-tuning if we agree to go with it.

These are the most notable changes:

- The `_base_classes` modules is split into the `_receiver` and
  `_sender` modules.

- Sender and receiver exceptions are moved from the `_exceptions` module
  to the new `_sender` and `_receiver` modules.

- The `frequenz.channel` package now only exposes the new `_receiver`
  and `_sender` modules, and the `_exceptions` and `_select` modules.

- All channels and receiver modules are moved to the `frequenz.channels`
  package and made public.

- All public nested classes were moved to the top level of the
  corresponding module.

Advantages of this structure:

- It completely removes circular dependencies.

- It avoids importing unnecessary code. In Python importing means code
  execution, so even when it is done only at startup, it adds some
  overhead.

  Also by not importing unnecessary code, we can potentially add real
  optional dependencies. For example, if a project doesn't need to use
  a file watcher, they could avoid pulling the unnecessary `awatch`
  dependency. This is not done in this PR, but it could be done in the
  future.

- By having the channels and receivers on their own module we can move
  public nested classes were moved to the top level of the corresponding
  module withough having to add superflous prefixes for support classes.

- Removing nested classes avoids having to use hacky constructions, like
  requiring the use of `from __future__ import annotations`, types as
  strings (nested classes) and confusing the `mkdocstrings` tools when
  extracting and cross-linking docs.

- The `frequenz.channels` package exposes all classes that are used once
  you have setted up your channels, so the importing should still be
  pretty terse in most cases and only `frequenz.channels` would need to
  be imported in modules only taking some receivers and iterating or
  selecting over them.

Signed-off-by: Leandro Lucarella <luca-frequenz@llucax.com>
@github-actions github-actions bot added the part:tooling Affects the development tooling (CI, deployment, dependency management, etc.) label Nov 2, 2023
@llucax
Copy link
Contributor Author

llucax commented Nov 6, 2023

So we discussed this and some people still preferred the more flat hierarchy, so we decided on the following:

  • Keep the _base_classes.py split.

    Rationale: Avoid circular dependencies.

  • Move all symbols to the top-level except from timer and file_watcher which are in their own module.

    Rationale:

    • None are real channels not utilities to work generically on channels, they are more like message generators from input from the outside world.
    • Having them in a separate module allows for making watchfiles an optional dependency.
    • Having each on its own module avoid having to add prefixes to support classes (like file_watcher.Event -> FileWatcherEvent).

    Alternatives: Keep both in the util or extensions package. Discarded because it will pull the watchfiles dependency even if you only want to use the Timer class. Also thinking about Google-style importsusing from frequenz.channels import extensions makes the module have a too generic name.

Although is not part of the restructuring itself, we also decided to:

Future work:

  • Make timer and file_watcher separate python packages: frequenz-channels-timer and frequenz-channels-file-watcher. This way the pulling of dependencies is more explicit and also it is clear that they are not considered core components.

    This might need support from repo-config, as we need to see how we generate docs, etc. so we leave it out of 1.0.

@llucax llucax self-assigned this Nov 7, 2023
@llucax
Copy link
Contributor Author

llucax commented Nov 8, 2023

Superseded by #235.

@llucax llucax closed this Nov 8, 2023
@llucax llucax added this to the v1.0.0 milestone Nov 8, 2023
@llucax llucax added the resolution:wontfix This will not be worked on label Nov 8, 2023
@llucax llucax deleted the structure-rfc branch January 16, 2024 08:23
@llucax llucax modified the milestones: v1.0.0, Dropped Jan 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
part:channels Affects channels implementation part:core Affects the core types (`Sender`, `Receiver`, exceptions, etc.) part:docs Affects the documentation part:tests Affects the unit, integration and performance (benchmarks) tests part:tooling Affects the development tooling (CI, deployment, dependency management, etc.) resolution:wontfix This will not be worked on
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant