Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(pip_repository): Support pip parse cycles #1166

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -59,6 +59,11 @@ A brief description of the categories of changes:
`data`. Note, that the `@pypi_foo//:pkg` labels are still present for
backwards compatibility.

* (pip_parse) The parameter `requirement_cycles` may be provided a map of names
to lists of requirements which form a dependency cycle. `pip_parse` will break
the cycle for you transparently. This behavior is also available under bzlmod
as `pip.parse(requirement_cycles={})`.

* (gazelle) The flag `use_pip_repository_aliases` is now set to `True` by
default, which will cause `gazelle` to change third-party dependency labels
from `@pip_foo//:pkg` to `@pip//foo` by default.
Expand Down
80 changes: 80 additions & 0 deletions docs/sphinx/pypi-dependencies.md
Original file line number Diff line number Diff line change
Expand Up @@ -130,6 +130,86 @@ Any 'extras' specified in the requirements lock file will be automatically added
as transitive dependencies of the package. In the example above, you'd just put
`requirement("useful_dep")`.

### Packaging cycles

Sometimes PyPi packages contain dependency cycles -- for instance `sphinx`
depends on `sphinxcontrib-serializinghtml`. When using them as `requirement()`s,
ala

```
py_binary(
name = "doctool",
...
deps = [
requirement("sphinx"),
]
)
```

Bazel will protest because it doesn't support cycles in the build graph --

```
ERROR: .../external/pypi_sphinxcontrib_serializinghtml/BUILD.bazel:44:6: in alias rule @pypi_sphinxcontrib_serializinghtml//:pkg: cycle in dependency graph:
//:doctool (...)
@pypi//sphinxcontrib_serializinghtml:pkg (...)
.-> @pypi_sphinxcontrib_serializinghtml//:pkg (...)
| @pypi_sphinxcontrib_serializinghtml//:_pkg (...)
| @pypi_sphinx//:pkg (...)
| @pypi_sphinx//:_pkg (...)
`-- @pypi_sphinxcontrib_serializinghtml//:pkg (...)
```

The `requirement_cycles` argument allows you to work around these issues by
specifying groups of packages which form cycles. `pip_parse` will transparently
fix the cycles for you and provide the cyclic dependencies simultaneously.

```
pip_parse(
...
requirement_cycles = {
"sphinx": [
"sphinx",
"sphinxcontrib-serializinghtml",
]
},
)
```

`pip_parse` supports fixing multiple cycles simultaneously, however cycles must
be distinct. `apache-airflow` for instance has dependency cycles with a number
arrdem marked this conversation as resolved.
Show resolved Hide resolved
of its optional dependencies, which means those optional dependencies must all
be a part of the `airflow` cycle. For instance --

```
pip_parse(
...
requirement_cycles = {
"airflow": [
"apache-airflow",
"apache-airflow-providers-common-sql",
"apache-airflow-providers-postgres",
"apache-airflow-providers-sqlite",
]
}
)
```
arrdem marked this conversation as resolved.
Show resolved Hide resolved

Alternatively, one could resolve the cycle by removing one leg of it.

For example while `apache-airflow-providers-sqlite` is "baked into" the Airflow
package, `apache-airflow-providers-postgres` is not and is an optional feature.
Rather than listing `apache-airflow[postgres]` in your `requirements.txt` which
would expose a cycle via the extra, one could either _manually_ depend on
`apache-airflow` and `apache-airflow-providers-postgres` separately as
requirements. Bazel rules which need only `apache-airflow` can take it as a
dependency, and rules which explicitly want to mix in
`apache-airflow-providers-postgres` now can.

Alternatively, one could use `rules_python`'s patching features to remove one
leg of the dependency manually. For instance by making
`apache-airflow-providers-postgres` not explicitly depend on `apache-airflow` or
perhaps `apache-airflow-providers-common-sql`.

## Consuming Wheel Dists Directly

If you need to depend on the wheel dists themselves, for instance, to pass them
Expand Down
1 change: 1 addition & 0 deletions examples/build_file_generation/BUILD.bazel
Original file line number Diff line number Diff line change
Expand Up @@ -65,6 +65,7 @@ py_library(
deps = [
"//random_number_generator",
"@pip//flask",
"@pip//sphinx",
],
)

Expand Down
13 changes: 13 additions & 0 deletions examples/build_file_generation/WORKSPACE
Original file line number Diff line number Diff line change
Expand Up @@ -94,6 +94,19 @@ load("@rules_python//python:pip.bzl", "pip_parse")
# You can instead check this `requirements.bzl` file into your repo.
pip_parse(
name = "pip",

# Requirement groups allow Bazel to tolerate PyPi cycles by putting dependencies
# which are known to form cycles into groups together.
experimental_requirement_cycles = {
"sphinx": [
"sphinx",
"sphinxcontrib-qthelp",
"sphinxcontrib-htmlhelp",
"sphinxcontrib-devhelp",
"sphinxcontrib-applehelp",
"sphinxcontrib-serializinghtml",
],
},
# (Optional) You can provide a python_interpreter (path) or a python_interpreter_target (a Bazel target, that
# acts as an executable). The latter can be anything that could be used as Python interpreter. E.g.:
# 1. Python interpreter that you compile in the build file.
Expand Down
1 change: 1 addition & 0 deletions examples/build_file_generation/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@

from flask import Flask, jsonify
from random_number_generator import generate_random_number
import sphinx # noqa

app = Flask(__name__)

Expand Down
Loading