Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default template incompatible with uv, rye, pdm #4116

Open
astrojuanlu opened this issue Aug 23, 2024 · 17 comments
Open

Default template incompatible with uv, rye, pdm #4116

astrojuanlu opened this issue Aug 23, 2024 · 17 comments
Assignees

Comments

@astrojuanlu
Copy link
Member

astrojuanlu commented Aug 23, 2024

Description

uv add <dep> fails with our default template (projects created with kedro new --tools).

(And so do rye and pdm)

Updated after tech design:
Things to do:

  • There was consensus on the idea of moving dependencies to the [project.dependencies] table of pyproject.toml
  • Add pip install -e . in requirements.txt
  • After a bit more discussion there was convergence around the idea of leaving a -e . with some comments + "environment dependencies" (for example ipython, pytest, ruff, which aren't needed to run the pipelines themselves)

Context

uv expects to be able to write to the [project.dependencies] table, but we're using dynamic dependencies, so in a way this is expected to fail:

Steps to Reproduce

uvx kedro new --name=kedro-telemetry-test-starter --tools=none --example=n
cd kedro-telemetry-test-starter
uv add kedro

Expected Result

Success (no changes to pyproject.toml, uv.lock is created)

Actual Result

Using Python 3.12.3 interpreter at: /opt/homebrew/opt/python@3.12/bin/python3.12
Creating virtualenv at: .venv
warning: No `requires-python` value found in the workspace. Defaulting to `>=3.12`.
error: Failed to build: `kedro-telemetry-test-starter @ file:///private/tmp/kedro-telemetry-test-starter`
  Caused by: Build backend failed to determine extra requires with `build_editable()` with exit status: 1
--- stdout:
configuration error: You cannot provide a value for `project.dependencies` and list it under `project.dynamic` at the same time
DESCRIPTION:
    According to PEP 621:  Build back-ends MUST raise an error if the metadata
    specifies a field statically as well as being listed in dynamic.

GIVEN VALUE:
    {
        "dependencies": [
            "kedro"
        ],
        "...": " # ...",
        "dynamic": [
            "dependencies",
            "version"
        ]
    }

OFFENDING RULE: 'PEP 621'

DEFINITION:
    {
        "see": "https://packaging.python.org/en/latest/specifications/pyproject-toml/#dynamic"
    }
--- stderr:
Traceback (most recent call last):
  File "<string>", line 14, in <module>
  File "/Users/juan_cano/.cache/uv/builds-v0/.tmpH11hii/lib/python3.12/site-packages/setuptools/build_meta.py", line 463, in get_requires_for_build_editable
    return self.get_requires_for_build_wheel(config_settings)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/juan_cano/.cache/uv/builds-v0/.tmpH11hii/lib/python3.12/site-packages/setuptools/build_meta.py", line 332, in get_requires_for_build_wheel
    return self._get_build_requires(config_settings, requirements=[])
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/juan_cano/.cache/uv/builds-v0/.tmpH11hii/lib/python3.12/site-packages/setuptools/build_meta.py", line 302, in _get_build_requires
    self.run_setup()
  File "/Users/juan_cano/.cache/uv/builds-v0/.tmpH11hii/lib/python3.12/site-packages/setuptools/build_meta.py", line 318, in run_setup
    exec(code, locals())
  File "<string>", line 1, in <module>
  File "/Users/juan_cano/.cache/uv/builds-v0/.tmpH11hii/lib/python3.12/site-packages/setuptools/__init__.py", line 111, in setup
    return distutils.core.setup(**attrs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/juan_cano/.cache/uv/builds-v0/.tmpH11hii/lib/python3.12/site-packages/setuptools/_distutils/core.py", line 158, in setup
    dist.parse_config_files()
  File "/Users/juan_cano/.cache/uv/builds-v0/.tmpH11hii/lib/python3.12/site-packages/_virtualenv.py", line 22, in parse_config_files
    result = old_parse_config_files(self, *args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/juan_cano/.cache/uv/builds-v0/.tmpH11hii/lib/python3.12/site-packages/setuptools/dist.py", line 606, in parse_config_files
    pyprojecttoml.apply_configuration(self, filename, ignore_option_errors)
  File "/Users/juan_cano/.cache/uv/builds-v0/.tmpH11hii/lib/python3.12/site-packages/setuptools/config/pyprojecttoml.py", line 70, in apply_configuration
    config = read_configuration(filepath, True, ignore_option_errors, dist)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/juan_cano/.cache/uv/builds-v0/.tmpH11hii/lib/python3.12/site-packages/setuptools/config/pyprojecttoml.py", line 135, in read_configuration
    validate(subset, filepath)
  File "/Users/juan_cano/.cache/uv/builds-v0/.tmpH11hii/lib/python3.12/site-packages/setuptools/config/pyprojecttoml.py", line 59, in validate
    raise ValueError(f"{error}\n{summary}") from None
ValueError: invalid pyproject.toml config: `project.dependencies`.
configuration error: You cannot provide a value for `project.dependencies` and list it under `project.dynamic` at the same time

Your Environment

  • Kedro version used (pip show kedro or kedro -V): 0.19.8
  • Python version used (python -V): 3.12
  • Operating system and version: macOS Sonoma
@astrojuanlu astrojuanlu changed the title Default template incompatible with uv Default template incompatible with uv, rye, pdm Aug 23, 2024
@astrojuanlu
Copy link
Member Author

(Writing this with uv in mind but it applies to rye and pdm as well)

(For the record, the default template is also incompatible with Poetry, but for different reasons. This might change once Poetry 2.0 ships with PEP 621 support. For now, we're not making any effort to support Poetry, see #1722.)

Notice that there's not such thing as "migrating the template to uv": our pyproject.toml is already compatible with PEP 621 so there's nothing to migrate as long as we stay within the bounds of the standard. The problem is our use of experimental support for requirements.txt-powered dynamic dependencies in setuptools.

So there's several things we can do about this issue:

  • Nothing.
    • Less than ideal if uv gets popular (and I think it will).
  • Duplicate dependencies so they're both in pyproject.toml and requirements.txt.
    • Probably a mess to maintain, and also quite error prone.
  • Move all dependencies to pyproject.toml and remove requirements.txt.
  • Move all dependencies to pyproject.toml and replace the contents of requirements.txt with -e ..
    • This might look "weird" but it addresses the problem 👍🏼 and also keeps pip install -r requirements.txt workflows working 👍🏼 (with the addition that the project itself would be installed)

@Vaslo
Copy link

Vaslo commented Aug 24, 2024

Small sample of two people here but I will say after the most recent changes to uv I'd really love to see alignment between this project and uv. Makes 100% sense why Kedro is the way it is given the ubiquity and common practice of using pip, but I'm also seeing more projects move in the this better direction that rye/uv/poetry are taking. Thanks for thinking about it in the meantime.

@pietroppeter
Copy link

I am very new to both kedro and uv but I am wondering, is it possible as a first step to have a starter template that will have the same toml setup as uv? (see also #681 (comment))

@astrojuanlu
Copy link
Member Author

astrojuanlu commented Aug 27, 2024

Workaround from #681 (comment) :

$ uv init --lib
$ uvx kedro-init .
$ uv add kedro
$ uv run kedro registry list
$ uv run kedro pipeline create data_processing

is it possible as a first step to have a starter template that will have the same toml setup as uv?

In principle we'd like to avoid the proliferation of starters. Switching to a static [project.dependencies] table seems (hopefully) like a low-effort thing that would make our templates compatible with all modern workflow managers.

Watch this space and thanks for the patience 🙏🏼

@pietroppeter
Copy link

sounds reasonable (wishing to avoid templates).

as reported here #681 (comment) the workaround works for me and it seems that the rest of the missing things to add are straightforward (I honestly do not mind adding them one by one, it helps me understand better the system instead of using a template).

Thanks for the support, I am much more confident now in starting with uv and kedro together and will keep a look on how it evolve this. Have a great day!

@deepyaman
Copy link
Member

deepyaman commented Aug 27, 2024 via email

@astrojuanlu
Copy link
Member Author

Another workaround: https://github.com/astrojuanlu/copier-kedro

@astrojuanlu
Copy link
Member Author

I said users can be educated over a year ago in the thread you linked: #2569 (comment) I think it's well past time Kedro following modern best practices--this was always part of the value prop of the template. I don't like the weird requirements.txt with just -e . in there; if you want to document an alternative path in the FAQ, fine, but people shouldn't be defaulting to this weird flow.

@deepyaman Yeah and back then I was still on the fence...

On one hand, yes people should understand that library workflows need pip install -e ., full stop.

However,

  • Maybe this workflow is, and has always been, too low level. In fact, uv run automatically does it for you (and possibly other workflow tools too?). So maybe we shouldn't lean too heavily on that?
  • Kedro projects are libraries and applications. We currently (mis)use requirements.txt for unpinned dependencies, but there's a point to be made about having unpinned library dependencies in pyproject.toml, and a resolved version of those in requirements.txt. In fact, I was routinely using uv pip compile --universal pyproject.toml -o requirements.txt. Until we see the outcome of the ongoing PEP 751 discussions, requirements.txt is the de-facto standard for dependency locking, and I don't think it would be wise to completely remove it from Kedro templates.

That being said, I admit that my -e . proposal doesn't really use requirements.txt files as lockfiles. We could instead do something like

$ grep 'dependencies =' -A12 pyproject.toml
dependencies = [
  "ipython>=8.10",
  "jupyterlab>=3.0",
  "kedro~=0.19.8",
  "kedro-datasets[pandas-csvdataset, pandas-exceldataset, pandas-parquetdataset]>=3.0; python_version >= '3.9'",
  "kedro-datasets[pandas.CSVDataset, pandas.ExcelDataset, pandas.ParquetDataset]>=1.0; python_version < '3.9'",
  "kedro-viz>=6.7.0",
  "kedro[jupyter]",
  "notebook",
  "scikit-learn~=1.5.1; python_version >= '3.9'",
  "scikit-learn<=1.4.0,>=1.0; python_version < '3.9'",
]
dynamic = [ "version",]
$ cat requirements.in 
-e .
$ uv pip compile --universal requirements.in -o requirements.txt
...
$ head requirements.txt
# This file was autogenerated by uv via the following command:
#    uv pip compile --universal requirements.in -o requirements.txt
-e .
    # via -r requirements.in
aiofiles==24.1.0
    # via kedro-viz
annotated-types==0.7.0
    # via pydantic
antlr4-python3-runtime==4.9.3
    # via omegaconf

in other words, ship a fully resolved and universal requirements.txt in our template, and maintain the unpinned, library dependencies in pyproject.toml, hence fixing this issue and converging with the rest of the Python ecosystem.

@astrojuanlu
Copy link
Member Author

We discussed this on 2024-08-28 Tech Design. Some notes:

  • We clarified that it's only the uv/poetry/rye/pdm add workflow that breaks, users could potentially keep adding dependencies to their requirements.txt, currently consumed by setuptools
  • Although there was no support for the "do nothing" option
  • There was consensus on the idea of moving dependencies to the [project.dependencies] table of pyproject.toml, the only question left being what to do with requirements.txt
  • After a bit more discussion there was convergence around the idea of leaving a -e . with some comments + "environment dependencies" (for example ipython, pytest, ruff, which aren't needed to run the pipelines themselves)
    • This has the advantage of retaining the requirements.txt and, as @DimedS and @noklam pointed out, also the added benefit of installing the project itself in editable mode, which in turn helps with pytest and other tools
  • As an aside, @deepyaman and @idanov were of the opinion that even removing the requirements.txt technically this wouldn't be a breaking change, and that it would only affect newly created projects.
  • Still, @merelcht and others expressed some reluctance to the idea of outright removing the file, especially given that there's no standard lockfile format in Python (until PEP 751 or a successor are approved)
    • @noklam did some research and found that some projects have removed requirements.txt, but many others haven't (for example pytorch, polars. pandas, fastapi)
  • @merelcht also suggested creating or documenting an automatic process that would help users migrate their dependencies from requirements.txt to pyproject.toml
  • @DimedS suggested updating our docs with pip install [-e] . rather than pip install -r requirements.txt, but there wasn't enough time to discuss the implications of that
    • And in addition, if we do add -e . to requirements.txt, both workflows would be equivalent, save for "environment dependencies"

Extra links:

@astrojuanlu
Copy link
Member Author

@astrojuanlu
Copy link
Member Author

From the 2023 Python survey https://lp.jetbrains.com/python-developers-survey-2023/

image

@astrojuanlu
Copy link
Member Author

Another thing we should probably amend:

warning: No `requires-python` value found in the workspace. Defaulting to `>=3.10`.

Our template and starters should have a requires-python that's consistent with our Python version support policy

@Vaslo
Copy link

Vaslo commented Sep 2, 2024

Another workaround: https://github.com/astrojuanlu/copier-kedro

I ran the two commands to get the copier to run successfully and messed around to try and get uv working with Kedro. The only issue is that it didn't create a few of the folders like notebook and data (and it's child folders). Did I do something wrong or is there a different way to do this?

@astrojuanlu
Copy link
Member Author

@Vaslo good point. Try again :)

@Vaslo
Copy link

Vaslo commented Sep 3, 2024

@Vaslo good point. Try again :)

This edition is great! Thank you for taking a few minutes to do this. I'll work with this process until all the uv stuff is a little more mature. I'm having a few issues with a package on that end that I'm trying to iron out as well, but it's forcing me to learn both the Kedro and uv way.

@noklam
Copy link
Contributor

noklam commented Sep 9, 2024

Things to do:

  • There was consensus on the idea of moving dependencies to the [project.dependencies] table of pyproject.toml, the only question left being what to do with requirements.txt
  • After a bit more discussion there was convergence around the idea of leaving a -e . with some comments + "environment dependencies" (for example ipython, pytest, ruff, which aren't needed to run the pipelines themselves)

@astrojuanlu
Copy link
Member Author

A bit more data to support this: uv is now the third most common installer for Kedro in the entire history of the project, ahead of Poetry

(query)

image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: To Do
Development

No branches or pull requests

8 participants
@astrojuanlu @pietroppeter @Vaslo @deepyaman @noklam @lrcouto @ankatiyar and others