Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clarification needed for schemes that consume __annotations__ within class creation / init_subclass #17

Open
zzzeek opened this issue Oct 19, 2022 · 9 comments

Comments

@zzzeek
Copy link

zzzeek commented Oct 19, 2022

Hey there -

This is a SQLAlchemy 2.0 ORM mapping:

from __future__ import annotations

from sqlalchemy import ForeignKey
from sqlalchemy.orm import DeclarativeBase
from sqlalchemy.orm import Mapped
from sqlalchemy.orm import mapped_column
from sqlalchemy.orm import relationship


class Base(DeclarativeBase):
    pass

class Parent(Base):
    __tablename__ = "parent_table"

    id: Mapped[int] = mapped_column(primary_key=True)
    child_id: Mapped[int] = mapped_column(ForeignKey("child_table.id"))
    child: Mapped[Child] = relationship(back_populates="parents")


class Child(Base):
    __tablename__ = "child_table"

    id: Mapped[int] = mapped_column(primary_key=True)
    parents: Mapped[list[Parent]] = relationship(back_populates="child")

Above, SQLAlchemy 2.0 uses runtime inspection of __annotations__ to see things like "the Parent class has a link to the Child class" as well as "the Child class has a link to the Parent class".

I just read pep-649 and this is going to make things more difficult for us; probably not impossible, but very difficult to work around. Currently, the runtime inspection of __annotations__ occurs within the creation of the Parent and Child classes themselves, that is, using __init_subclass__(). Based on what I see in pep649, this would raise a TypeError (or something) because Parent class creation would consume __annotations__ and Child is not defined yet.

The irony here is that non-typed SQLAlchemy, which has been around for a decade and looks like this:

# SQLAlchemy pre pep-484 integration

class Parent(Base):
    __tablename__ = "parent_table"

    id = Column(Integer, primary_key=True)
    child_id = Column(ForeignKey("child_table.id"))
    child = relationship("Child", back_populates="parents")

above actually does defer the evaluation (but not the consumption, since it's just a string) of the "Child" string name until all classes have been defined. This made it easy for us to move to using __annotations__, since with ForwardRef("Child"), we get that same string just as we always have.

pep649 means we can no longer get that name at all, a name that's in the source code of the file, at all, until its importable. I'm not familiar with anything else in Python that works this way, even with "from foo import bar" you can override __import__() to see the strings coming in.

Has it been considered that if __annotations__ becomes an evaluable descriptor, that there would be some other way to simply get the raw names from a class without using eval()? That would keep things open for tools like SQLAlchemy that have spent a lot of time working with current approaches.

@ncoghlan
Copy link

This is a variation on #2 and another concrete example of the circular reference problem described in https://lukasz.langa.pl/61df599c-d9d8-4938-868b-36b67fdb4448/

Making forward references simple to use is looking to be a key part of making code object based lazy annotation evaluation offer a clear win over string based ones.

@zzzeek
Copy link
Author

zzzeek commented Oct 26, 2022

Hi Nick -

This is a variation on #2 and another concrete example of the circular reference problem described in https://lukasz.langa.pl/61df599c-d9d8-4938-868b-36b67fdb4448/

yes I observed that issue and this did look similar. I figured I'd put our exact case here in case there is any subtle discrepancies but also to highlight that being able to get at the annotations before they are evaluated is very important for existing libraries.

as for the blog post, I wasn't really understanding what it was getting at, as I'm not involved in the day-to-day typing discussions and it seemed to be pretty deep into it.

Making forward references simple to use is looking to be a key part of making code object based lazy annotation evaluation offer a clear win over string based ones.

OK, you lost me there as far as what point is being made. The point of pep-649 is to make forward references simple to use for code that consumes pep-484 annotations at runtime, that they do! by just removing the ForwardRef concept entirely. Which I agree, if your use cases don't involve cycles that you need to evaluate before the cycle has been fully declared, is great, I have a bunch of difficult code that is eval()ing ForwardRefs and yes it's not convenient.

However, "clear win", at the moment I have to disagree, because while we are getting rid of the awkwardness of ForwardRef, we are entirely removing the ability for object-reference annotations to be consumed at runtime within metaclasses or __init_subclass__, which to me indicates this pep is not really solving the problem of circular references at all, it's instead saying, "annotations can no longer be consumed at class creation time, all classes must be fully imported" (and not even within TYPE_CHECKING blocks). It is a major backwards-incompatible change for consuming libraries like SQLAlchemy if no contingencies are made for being able to consume annotations for objects/classes that aren't runtime defined yet.

I'm not a typing god, so I ask, what is pep-649's plan for class-based systems that want to consume __annotations__ within the __init_subclass__() method ? Is the answer, "rewrite your code to no longer use __init_subclass__()"? I'm just the implementor here so I need to know where this is going.

@JelleZijlstra
Copy link
Contributor

We'll probably end up with something like my idea from #1, where we provide a special form of eval() that produces a sentinel object of some sort for undefined names. I think that would cover your use case.

@zzzeek
Copy link
Author

zzzeek commented Oct 26, 2022

We'll probably end up with something like my idea from #1, where we provide a special form of eval() that produces a sentinel object of some sort for undefined names. I think that would cover your use case.

that would be great! Because yeah I just need the string. So...ForwardRef would be deprecated and all that?

@JelleZijlstra
Copy link
Contributor

Well, ForwardRef might be used as the mechanism for the eval hack, and for the foreseeable future you'd still have to deal with users who manually stringify annotations. I don't think the implementation details are exactly worked out, but I do think there will be a fairly straightforward way to support use cases like yours.

@ncoghlan
Copy link

Trying to unpack my original comment: ForwardRef uses come in two flavours, redundant and essential.

In the status quo, if a forward ref is needed at definition time, it has to be explicit and is hence always there in the annotation at runtime (either as a string or as a ForwardRef object), even if the code has reached a point where the reference could be resolved.

Both string and code object annotations remove the need for explicit forward references in the annotations themselves, but they don't remove the need to support "essential" forward references like those in your example: when "Parent" is evaluated it is clear "Child" is going to be a model class, but the class itself doesn't exist yet.

String annotations deal with that by pushing all evaluation responsibilities on to the consumer of the annotation.

As currently defined, code object annotations don't deal with the problem, but the discussions on #1 and #2 suggest the answer is likely to involve changing the way annotation evaluation deals with unresolved names.

@zzzeek
Copy link
Author

zzzeek commented Oct 29, 2022

In the status quo, if a forward ref is needed at definition time, it has to be explicit and is hence always there in the annotation at runtime (either as a string or as a ForwardRef object), even if the code has reached a point where the reference could be resolved.

so "status quo" here I assume does not include pep-563. I put from __future__ import annotations in all my code, and I don't need to use strings or worry about it anywhere. ForwardRef / strings are zero issue for the end user with pep563. it's just it remains inconvenient for libraries that consume annotations at runtime, which is mostly because ForwardRef has a bit of a mysterious API / low public footprint and there are not good functions to eval() them (pydantic is using private APIs for example). with better functions to do this evaluation directly, basically the same functionality that's proposed here as going inside of a descriptor, IMO that solves whatever problem there is.

All libraries that need to do runtime evaluation of annotated types are already dealing with ForwardRef, and IIUC those libraries are the only audience for this change. With easier standard lib functions in "typing", whatever burden there is here can be eased, without breaking the whole contract into something new and IMO more failure prone, as this PEP proposes.

String annotations deal with that by pushing all evaluation responsibilities on to the consumer of the annotation.

the proposal still does that. I call upon the descriptor form of __annotations__ and the eval then takes place. I have to invoke that function and deal with whatever fallout it presents. I would argue, hey just make that function separate, let me choose to run that error-prone function, or not.

per @JelleZijlstra it will be the other way around, the function will not be separate, instead the "separate" part will be some (as yet undefined) function to get the raw annotations. Seems backwards to me and introduces a whole new deprecation / backwards-compat-breaking stage that is otherwise unnecessary, but I will work with it.

As currently defined, code object annotations don't deal with the problem, but the discussions on #1 and #2 suggest the answer is likely to involve changing the way annotation evaluation deals with unresolved names.

count me in for simply making the annotation evaluation functions public and well tested. I dont see the need to make it all implicit, it will impose lots of backwards-incompatibility and workarounds on existing code, and I dont really see in what way it makes life easier for anyone. A large portion, if not majority, of libraries that consume annotations at runtime will have some version of the "circular reference" issue, and most will need to continue to access annotations in their string form provided they do their inspection during the class creation process.

@ncoghlan
Copy link

ncoghlan commented Oct 30, 2022

Excellent point, I've added that as a note on #19 (enabling a smooth transition from default annotations favours eager evaluation, but a smooth transition from string annotations favours lazy evaluation. Given existing library preparations for string annotations, the latter may be the preferable option even if it wouldn't be preferred with a clean slate)

@ncoghlan
Copy link

The accepted version of PEP 649 covers this (primarily in the section discussing backwards compatibility with PEP 563).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants