Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use less magic in constructing our public API exports, to make trio more intelligible to static analysis #542

Open
ksamuel opened this issue May 30, 2018 · 25 comments

Comments

@ksamuel
Copy link

ksamuel commented May 30, 2018

Despite the well defined __all__, several important tools (e.g: pylint, mypy, jedi) cannot understand the code structure of trio. They then report error in the user code.

E.G:

On the following snippet, pylint will report trio not having the run() member, jedi won't allow code completion, vscode intellisense won't let you go to definition and mypy won't report any error if I try to add +1 after run()

import trio
...
trio.run(parent)

I understand this is inconvenient, as those tools should be able to pick up the __all__ declaration and use that instead of requiring your to type it all again, but they don't. Since they are the most popular in the Python ecosystem, and are used by a lot of editors (vi, sublime text, vscode, etc) but also in tasks (tox, travis, git hooks, etc), it's important that they work.

For now, that means putting # noqa a bit everywhere or disabling a lot of warning, which is not practical.

@ksamuel ksamuel changed the title Replace import * in root __init__.py with explicit import Replace import * in root __init__.py with explicit imports May 30, 2018
@parity3
Copy link

parity3 commented Jun 2, 2018

FYI, there has been some work on this in the past. Maybe there could be a program to read import modules, introspect and spit out a new set of sources that are more static?

python make-nice-for.py --tool=pylint --output-path=new-trio-sources/

@njsmith
Copy link
Member

njsmith commented Jun 6, 2018

Yeah, we should probably give up on most of this cleverness and switch to doing the more boring approach.

There was some discussion of this in the chat today also, starting about here: https://gitter.im/python-trio/general?at=5b1842e099fa7f4c064e93a0

This is probably best handled as an incremental set of cleanups, since there are a lot of different subtle bits of trickery here. "Does this cleanup help pylint/mypy/pycharm/jedi understand what's going on" is probably a good metric to use to decide where to start...

Also relevant: #475, which I think is a change we probably should do, and will reduce the number of auto-generated wrapper functions in trio/_core/_run.py.

@njsmith njsmith changed the title Replace import * in root __init__.py with explicit imports Use less magic in constructing our public API exports, to make trio more intelligible to static analysis Jun 6, 2018
@njsmith
Copy link
Member

njsmith commented Aug 23, 2018

This comment describes how to write a strong test for pylint being able to understand our public API: #594 (comment)

@Zac-HD
Copy link
Member

Zac-HD commented Sep 8, 2018

Closed by #594 🎉

@Zac-HD Zac-HD closed this as completed Sep 8, 2018
@njsmith
Copy link
Member

njsmith commented Sep 8, 2018

I like your enthusiasm, but I'm afraid we're not done yet :-). #594 made the main trio namespace work with tab completion, but we still need to figure out what to do with the submodules, and think about whether we can get rid of any of the other runtime namespace manipulation that we do...

@njsmith njsmith reopened this Sep 8, 2018
@Zac-HD
Copy link
Member

Zac-HD commented Sep 9, 2018

Haha, I actually just went and updated my "add type hints" branch (master...Zac-HD:type-hints)... it's still mostly about ignoring analysis failures, so I was definitely too enthusiastic 🤣

@njsmith
Copy link
Member

njsmith commented Sep 10, 2018

So if I understand the status correctly: the main trio, trio.testing, trio._core namespaces are now amenable to static analysis. Meaning: common tools can look at them and (a) generate a list of exported symbols, for tab completion, and (b) figure out where those symbols came from, so deeper analysis is possible if the place they're coming from is itself amenable to static analysis.

There are a few more public namespaces to look at:

  • trio.hazmat – has some hacks to make the list of symbols visible, but there's no way this is intelligible to tools that need to do deeper analysis like mypy
  • trio.abc, trio.ssl – probably need the same treatment as trio/__init__.py
  • trio.socket – this is messy right now. Probably we should replace most of the dynamic __all__ generation inside trio/_socket.py with a static list import list in trio/socket.py, similar to what we did for trio/__init__.py. The special case is constants like AF_INET, which more-or-less have to be dynamic. Maybe: move the dynamic re-export loop out of trio/_socket.py and into trio/socket.py; inside trio/_socket.py refer to these constants as _stdlib_socket.AF_INET etc. so mypy knows what's going on within that file anyway; and in trio/socket.py statically list some of the key constants like AF_INET, AF_INET6, etc.?
  • trio/_core/*.py – the big issue is the dynamic code generation in trio/_core/_run.py. (Are there any other issues?) Implementing Consider converting lowest-level I/O primitives from global functions into methods #475 would reduce the number of dynamically-generated functions, but not eliminate them. Maybe we should switch to static code generation? For example, we could have a script we run occasionally to generate a trio/_core/_autogenerated.py file that we check into git. (We might also want to do this for the methods on trio.socket._SocketType, since the current metaprogramming approach is confusing to both humans and tools, and also adds overhead to methods that get called a lot in inner loops.)

Does that sounds like the current status?

@Fuyukai
Copy link
Member

Fuyukai commented Sep 10, 2018

but there's no way

No way what?

@Zac-HD
Copy link
Member

Zac-HD commented Sep 10, 2018

Yep, that's my understanding too.

trio.socket – this is messy right now. Probably we should replace most of the dynamic __all__ generation inside trio/_socket.py with a static list import list in trio/socket.py, similar to what we did for trio/__init__.py.

Sounds good to me. We might also need to make the namespace construction less magical, but we need static imports first anyway and can see what actually has to change after that.

The special case is constants like AF_INET, which more-or-less have to be dynamic. Maybe: move the dynamic re-export loop out of trio/_socket.py and into trio/socket.py; inside trio/_socket.py refer to these constants as _stdlib_socket.AF_INET etc. so mypy knows what's going on within that file anyway; and in trio/socket.py statically list some of the key constants like AF_INET, AF_INET6, etc.?

This also sounds good, though I'd be very cautious about statically listing dynamic constants - it might be the best way currently available, but we should leave some detailed comments about what's going on.

Maybe we should switch to static code generation?

Yes, but only if we have CI always check that there is no difference between the current output of the generator and the file tracked by git. This catches both stale files when the generator has been updated, and contributors trying to update the files by hand - both recipes for bugs down the line!

@njsmith
Copy link
Member

njsmith commented Sep 11, 2018

@Fuyukai good catch, thanks, I edited the post to add the missing words

@Zac-HD
Copy link
Member

Zac-HD commented Sep 11, 2018

Just dropping a quick update on my type-hints branch: moving in the right direction, but plenty left to do.
Note that Mypy is only part of the scope of this issue!

Current Mypy output
trio/_path.py:183: error: "Type[Path]" has no attribute "absolute"
trio/_path.py:187: error: "Type[_PathLike[Any]]" has no attribute "register"
trio/_highlevel_open_tcp_stream.py:4: error: Module 'trio.socket' has no attribute 'getaddrinfo'
trio/_highlevel_open_tcp_stream.py:4: error: Module 'trio.socket' has no attribute 'SOCK_STREAM'
trio/_highlevel_open_tcp_stream.py:4: error: Module 'trio.socket' has no attribute 'socket'
trio/_highlevel_open_unix_stream.py:3: error: Module 'trio.socket' has no attribute 'socket'
trio/_highlevel_open_unix_stream.py:3: error: Module 'trio.socket' has no attribute 'SOCK_STREAM'
trio/_highlevel_open_unix_stream.py:6: error: Module 'trio.socket' has no attribute 'AF_UNIX'
trio/__init__.py:18: error: Module 'trio._core' has no attribute 'current_time'
trio/__init__.py:98: error: Module has no attribute "__deprecated_attributes__"
trio/testing/__init__.py:1: error: Module 'trio._core' has no attribute 'wait_all_tasks_blocked'
trio/tests/test_util.py:90: error: Invalid base class
trio/tests/test_socket.py:359: error: Module has no attribute "AF_INET"
trio/tests/test_socket.py:360: error: Module has no attribute "AF_INET6"
trio/tests/test_path.py:183: error: "Type[Path]" has no attribute "joinpath"
trio/tests/test_highlevel_open_tcp_stream.py:6: error: Module 'trio.socket' has no attribute 'AF_INET'
trio/tests/test_highlevel_open_tcp_stream.py:6: error: Module 'trio.socket' has no attribute 'AF_INET6'
trio/tests/test_highlevel_open_tcp_stream.py:6: error: Module 'trio.socket' has no attribute 'SOCK_STREAM'
trio/tests/test_highlevel_open_tcp_stream.py:6: error: Module 'trio.socket' has no attribute 'IPPROTO_TCP'
trio/tests/test_highlevel_open_tcp_stream.py:101: error: Name 'trio.socket.SocketType' is not defined
trio/tests/test_highlevel_open_tcp_listeners.py:165: error: Name 'tsocket.SocketType' is not defined
trio/tests/module_with_deprecations.py:12: error: Module has no attribute "regular"
trio/_subprocess/unix_pipes.py:60: error: "memoryview" has no attribute "__enter__"; maybe "__iter__"?
trio/_subprocess/unix_pipes.py:60: error: "memoryview" has no attribute "__exit__"
trio/_subprocess/unix_pipes.py:78: error: Module has no attribute "wait_writable"
trio/_subprocess/unix_pipes.py:109: error: Module has no attribute "wait_readable"
trio/_subprocess/linux_waitpid.py:52: error: Module has no attribute "spawn_system_task"
trio/tests/test_highlevel_ssl_helpers.py:8: error: Module 'trio.socket' has no attribute 'AF_INET'
trio/tests/test_highlevel_ssl_helpers.py:8: error: Module 'trio.socket' has no attribute 'SOCK_STREAM'
trio/tests/test_highlevel_ssl_helpers.py:8: error: Module 'trio.socket' has no attribute 'IPPROTO_TCP'
  • the socket and _run dynamism discussed above are the biggest contributors by numbers
  • trip/_path.py also has some over-magical re-exporting and wrapping
  • I've fixed the memoryview issue upstream, so Mypy 6.30 will know it's a context manager

So once those things are sorted out, we can run Mypy in CI! Then:

  1. remove as many type: ignore comments as possible (forcing the Any type hides errors elsewhere)
  2. ship the py.typed file, so downstream code can be checked for consistency with Trio's type hints
  3. add non-trivial type hints to Trio; profit.

@Zac-HD Zac-HD closed this as completed Sep 11, 2018
@Zac-HD
Copy link
Member

Zac-HD commented Sep 11, 2018

Gah, misclick - sorry!

@Zac-HD
Copy link
Member

Zac-HD commented Sep 13, 2018

@njsmith says: Staircase review: [#656] should have fixed tab completion for trio.testing and trio._core... it'd be nice to extend the jedi/pylint tests to make sure that stays true :-)

Extending the tests from #594 to cover the changes in #656 would be a nice, self-contained contribution.

njsmith pushed a commit that referenced this issue Sep 29, 2018
This should make trio.socket more intelligible to static analysis tools (see gh-542).
@njsmith
Copy link
Member

njsmith commented Sep 29, 2018

Opened #699 to discuss the sub-issue of how to handle socket module constants.

@njsmith
Copy link
Member

njsmith commented Oct 7, 2018

Updated todo list:

I think that's it for module-level exports. Then there are some classes that use dynamic shenanigans on the attribute level:

  • trio._ssl.SSLStream: uses __getattr__/__setattr__ to directly forward a bunch of attributes to self._ssl_object. Replace with... lots of manually-written forwarding methods and/or @propertys? Or use code generation to write them? Use a .pyi file?
  • trio._socket._SocketType: uses __getattr__ to forward several attributes to self._sock. Also uses more elaborate metaprogramming to generate most of the major methods (_make_simple_sock_method_wrapper, _nonblocking_helper...). These should probably be replaced by code generation for efficiency as well as analyzability – the abstraction layers show up in runtime profiling. The SocketType / _SocketType distinction will also be fun for mypy, and all the if hasattr(stdlib_socket.socket, ...). This will need a dedicated issue to discuss.
  • The async path/file wrappers are the most dynamic thing in trio, and going to be super annoying to deal with. Mypy plugin? (Our open is even worse than the builtin open, and mypy needs a plugin to handle the builtin open.) This will need a dedicated issue to discuss.

@jmfrank63
Copy link
Contributor

Hi I think I at least nominally covered the list. So if someone would like to take a look they are #751 #752 and #753
The tests seem to go fine, but we have to keep track of the symbols twice, once in the try/except import and second on the dynamic import. But since they are in one module next to each other that might be acceptable. We might add a comment for this.

njsmith added a commit that referenced this issue Dec 13, 2018
migrate trio.ssl from star import to explicit imports, to aid static analysis (issue #542)
njsmith added a commit that referenced this issue Dec 15, 2018
Migrate hazmat to faked explicit importing and reexporting, to aid static analysis (issue #542)
njsmith pushed a commit that referenced this issue Jun 23, 2019
…analysis issue(#542) (#805)

Switch from runtime code generation to static ahead-of-time code generation.
@njsmith
Copy link
Member

njsmith commented Jun 23, 2019

#805 has landed! I think this means all our top-level stuff is now statically understandable, and all that's left are a few classes with dynamic attributes.

Can any of our static analysis experts here give us an update on what you think the next step should be?

@jmfrank63
Copy link
Contributor

@njsmith Just browsed again through the thread, but could not make out any task left. _abc.py seems to have been converted already as well as _ssl.py. Python is a great library, I would like to clean up if anything is left here.
Otherwise I think I am turning to the docs and try to write some meaningful examples, looks like there are plenty of docs issues out there. I am not strong at this currently but I guess I'll grow taking on this. If you could point me to something that is either burning or considered a good issue I'd be glad.

@njsmith
Copy link
Member

njsmith commented Nov 1, 2019

@jmfrank63 I think the next steps on static analysis stuff are:

  • Getting mypy running on trio (and then incrementally adding type annotations). Current status is in [WIP] First pass of mypy-cleanliness #1254

  • Using static code generation to replace the dynamic magic in _socket.py

  • Using static code generation to replace the dynamic magic in trio.Path and trio.wrap_file

@sarnold
Copy link

sarnold commented Feb 7, 2021

Just leaving a note here because I was trying to figure out why the urwid tests wouldn't run. The upshot is an attribute error: AttributeError: module 'trio' has no attribute 'hazmat' breaks the tests with 0.18.0 but 0.17.0 works fine.

@njsmith
Copy link
Member

njsmith commented Feb 9, 2021

@sarnold I don't think that's related to the static analysis that this thread is discussion -- it's just that trio.hazmat got renamed to trio.lowlevel back in v0.15, and in v0.18 we removed the backcompat support. v0.15/v0.16/v0.17 gave a deprecation warning everytime you accessed trio.hazmat, so I guess the urwid devs didn't notice the warning? Running tests with -Werror or similar can help with that.

@pquentin
Copy link
Member

pquentin commented Feb 9, 2021

(And there's even a urwid PR: urwid/urwid#439)

@ntamas
Copy link

ntamas commented Feb 9, 2021

The Trio loop was contributed to urwid by me; I guess that the urwid devs don't use Trio regularly. I have sent a PR months ago but it was not merged at the time of writing.

@takluyver
Copy link
Contributor

I've just run into the distinction between SocketType and _SocketType - there seems to be no way to use it as a type annotation so that PyCharm is happy.

Is the distinction just to prevent people constructing SocketType()? Would something like this be acceptable:

class _SocketType(SocketType):
    def __init__(self, sock, _internal=False):
        if not _internal:
            raise TypeError(...)

Of course this makes it easier for someone to do what they're not meant to, but hopefully it's still clear that that's 'off label' usage and not to be relied on.

@njsmith
Copy link
Member

njsmith commented Jun 16, 2021

Is the distinction just to prevent people constructing SocketType()? Would something like this be acceptable:

No, we actually have a more standard solution for that. IIRC the weird SocketType/_SocketType contortions were to support our built-in network "monkeypatching" system: https://trio.readthedocs.io/en/stable/reference-testing.html#trio.abc.SocketFactory

So the goals were:

  • Let people define their own fake "socket classes"
  • Make them behaviorally match the real socket class, so e.g. isinstance(fake_socket, trio.socket.SocketType) should return True
  • Don't let people inherit implementation from our real socket class, because that's both unhelpful for the intended testing use cases, and also asking for backcompat problems.
  • Don't force people to implement the full socket API in their fake socket classes, because it's extremely large and complex and your average test only needs to fake like 10% of it. (This is why I didn't make SocketType an ABC with the whole API defined on it. Well, that + it's annoying to maintain two copies of all those method signatures.)

On further thought, maybe an alternative would be:

  • Rename _SocketType to SocketType, so it's the real class, using NoPublicConstructor to prevent actual subclassing and keep the constructor private
  • Tell people to use ABCMeta.register on SocketType to fake isinstance into working without inheriting any implementation.

I guess this would need a deprecation b/c it would break any existing code that's doing class MyFakeSocket(trio.socket.SocketType), but the fix is trivial and I doubt there are many people doing that, so it could probably be a pretty quick transition.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests