Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

per-binary opt-in of pyc doesn't function correctly #2212

Open
rickeylev opened this issue Sep 10, 2024 · 1 comment
Open

per-binary opt-in of pyc doesn't function correctly #2212

rickeylev opened this issue Sep 10, 2024 · 1 comment
Assignees

Comments

@rickeylev
Copy link
Contributor

rickeylev commented Sep 10, 2024

🐞 bug report

Affected Rule

py_binary/py_test

Description

The intent of the py_binary.pyc_collection attribute is to allow per-target opt-in for using pyc files from dependencies without specifying command line build flags. This allows for a couple use cases:

  • Build tools can have precompiling enabled (separate from the requested target)
  • Gradual enabling of precompiling can be done (on a per-target basis) without having to figure out where and how a user is setting build flags (which isn't always possible, e.g. during the dev edit-run cycle).

Right now, what happens is, when py_binary.pyc_collection=include_pyc, pyc files from transitive targets are collected, but only if they opted into precompiling with their target-level precompile attribute. When the vast majority of targets don't set that, the net effect is no pyc files are collected. This happens because pyc file generation is conditional on the precompile settings, e.g. if its not enabled, no action generates the file.

To fix, the basic thing that needs to be done is always generate a pyc file and always include it PyInfo.pyc_files. Then, when the binary looks at the transitive pyc files, it'll find them all.

A caveat is: don't add the pyc files to runfiles unless the target opted into precompiling. Doing so would cause all downstream targets to have the pyc, as opposed to only the binaries that want it. It's OK to put the pyc in default outputs (the default outputs of deps aren't included).

@rickeylev rickeylev self-assigned this Sep 10, 2024
@rickeylev
Copy link
Contributor Author

rickeylev commented Sep 17, 2024

Argh. Ok, this turned into a bit of tangled mess. I think I'm going to need to back out a few changes and, within, Google, we'll just have to carry a couple patches.

The problem is that we want the binary-level pyc_collection attribute to have control over whether pyc files end up in the binary. However, the --precompile flag ends up having precedence because it causes libraries to put pyc files into runfiles. This means a binary can't opt-out if --precompile=enabled is the default. So now we have the awkward situation of where a binary can't opt-out with --precompile=enabled is the default and a binary can't opt-in because it requires setting --precompile=enabled (thus preventing opt out).

The general behavior we want is:

  • --precompile=enabled, attr.pyc_collection=disabled => Don't include pyc files
  • --precompile=enabled, attr.pyc_collection=include_pyc => Include pyc files
  • --precompile=disabled, attr.pyc_collection=disabled => Don't include pyc files
  • --precompile=disabled, attr.pyc_collection=include_pyc => Include pyc files

This should be fairly easy to make work in rules_python -- just don't put pyc files in runfiles. This is pretty new behavior, so should be easy to revert.

There are two buts to the above: (1) target-level settings, and (2) existing behavior within google.

(1) target-level settings: It'd be best if a library with precompile=enabled had its pyc files included even for a binary with pyc_collection=disabled. This is because it allows a more granular opt-in of particular targets within the graph, rather than having to take "all or nothing" at the binary level. Making this work requires adding additional PyInfo fields to distinguish between "required" py/pyc files (those generated for a library with precompile=enabled) and "opt-in" py/pyc files (those generated implicitly and later found via pyc_collection). I'm not sure I'll get around to doing this part, but I do really want to allow per-library level settings (precompiling external deps seems like a natural win). I could "split the difference" and do something like "add pyc to runfiles if target-level enabled", but given the headache putting pyc in runfiles has caused, I'm not sure that'll be a net win.

(2) Within Google, we've been putting pyc files in runfiles for many many years, so it'll probably be more involved to stop doing that. I'll have to look at the code paths once I have a prototype going, but we'll probably just carry a patch to add this behavior on top of rules_python.

In some internal conversations, it was also pointed out that libraries putting py and/or pyc files in runfiles at all is problematic in general, which I agree with. Basically because it causes the sitation i'm posting about. A binary loses the ability to control the what/why/how of "code" artifacts going into the final output.

So, ultimately, i think we'll also want to add a flag to stop libraries from putting py files into runfiles, too. Actually, come to think of it, this might unlock some more options for a site-packages based layout? Since the binary would have a bit more control over where things end up.

Anyways, it looks like the net changes I need to make are:

  • Remove --precompile_add_to_runfiles
  • Change py_library et al to not put generated pyc in runfiles; only put in PyInfo
  • Always register a pyc creating action, where the file is put depends on the target-level settings
  • Create PyInfo.optional_pyc_files; this holds the implicitly created pyc files and is what pyc_collection collects.
  • For target-level precompile=enabled, put the pyc files in PyInfo.pyc_files; this acts as the "required" pyc files. This are always included in the binary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant