Speedup isolated environment creation #11257

Merged 1 commit on Jul 15, 2022
3 changes: 3 additions & 0 deletions news/11257.feature.rst
@@ -0,0 +1,3 @@
+Significantly speed up isolated environment creation, by using the same
+sources for pip instead of creating a standalone installation for each
+environment.
48 changes: 33 additions & 15 deletions src/pip/_internal/build_env.py
@@ -7,7 +7,6 @@
 import pathlib
 import sys
 import textwrap
-import zipfile
 from collections import OrderedDict
 from sysconfig import get_paths
 from types import TracebackType
@@ -29,6 +28,29 @@

 logger = logging.getLogger(__name__)

+PIP_RUNNER = """
+import importlib.util
+import os
+import runpy
+import sys
+
+
+class PipImportRedirectingFinder:
+
+    @classmethod
+    def find_spec(cls, fullname, path=None, target=None):
+        if not fullname.startswith("pip."):
+            return None
+
+        # Import pip from the current source directory
+        location = os.path.join({source!r}, *fullname.split("."))
+        return importlib.util.spec_from_file_location(fullname, location)
+
+
+sys.meta_path.insert(0, PipImportRedirectingFinder())
+runpy.run_module("pip", run_name="__main__")
+"""
+

 class _Prefix:
     def __init__(self, path: str) -> None:
@@ -42,29 +64,25 @@ def __init__(self, path: str) -> None:


 @contextlib.contextmanager
-def _create_standalone_pip() -> Generator[str, None, None]:
-    """Create a "standalone pip" zip file.
+def _create_runnable_pip() -> Generator[str, None, None]:
+    """Create a "pip runner" file.

-    The zip file's content is identical to the currently-running pip.
+    The runner file ensures that import for pip happens using the currently-running pip.
     It will be used to install requirements into the build environment.
     """
     source = pathlib.Path(pip_location).resolve().parent

-    # Return the current instance if `source` is not a directory. We can't build
-    # a zip from this, and it likely means the instance is already standalone.
+    # Return the current instance if `source` is not a directory. It likely
+    # means that this copy of pip is already standalone.
pfmoore (Member) commented on Jul 15, 2022:

With the runner, pip_location will point to the current source. So this test won't work as expected, and we'll create a second runner hook. Can we do something by looking at pip.__spec__?

In practice, I don't think having multiple runner hooks is a big issue, so I'm OK if we don't worry too much (although maybe note that we've ignored the issue in a comment, to help people looking at this code in future 😉)

pradyunsg (Member Author) commented on Jul 15, 2022:

Well... this check is still useful for the zipapp case.

I'd prefer to defer the reuse of the runner script for a follow-up though. We should be able to make the runner script into a module and call that from the sources directly (I consider inlining code in a string of a module to be a code smell) -- avoiding the need to create a runner script on environment creation in the first place. :)

Member Author:

I've filed #11262, which does this FWIW.
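
For readers unfamiliar with the zipapp case mentioned in this thread, here is a minimal sketch of why an `is_dir()` check can tell the two layouts apart. It mirrors the `pathlib.Path(pip_location).resolve().parent` computation from the diff, but `demo_pkg`, `demo.pyz`, `source_root`, and `demo` are hypothetical stand-ins for illustration, not pip code.

```python
import os
import pathlib
import tempfile
import zipfile


def source_root(package_dir):
    """Parent of the resolved package directory, as computed in the diff."""
    return pathlib.Path(package_dir).resolve().parent


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        # Case 1: the package is unpacked on disk, so its parent is a directory.
        site = os.path.join(tmp, "site")
        os.makedirs(os.path.join(site, "demo_pkg"))
        on_disk = source_root(os.path.join(site, "demo_pkg"))

        # Case 2: the package lives inside an archive, so the "parent" of the
        # package path is the archive file itself.
        archive = os.path.join(tmp, "demo.pyz")
        with zipfile.ZipFile(archive, "w") as zf:
            zf.writestr("demo_pkg/__init__.py", "")
        in_zip = source_root(os.path.join(archive, "demo_pkg"))

        return on_disk.is_dir(), in_zip.is_dir(), in_zip.is_file()
```

In the archive case the computed path is a regular file, which is the situation the early `yield str(source)` branch handles.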

     if not source.is_dir():
         yield str(source)
         return

     with TempDirectory(kind="standalone-pip") as tmp_dir:
-        pip_zip = os.path.join(tmp_dir.path, "__env_pip__.zip")
-        kwargs = {}
-        if sys.version_info >= (3, 8):
-            kwargs["strict_timestamps"] = False
-        with zipfile.ZipFile(pip_zip, "w", **kwargs) as zf:
-            for child in source.rglob("*"):
-                zf.write(child, child.relative_to(source.parent).as_posix())
-        yield os.path.join(pip_zip, "pip")
+        pip_runner = os.path.join(tmp_dir.path, "__pip-runner__.py")
+        with open(pip_runner, "w", encoding="utf8") as f:
+            f.write(PIP_RUNNER.format(source=os.fsdecode(source)))
+        yield pip_runner
pradyunsg marked this conversation as resolved.
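
The template-and-write step in this hunk can be illustrated with a stripped-down sketch. Only the `{source!r}` formatting idea is taken from the diff; `RUNNER_TEMPLATE`, `__demo-runner__.py`, and `write_and_run_runner` are hypothetical names, and the generated script just echoes the embedded path instead of running pip.

```python
import os
import subprocess
import sys
import tempfile

# The !r conversion inserts repr(source), i.e. a valid Python string literal,
# so backslashes and quotes in the path survive the round trip.
RUNNER_TEMPLATE = """
import sys

SOURCE = {source!r}
sys.stdout.write(SOURCE)
"""


def write_and_run_runner(source):
    """Write a tiny runner script into a temp dir, run it, return its stdout."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        runner = os.path.join(tmp_dir, "__demo-runner__.py")
        with open(runner, "w", encoding="utf8") as f:
            f.write(RUNNER_TEMPLATE.format(source=source))
        result = subprocess.run(
            [sys.executable, runner],
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout
```

Writing a few lines of text like this is cheap compared with walking the whole source tree to build an archive, which is the kind of saving the news entry describes.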


 class BuildEnvironment:
@@ -206,7 +224,7 @@ def install_requirements(
         if not requirements:
             return
         with contextlib.ExitStack() as ctx:
-            pip_runnable = ctx.enter_context(_create_standalone_pip())
+            pip_runnable = ctx.enter_context(_create_runnable_pip())
             self._install_requirements(
                 pip_runnable,
                 finder,
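
As a rough, self-contained illustration of the import-redirection idea behind `PIP_RUNNER`, the sketch below installs a meta path finder that resolves one package from a chosen source directory. It is not the code from this PR: it uses `importlib.machinery.PathFinder` rather than the `importlib.util.spec_from_file_location` call in the diff, because `spec_from_file_location` picks a loader from the file suffix of the location it is given, while `PathFinder` searches a directory for either a package or a module. The names `make_redirecting_finder`, `demo`, and `demo_pkg` are hypothetical.

```python
import importlib
import os
import sys
import tempfile
from importlib.machinery import PathFinder


def make_redirecting_finder(package, source_root):
    """Build a meta path finder that resolves `package` from `source_root`."""

    class RedirectingFinder:
        @classmethod
        def find_spec(cls, fullname, path=None, target=None):
            # Only the top-level package needs redirecting; its submodules
            # are then located through the package's own __path__.
            if fullname != package:
                return None
            return PathFinder.find_spec(fullname, [source_root], target)

    return RedirectingFinder


def demo():
    with tempfile.TemporaryDirectory() as root:
        # Lay out a throwaway package that is NOT on sys.path.
        pkg_dir = os.path.join(root, "demo_pkg")
        os.makedirs(pkg_dir)
        with open(os.path.join(pkg_dir, "__init__.py"), "w", encoding="utf8") as f:
            f.write("VALUE = 42\n")
        with open(os.path.join(pkg_dir, "sub.py"), "w", encoding="utf8") as f:
            f.write("NAME = __name__\n")

        finder = make_redirecting_finder("demo_pkg", root)
        sys.meta_path.insert(0, finder)
        try:
            sub = importlib.import_module("demo_pkg.sub")
            pkg = sys.modules["demo_pkg"]
            loaded_from_root = os.path.realpath(pkg.__file__).startswith(
                os.path.realpath(root)
            )
            return pkg.VALUE, sub.NAME, loaded_from_root
        finally:
            # Undo the global side effects so the demo can be re-run.
            sys.meta_path.remove(finder)
            sys.modules.pop("demo_pkg.sub", None)
            sys.modules.pop("demo_pkg", None)
```

Placing the finder at index 0 of `sys.meta_path` means it is consulted before the default finders, so the chosen directory wins even if another copy of the package is importable.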