Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make local-sandbox output capture symlink oblivious #15211

Conversation

stuhood
Copy link
Sponsor Member

@stuhood stuhood commented Apr 21, 2022

Make local-sandbox output capture symlink oblivious, since we do not support REAPI-native symlinks, and don't intend to.

[ci skip-build-wheels]

…support REAPI-native symlinks.

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
@stuhood
Copy link
Sponsor Member Author

stuhood commented Apr 21, 2022

Hey folks: minor change with potentially large impacts. Interested in your thoughts on whether supporting capturing REAPI symlinks in local sandboxes will ever make sense.

It seems to me that we could support consuming/materializing REAPI symlinks without ever producing them ourselves in local sandboxes. But that could be annoying from an inability-to-roundtrip-reproducibly perspective.

@compyman
Copy link

Here's an example where having actually symlinks changes the process behavior
I have a node plugin: https://github.com/compyman/pants-node
which runs as separate processes

  1. npm install & 2) npm run pants:build
    in order to use the installed packages in the build the node_modules directory is included as the output_digest of the process. However, the method that npm uses to expose executable files to npm scripts relies on symlinking javascript files from node_modules/.bin/{executable}-> node_modules/{package_name}/bin/{executable}.js and then adding node_modules/.bin to the search path when running a script with npm run
    The symlinked executable files can reference other files in their package - these are all relative references to the location of the file in the package if the symlink is replaced with an actual file those relative references are no longer valid!

Here's a concrete example
node_modules/.bin/nuxt has something like

const suffix = require('../package.json').name.includes('-edge') ? '-edge' : ''

originally the script is a symlink to node_modules/nuxt/bin/nuxt.js so the require call resolves to node_modules/nuxt/package.json, which definitely exists. Once copied into the digest and the symlink is materialized that resolves to node_modules/package.json which does not exist!

let me know if you need any more info about this use case!

compyman added a commit to compyman/pants-node that referenced this pull request May 25, 2022
This PR adds a somewhat major change to how the pants-node plugin sets
up the environment in which it runs npm scripts. The goal is to better
support multiple projects with multiple package.jsons in a single
repo. The main necessary change is that we locate package.json with
the rest of the source files of the node_library instead of special
casing them.

a) Due to an issue with symlinks that was previously undiscovered due
to webpack behavior, we are no longer installing executables to the
`./node_modules/.bin` directory, this means the `'pants:build"` key in
the scripts object will need to call node directly with the path of
the executable script (e.g. `"node
node_modules/nuxt/bin/nuxt.js"` (this is what was symlinked
previously, but does not work with pants if the symlink references
relative paths) [More info
here](pantsbuild/pants#15211 (comment))

b) the previous code relied on package(-lock).json being in root of
the repository, this PR removes special handling of the
package(-lock).json files, they are now regular source files and part
of the `node_library` target

c) When running `npm run-script pants:build` the directory will
include all source files that have been re-rooted _to the node_package
target directory_ this means you can't meaningfully include files that
exist outside of the node_package directory in your build, I... don't
think this is a problem but it is something to keep in mind.

E.g. if before your node_package was defined in
src/delivery/dashboard/BUILD previously this plugin would evaluate
`npm run-script pants:build` in a directory containing package.json &
src/delivery/dsahboard/{all dependent source files & directories}
Now {all source files & directories} will be included in the
root (alongside package.json), this means that hopefully your
pants:build script is simpler/does not require weird configuration to
reach 3 directory layers deep.

d) source roots in output files will be stripped if the output_path is
not defined. This is so the bundles will overlay nicely with the
packaged python code, which generally strips source roots. If
output_path is defined output files will be directly appended to the
output_path value.

There is now a NodeSubsystem (pants terminology for configuration
settings) to allow configuration of how the pants-node plugin searches
for npm. Default configurations  are the same as previously except
/usr/local/bin is not searched by default. for more info run
`./pants help node` to display the full options & format

refrencing the changes above here's recommended process to upgrade
your targets to include these changes.

1) Include package.json & package-lock.json in your node_library
definitions.
2) update your 'pants:build' script key to directly reference the
appropriate executable
3) Make sure the output file path is correct & is included correctly
in the docker image you build

I have tested all this in the remit-srv codebase (that PR upcoming)
and it builds/includes both delivery dashboards successfully! You
truly can have it all.
compyman added a commit to waveremit/pants-node that referenced this pull request May 25, 2022
This PR adds a somewhat major change to how the pants-node plugin sets
up the environment in which it runs npm scripts. The goal is to better
support multiple projects with multiple package.jsons in a single
repo. The main necessary change is that we locate package.json with
the rest of the source files of the node_library instead of special
casing them.

a) Due to an issue with symlinks that was previously undiscovered due
to webpack behavior, we are no longer installing executables to the
`./node_modules/.bin` directory, this means the `'pants:build"` key in
the scripts object will need to call node directly with the path of
the executable script (e.g. `"node
node_modules/nuxt/bin/nuxt.js"` (this is what was symlinked
previously, but does not work with pants if the symlink references
relative paths) [More info
here](pantsbuild/pants#15211 (comment))

b) the previous code relied on package(-lock).json being in root of
the repository, this PR removes special handling of the
package(-lock).json files, they are now regular source files and part
of the `node_library` target

c) When running `npm run-script pants:build` the directory will
include all source files that have been re-rooted _to the node_package
target directory_ this means you can't meaningfully include files that
exist outside of the node_package directory in your build, I... don't
think this is a problem but it is something to keep in mind.

E.g. if before your node_package was defined in
src/delivery/dashboard/BUILD previously this plugin would evaluate
`npm run-script pants:build` in a directory containing package.json &
src/delivery/dsahboard/{all dependent source files & directories}
Now {all source files & directories} will be included in the
root (alongside package.json), this means that hopefully your
pants:build script is simpler/does not require weird configuration to
reach 3 directory layers deep.

d) source roots in output files will be stripped if the output_path is
not defined. This is so the bundles will overlay nicely with the
packaged python code, which generally strips source roots. If
output_path is defined output files will be directly appended to the
output_path value.

There is now a NodeSubsystem (pants terminology for configuration
settings) to allow configuration of how the pants-node plugin searches
for npm. Default configurations  are the same as previously except
/usr/local/bin is not searched by default. for more info run
`./pants help node` to display the full options & format

refrencing the changes above here's recommended process to upgrade
your targets to include these changes.

1) Include package.json & package-lock.json in your node_library
definitions.
2) update your 'pants:build' script key to directly reference the
appropriate executable
3) Make sure the output file path is correct & is included correctly
in the docker image you build

I have tested all this in the remit-srv codebase (that PR upcoming)
and it builds/includes both delivery dashboards successfully! You
truly can have it all.
compyman added a commit to waveremit/pants-node that referenced this pull request May 25, 2022
This PR adds a somewhat major change to how the pants-node plugin sets
up the environment in which it runs npm scripts. The goal is to better
support multiple projects with multiple package.jsons in a single
repo. The main necessary change is that we locate package.json with
the rest of the source files of the node_library instead of special
casing them.

a) Due to an issue with symlinks that was previously undiscovered due
to webpack behavior, we are no longer installing executables to the
`./node_modules/.bin` directory, this means the `'pants:build"` key in
the scripts object will need to call node directly with the path of
the executable script (e.g. `"node
node_modules/nuxt/bin/nuxt.js"` (this is what was symlinked
previously, but does not work with pants if the symlink references
relative paths) [More info
here](pantsbuild/pants#15211 (comment))

b) the previous code relied on package(-lock).json being in root of
the repository, this PR removes special handling of the
package(-lock).json files, they are now regular source files and part
of the `node_library` target

c) When running `npm run-script pants:build` the directory will
include all source files that have been re-rooted _to the node_package
target directory_ this means you can't meaningfully include files that
exist outside of the node_package directory in your build, I... don't
think this is a problem but it is something to keep in mind.

E.g. if before your node_package was defined in
src/delivery/dashboard/BUILD previously this plugin would evaluate
`npm run-script pants:build` in a directory containing package.json &
src/delivery/dsahboard/{all dependent source files & directories}
Now {all source files & directories} will be included in the
root (alongside package.json), this means that hopefully your
pants:build script is simpler/does not require weird configuration to
reach 3 directory layers deep.

d) source roots in output files will be stripped if the output_path is
not defined. This is so the bundles will overlay nicely with the
packaged python code, which generally strips source roots. If
output_path is defined output files will be directly appended to the
output_path value.

There is now a NodeSubsystem (pants terminology for configuration
settings) to allow configuration of how the pants-node plugin searches
for npm. Default configurations  are the same as previously except
/usr/local/bin is not searched by default. for more info run
`./pants help node` to display the full options & format

refrencing the changes above here's recommended process to upgrade
your targets to include these changes.

1) Include package.json & package-lock.json in your node_library
definitions.
2) update your 'pants:build' script key to directly reference the
appropriate executable
3) Make sure the output file path is correct & is included correctly
in the docker image you build

I have tested all this in the remit-srv codebase (that PR upcoming)
and it builds/includes both delivery dashboards successfully! You
truly can have it all.
compyman added a commit to waveremit/pants-node that referenced this pull request May 25, 2022
This PR adds a somewhat major change to how the pants-node plugin sets
up the environment in which it runs npm scripts. The goal is to better
support multiple projects with multiple package.jsons in a single
repo. The main necessary change is that we locate package.json with
the rest of the source files of the node_library instead of special
casing them.

a) Due to an issue with symlinks that was previously undiscovered due
to webpack behavior, we are no longer installing executables to the
`./node_modules/.bin` directory, this means the `'pants:build"` key in
the scripts object will need to call node directly with the path of
the executable script (e.g. `"node
node_modules/nuxt/bin/nuxt.js"` (this is what was symlinked
previously, but does not work with pants if the symlink references
relative paths) [More info
here](pantsbuild/pants#15211 (comment))

b) the previous code relied on package(-lock).json being in root of
the repository, this PR removes special handling of the
package(-lock).json files, they are now regular source files and part
of the `node_library` target

c) When running `npm run-script pants:build` the directory will
include all source files that have been re-rooted _to the node_package
target directory_ this means you can't meaningfully include files that
exist outside of the node_package directory in your build, I... don't
think this is a problem but it is something to keep in mind.

E.g. if before your node_package was defined in
src/delivery/dashboard/BUILD previously this plugin would evaluate
`npm run-script pants:build` in a directory containing package.json &
src/delivery/dsahboard/{all dependent source files & directories}
Now {all source files & directories} will be included in the
root (alongside package.json), this means that hopefully your
pants:build script is simpler/does not require weird configuration to
reach 3 directory layers deep.

d) source roots in output files will be stripped if the output_path is
not defined. This is so the bundles will overlay nicely with the
packaged python code, which generally strips source roots. If
output_path is defined output files will be directly appended to the
output_path value.

There is now a NodeSubsystem (pants terminology for configuration
settings) to allow configuration of how the pants-node plugin searches
for npm. Default configurations  are the same as previously except
/usr/local/bin is not searched by default. for more info run
`./pants help node` to display the full options & format

refrencing the changes above here's recommended process to upgrade
your targets to include these changes.

1) Include package.json & package-lock.json in your node_library
definitions.
2) update your 'pants:build' script key to directly reference the
appropriate executable
3) Make sure the output file path is correct & is included correctly
in the docker image you build

I have tested all this in the remit-srv codebase (that PR upcoming)
and it builds/includes both delivery dashboards successfully! You
truly can have it all.
@stuhood
Copy link
Sponsor Member Author

stuhood commented Jun 9, 2022

Thank you @compyman : that is a great example. I'll close this in favor of actually tackling #10271 at some point.

@stuhood stuhood closed this Jun 9, 2022
@stuhood stuhood deleted the stuhood/local-process-output-captures-is-symlink-oblivious branch June 9, 2022 23:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants