Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add forc BuildPlan. Change GitHub dependencies to general git support. Add Forc.lock for more deterministic builds. #825

Merged
merged 27 commits into from
Mar 4, 2022

Conversation

mitchmindtree
Copy link
Contributor

@mitchmindtree mitchmindtree commented Feb 22, 2022

OK, this should be ready for review!

Apologies for the massive PR! Originally I only intended to refactor forc_build so that the process of fetching dependencies and determining the package graph could be distinct from the process of compilation (see #798), however upon beginning I figured it made sense to also address #829 (remove GitHub-only git repo requirement) as the process of building the package graph is tightly coupled with the fetching of those packages. This lead to addressing #746 (adding proper namespace for forc's git checkouts) as the existing directory structure didn't make sense for more generalised git support. In order to allow for safely caching/re-using pre-checked-out builds, I realised I needed the ability to pin specific commits, so I also ended up addressing #799 as a result. In hindsight, I would have been better off splitting off #829 into a future PR 😅 but it's done now.

Edit: Keep in mind about ~2K of the additions are just the new Forc.lock files.


BuildPlan (#798)

The new BuildPlan type is an internal detail that represents a forc project's entire build plan. This includes the package graph, a map from all packages to a local copy of their src, and the compilation order determined by performing a toposort of the package graph.

The package Graph

The package graph represents the dependencies between all packages within the project. Each node represents a pinned package. The edge a -> b implies that a depends on b.

Pinning, Lock & Forc.lock (#799)

Pinning a dependency means fetching its source, determining the latest commit for the given branch (or the latest version that adheres to semver in the case of upcoming registry support) and then ensuring that we can deterministically make use of this same commit or version within future builds.

To achieve this, this PR introduces the Forc.lock file (directly inspired by Cargo.lock) along with a Lock structured representation. The Lock type is a toml-serialization-friendly version of the package graph that, when serialized to TOML and written to Forc.lock, acts as a human-readable, git-diff-friendly representation of the project's graph of pinned dependencies.

By using a lock file to represent pinned dependencies, we can allow for better reproducibility in forc projects without requiring that users write the exact pinned version or commit hashes manually. Instead, users can run forc update to update the commit (or version) for each pinned dependency automatically.

forc build will automatically create a new Forc.lock file if one didn't exist, or if the existing file doesn't match the current set of dependencies in the Forc.toml manifest. Both forc update and forc build will print a diff of the Lock when updating the Forc.lock file. This involves first printing removed packages in red, and added packages in green.

GitHub -> git (#829, #746)

This PR removes all the GitHub-specific code under handling of git dependencies in favour of more general git support using the git2 crate.

Previously, forc used the version field for git dependencies to imply a GitHub "release". The GitHub REST API was used to fetch this release. This PR removes this behaviour, as version doesn't really have any particular meaning under git itself. Instead, this adds support for pinned git branches and git tags.

Unfortunately, git2 does depend on openssl cc @adlerjohn. It looks like there's some recent interest in landing a rustls+hyper backend discussed here along with a WIP that could potentially resolve this dependency in the future.

E2E tests & examples

For now, examples and E2E tests have been updated to use git tags instead. However, following this PR we should consider updating all examples and E2E tests to point to master and instead commit the Forc.lock file to pin each project's versions.

One issue that updating the tests and examples has revealed is that currently, the std dependency doesn't actually specify a pinned version (or tag) for core. As a result, all E2E tests and examples that use some version or tag instead of the master branch for the core dependency end up with two versions of core in their Forc.lock file: 1. the core master branch as defaulted to in the std library's dependencies and 2. the core version listed under the top-level project's dependencies. Fortunately this doesn't seem to cause any issues in most examples. Those few that did have issues have been updated.

Closes #746.
Closes #798.
Closes #799.
Closes #829.


TODO

Follow-up

  • Open issue for forc build --plan-only flag or similar that outputs the pkg_graph via JSON or DOT. For now, downstream can always reconstruct the graph from the Forc.lock, but it's not particularly convenient.
  • Add ability to update only specific packages throughout graph with forc update --package <name>.

mitchmindtree added a commit that referenced this pull request Feb 22, 2022
While working on #825 I noticed that that the `dependency_graph` is
unused throughout `sway-core`. I think the reason why cargo doesn't emit
any warnings about this is that at one point, it's stored in the public
`TypeCheckArgument` struct, so cargo believes that it is intentionally
being re-exported, however in reality we don't use it anywhere else in
the fuel/sway/forc ecosystem.
mitchmindtree added a commit that referenced this pull request Feb 22, 2022
While working on #825 I noticed that that the `dependency_graph` is
unused throughout `sway-core`. I think the reason why cargo doesn't emit
any warnings about this is that at one point, it's stored in the public
`TypeCheckArgument` struct, so cargo believes that it is intentionally
being re-exported, however in reality we don't use it anywhere else in
the fuel/sway/forc ecosystem.
@mitchmindtree mitchmindtree force-pushed the mitchmindtree/forc-build-plan branch 2 times, most recently from 8bef3c2 to 95a68b5 Compare February 22, 2022 10:26
@mitchmindtree mitchmindtree self-assigned this Feb 22, 2022
mitchmindtree added a commit that referenced this pull request Feb 22, 2022
* Remove the unused `&mut dependency_graph` throughout `sway-core`

While working on #825 I noticed that that the `dependency_graph` is
unused throughout `sway-core`. I think the reason why cargo doesn't emit
any warnings about this is that at one point, it's stored in the public
`TypeCheckArgument` struct, so cargo believes that it is intentionally
being re-exported, however in reality we don't use it anywhere else in
the fuel/sway/forc ecosystem.

* Remove trailing commas uncaught by rustfmt

* Remove unused `dependency_graph` from sway-core selector_debug feature
@mitchmindtree mitchmindtree force-pushed the mitchmindtree/forc-build-plan branch 2 times, most recently from 13f2cf4 to fecb92e Compare February 22, 2022 23:29
mitchmindtree added a commit that referenced this pull request Feb 23, 2022
This is in preparation for addressing #799 and makes landing #825 a
little easier for the reasons described in #829.

We might wish to consider keeping these dependencies pointed at the
master branch, however this would be made easier by #830.
mitchmindtree added a commit that referenced this pull request Feb 23, 2022
This is in preparation for addressing #799 and makes landing #825 a
little easier for the reasons described in #829.

We might wish to consider keeping these dependencies pointed at the
master branch, however this would be made easier by #830.
forc/src/pkg.rs Outdated Show resolved Hide resolved
forc/src/lock.rs Outdated Show resolved Hide resolved
forc/src/lock.rs Outdated Show resolved Hide resolved
forc/src/pkg.rs Show resolved Hide resolved
forc/src/pkg.rs Outdated Show resolved Hide resolved
@mitchmindtree mitchmindtree force-pushed the mitchmindtree/forc-build-plan branch 3 times, most recently from 05be535 to 806fb5c Compare February 28, 2022 09:25
@mitchmindtree mitchmindtree changed the title Construct pkg_graph at beginning of forc build Add forc BuildPlan. Change GitHub dependencies to general git support. Add Forc.lock for more deterministic builds. Feb 28, 2022
@mitchmindtree mitchmindtree force-pushed the mitchmindtree/forc-build-plan branch 7 times, most recently from 8bd7ebd to d896d0c Compare March 2, 2022 06:33
@mitchmindtree mitchmindtree marked this pull request as ready for review March 2, 2022 06:39
@mitchmindtree mitchmindtree requested a review from sezna March 2, 2022 06:39
Addresses #829.

Also allows for putting off #831 while we discuss #830.
None of the `std` releases so far pin `core` to a version. As a result,
it pulls the `core` master branch by default, despite the top-level
project also dependending on `core` at a specific version. Most E2E
tests seemt to work regardless, but this subset requires updating.
Following the lock file work landing, we should update versioned `std`
releases to refer to versioned `core` via git tag.
Previously if any of the pkgs that exist in `Forc.lock` were not already
available in `~/.forc/git/checkouts`, we would update the entire lock
file and re-fetch all dependencies. Now, if the local copy of a pkg is
missing, we just fetch that individual pkg and continue.

This commit also updates the `fetch_pkgs` code to re-use pre-fetched git
commits. Previously, all packages would be re-fetched when the
`Forc.lock` file was updated. Now, we only fetch if the path for that
commit doesn't already exist.
Previously one namespace was used to collect *all* items. This meant
that when passed to the package that was being compiled, the namespace
contained not only the names for its dependencies, but sometimes other
unrelated names that happened to be earlier in the compilation order.

This commit fixes this issue, creating a fresh namespace purely for each
package in order to collect only its dependencies. This fixes some
shadowing errors that were appearing in the E2E tests.
As a follow-up, we should update all of the tests to depend on `master`
and then update the lock files. This will save us from having to change
versions tags to pin all the time.

Note: You can clear all test lock files with

```
rm ./test/src/e2e_vm_tests/test_programs/*/Forc.lock
```
We no longer use the GitHub REST API and in turn no longer map the
`version` field to GitHub "release"s. Instead, we now support git more
generally. Warn about appearances of the `version` field in dependencies
and suggest using `branch` or `tag` as an alternative.
@mitchmindtree
Copy link
Contributor Author

OK, just rebased onto master again now that #860 has landed.

@JoshuaBatty JoshuaBatty self-requested a review March 3, 2022 04:25
@mitchmindtree mitchmindtree requested review from JoshuaBatty and removed request for JoshuaBatty March 3, 2022 04:55
Copy link
Member

@JoshuaBatty JoshuaBatty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Epic!! LGTM but i'll let some others chime in as it's a pretty big one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
3 participants