doc(pkg): Explain package management #10950

Open: wants to merge 2 commits into base: main

Leonidas-from-XIV
Collaborator

The explanation part of dune package management, split out following the feedback of #10920, aimed at users who want a deeper understanding of how the system works under the hood.

* Automatic package repository updates
* Easily reproducible dependencies
* All package dependencies declared in a single file that is kept in-sync
* Per-project dependencies
Member

I don't see what's new in that list from what opam can already do (apart from automatic package repository updates, which can be seen as quite bad from a reproducibility point of view if you don't have a lockfile).

Instead of comparing opam vs. dune pkg, I would stress the design principle. A random list of stuff:

  • making the OCaml environment setup trivial;
  • promoting a lockfile-first approach to address reproducibility use cases;
  • unifying the configuration files (and removing hidden global states managed by a CLI tool);
  • improving cross-packages and vendoring workflows;
    and much more.

Maybe the central change could be explained separately from the consequences of that change. Say:

The central change in dune's package management is the idea that all information necessary to build a repository lives in the repository, not in unversioned state like opam switches, whether global or repo-local. This is on par with what happens in other language ecosystems, and has the following beneficial properties:

  • excellent support for reproducible builds
  • building a project using this tooling is just dune build, no knowledge necessary
  • etc

Member

Here are a few principles that I've had in mind when designing this feature:

  1. No global state visible to the user
  2. All package configuration must be done via config file updates (dune-project or dune-workspace). No stateful commands such as opam pin.
  3. "Automatic package repository updates" also applies to pins. For example, we re-fetch branches & tags.
  4. Package builds can only access packages which are listed as dependencies.
  5. Users do not need to learn yet another file format to configure their workspace. Everything should be doable via workspaces.
  6. Build plans are independently versioned so that they can be interpreted in the same way between different versions of dune

packages to install from the `depends` stanza in the `dune-project` file. This
allows projects to completely omit generation of `.opam` files, as long as they
use Dune for package management. Thus all dependencies are only declared in one
file.
Member

A quick remark: if you are not generating .opam files anymore, then your package is no longer pinnable by opam. There's an issue to fix this in the opam tracker (i.e. allow generating opam files on the fly).

refer to files. That means, they cannot read the path they are being built or
installed into and expect this path to be stable. Dune builds packages in a
sandbox location and after the build has finished it moves the files to the
actual destination.
Member

Maybe a good place to explain the difference between Dune and Opam sandboxing models?

Collaborator Author

The main difference is that we build in a different path than the one we install to. As far as I know, the fact that we don't wrap things in bwrap/sandbox-exec is not a design decision, just something we haven't implemented yet. It is a good idea and shouldn't be too difficult, especially with opam having made sure that packages generally work in its sandbox.

@samoht
Member

samoht commented Sep 24, 2024

I've had a quick read and left a few comments. I'm also wondering if we should highlight the Dune cache a bit more here: that's one of the killer features of dune pkg, giving instant builds of most of your dependencies (including the compiler), as they can be re-used between projects.

After solving is done, the solution gets written into the lock directory with
all the data that is necessary to build and install the packages. From this
point, the project contains all necessary information and does not need to
access any package repositories.
Collaborator

It might be interesting to have a terminology specification somewhere. When I read the sentence, it seems wrong, as we will later install packages. The distinction between package repositories and the packages we installed should be somewhere in an index, IMO.

Collaborator Author

I am unsure what you consider untrue about it. I've elaborated that section, also noting that it does not download sources. Is it better now?

Collaborator

Sorry for the delay, as it was marked as outdated I didn't see it. The package repositories can be confusing: we won't access opam-repository for the metadata, but it will need to access the server where you host the package to fetch the tarball. It could be considered as a package repository too. Hence, my question about the terminology ^^

Signed-off-by: Marek Kubica <marek@tarides.com>
Comment on lines +48 to +50
Since a while Dune has also supported {doc}`opam file generation
</howto/opam-file-generation>` by specifying the package dependencies in the
`dune-project`. Outside of this feature, Dune had not used the `depends` stanza.
Collaborator

There's something off with the beginning of this sentence. Do you mean: "For a while, Dune has also supported..." or "Since Dune has supported...."

Comment on lines +207 to +208
refer to files. That means, they cannot read the path they are being built or
installed into and expect this path to be stable. Dune builds packages in a
Collaborator

Suggested change
refer to files. That means, they cannot read the path they are being built or
installed into and expect this path to be stable. Dune builds packages in a
refer to files. This means they cannot rely on the path they are being built or installed in, as it may not remain consistent. Dune builds packages in a

Is this still accurate? If so, it improves readability and clarity. If not, let's work on clarifying it together.

Comment on lines +216 to +218
To sidestep these restructions in many cases the solution is to use relative
paths, as Dune guarantees that packages installed into different sections are
installed in a way where their relative location stays the same.
Collaborator

Suggested change
To sidestep these restructions in many cases the solution is to use relative
paths, as Dune guarantees that packages installed into different sections are
installed in a way where their relative location stays the same.
To work around these restrictions, a common solution is to use relative paths. Dune ensures that packages installed in different sections maintain the same relative location, allowing for more flexibility.

Is this still accurate?

v-gb commented Oct 1, 2024

I wouldn't say "side step" or "work around" to refer to the expected way of satisfying a condition. Maybe:

To comply with this requirement, the usual solution ...

Collaborator Author

I like the "comply" wording because it makes it sound positive, whereas "work around" and "side step" sound like there is a bug that people have to try to avoid. I would argue that avoiding absolute paths is a good thing in general and makes the code better even without Dune package management.

Collaborator

christinerose left a comment

A few suggestions for potentially clarifying sentences.

installed in a way where their relative location stays the same.

A minor difference is that Dune does not support packages installing themselves
into the standard library, thus being available without having to be declared a
Collaborator

Suggested change
into the standard library, thus being available without having to be declared a
into the standard library, so they're available without having to be declared a

Does this work better?

Comment on lines +3 to +7
This document gives an explanation on how the new package management
feature introduced in Dune works under the hood. It requires a bit of
familiarity with how opam repositories work and how Dune builds packages. Thus
it is aimed at people who want to understand how the feature works, not how it
is used.

Suggested change
This document gives an explanation on how the new package management
feature introduced in Dune works under the hood. It requires a bit of
familiarity with how opam repositories work and how Dune builds packages. Thus
it is aimed at people who want to understand how the feature works, not how it
is used.
This document explains how Dune's package management works
under the hood. It requires a bit of familiarity with how opam
repositories work and how Dune builds packages. Thus it is aimed at people
who want to understand how the feature works, not how it is used.

A bit terser, and without the "new" of "new package management feature", as these qualifiers tend to get stale.

Comment on lines +52 to +56
The package management feature changes this, as Dune now determines the list of
packages to install from the `depends` stanza in the `dune-project` file. This
allows projects to completely omit generation of `.opam` files, as long as they
use Dune for package management. Thus all dependencies are only declared in one
file.

Suggested change
The package management feature changes this, as Dune now determines the list of
packages to install from the `depends` stanza in the `dune-project` file. This
allows projects to completely omit generation of `.opam` files, as long as they
use Dune for package management. Thus all dependencies are only declared in one
file.
Dune with package management instead computes the list of
packages to install from the `depends` stanza in the `dune-project` file. This
allows projects to completely omit generation of `.opam` files, as long as they
use Dune for package management. Thus all dependencies are only declared in one
file.

I don't think the last sentence is right though.
Dependencies are declared both in dune files and in dune-project. And usually the opam file would be generated from the dune-project regardless, so the duplication is not necessarily user-facing.
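
A minimal sketch of the two places this comment refers to, assuming a hypothetical project called `myapp` with a single executable in `bin/` that depends on `cmdliner`. The project name, directory layout, dependency, and version bounds are illustrative assumptions, not taken from the PR:

```dune
; dune-project: package-level dependencies, read when computing a solution
(lang dune 3.16)
(package
 (name myapp)
 (depends
  (ocaml (>= 4.14))
  cmdliner))

; bin/dune: library-level dependencies, read when compiling the executable
(executable
 (name main)
 (public_name myapp)
 (libraries cmdliner))
```

The `depends` field names opam packages, while `libraries` names the libraries those packages install, so the two lists overlap in content but are not the same declaration.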

Comment on lines +58 to +61
For compatibility with a larger amount of existing projects, Dune will also
collect dependencies from `.opam` files in the project. So while recommended,
there is no obligation to switch to declaring dependencies in the
`dune-project`. Likewise the generation of `.opam` files will still work.

This seems like user-facing doc, doesn't it?

Comment on lines +65 to +75
To go from a project's set of dependency constraints to a set of installed
packages and versions, there needs to be a step to determine the right packages
and their versions to be installed.

In `opam`, this process happens as part of `opam install`, which links finding
a solution that satifies the given constrains and installation into one step.
Dune on the other hand separates the steps of finding a solution and installing.
First a solution is found and then packages are installed.

The idea of finding a solution and recording it for later is popular in other
programming language package managers like NPM and is usually called locking.

Suggested change
To go from a project's set of dependency constraints to a set of installed
packages and versions, there needs to be a step to determine the right packages
and their versions to be installed.
In `opam`, this process happens as part of `opam install`, which links finding
a solution that satifies the given constrains and installation into one step.
Dune on the other hand separates the steps of finding a solution and installing.
First a solution is found and then packages are installed.
The idea of finding a solution and recording it for later is popular in other
programming language package managers like NPM and is usually called locking.
Given the list of the project's transitive dependencies and their version constraints, the next steps are:
- figure out a version for each dependency that follows the constraints
- for each dependency, download it, build it, and make it available to the project
In `opam`, `opam install` does both of these.
In `dune`, these are separate steps: the first one is `dune pkg lock`, and the second one happens implicitly as part of building. The idea of finding a solution and recording it for later is popular in other programming language package managers like NPM and is usually called locking.

Suggestion for something terser.

Comment on lines +123 to +125
However, it is also possible to specify specific revisions of the repositories,
to get a reproducible solution. Due to using Git, any previous revision of the
repository can be used by specifying a commit hash.

This wording is confusing, I'm not sure what this is saying. Maybe this?

Suggested change
However, it is also possible to specify specific revisions of the repositories,
to get a reproducible solution. Due to using Git, any previous revision of the
repository can be used by specifying a commit hash.
Instead of specifying a package from the metadata repository, it is also possible to specify a git url + git hash.

Or is it talking about metadata repositories, maybe?

Collaborator

Here it is in the context of opam-repository so metadata repository or "package repository".

Comment on lines +178 to +179
* Build rules to evaluate the build instructions from the build instructions
stored in the lock directory

Suggested change
* Build rules to evaluate the build instructions from the build instructions
stored in the lock directory
* Build rules to execute the build instructions stored in the lock directory

Comment on lines +183 to +186
Creating these processes as rules mean that they will only be executed on
demand, so if the project has already downloaded the sources, it does not
need to download them again. Likewise, if packages are installed, they stay
installed.

Suggested change
Creating these processes as rules mean that they will only be executed on
demand, so if the project has already downloaded the sources, it does not
need to download them again. Likewise, if packages are installed, they stay
installed.
Executing these steps as build rules allows them to be run on demand and cached, even across projects. So building a fraction of a project requires building only the necessary dependencies. And if two repositories have some dependencies in common, those dependencies will be downloaded and built only once, not twice.

I think this is true, assuming the shared cache is enabled, right? I ask because it seems to contradict the next paragraph below.

Comment on lines +193 to +195
When building the users project, the installed packages are added to the
necessary search paths, so user code can use the dependencies without any
additional configuration.

Not sure this is worth saying, that seems clear to me?

Comment on lines +224 to +226
For this reason, the `overlay` repository exists, which contains packages where
the upstream packages are incompatible with Dune package management but were
patched to work in Dune.

Suggested change
For this reason, the `overlay` repository exists, which contains packages where
the upstream packages are incompatible with Dune package management but were
patched to work in Dune.
The `overlay` repository exists specifically to make such packages
compatible with Dune's package management.

Labels: docs (Documentation improvements), package management
6 participants