
treewide: drop -l$NIX_BUILD_CORES #192447

Merged 1 commit into NixOS:staging on Sep 22, 2022
Conversation

@grahamc (Member) commented Sep 22, 2022

Passing `-l$NIX_BUILD_CORES` improperly limits the overall system load.

For a build machine which is configured to run `$B` builds where each build gets `total cores / B` cores (`$C`), passing `-l $C` to make will improperly limit the load to `$C` instead of `$B * $C`.

This effect becomes quite pronounced on machines with 80 cores, with 40 simultaneous builds and a cores limit of 2. On a machine with this configuration, Nix will run 40 builds and make will limit the overall system load to approximately 2. A build machine with this many cores can happily run with a load approaching 80.
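
To make the arithmetic concrete, here is a small shell sketch (illustrative numbers from the example above; the variables are mine, and the make invocation is approximate):

# Each build used to invoke roughly: make -j$NIX_BUILD_CORES -l$NIX_BUILD_CORES
# With B = 40 simultaneous builds and C = 2 cores per build:
B=40; C=2
echo "intended aggregate parallelism: $((B * C))"   # 80
echo "load ceiling enforced by each make: $C"       # ~2, and it is system-wide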

A non-solution is to oversubscribe the machine by picking a larger `$C`. However, there is no way to divide the number of cores in a way which fairly subdivides the available cores when `$B` is greater than 1.

There has been exploration of passing a jobserver into the sandbox, or sharing a jobserver between all the builds. This is one option, but it is relatively complicated and only supports make. Lots of other software uses its own implementation of `-j` and doesn't support either `-l` or the make jobserver.
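
For context, GNU make's jobserver is essentially a shared pool of job tokens. A minimal sketch of the token protocol in shell (assuming GNU make >= 4.4's named-pipe flavour; real make advertises the pipe via --jobserver-auth in MAKEFLAGS):

mkfifo /tmp/jobserver
exec 3<>/tmp/jobserver        # keep both ends open so tokens persist
printf '+++++++' >&3          # 7 tokens; the caller itself is the 8th slot
read -r -n1 token <&3         # acquire a slot (blocks when the pool is empty)
# ... run one parallel job here ...
printf '%s' "$token" >&3      # release the slot for other workers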

For the case of an interactive user machine, the user should limit overall system load using `$B`, `$C`, and optionally systemd's cpu/network/io limiting features.
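
As a sketch of the systemd route for a machine running nix-daemon (directive values are illustrative, not recommendations; on NixOS the same can be expressed through the module system):

# Constrain all daemon-run builds with a drop-in unit:
mkdir -p /etc/systemd/system/nix-daemon.service.d
cat > /etc/systemd/system/nix-daemon.service.d/limits.conf <<'EOF'
[Service]
# Deprioritize builds relative to interactive work:
CPUWeight=20
# Apply memory pressure to builds before they starve the rest of the system:
MemoryHigh=16G
IOWeight=20
EOF
systemctl daemon-reload && systemctl restart nix-daemon.service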

Making this change should significantly improve the utilization of our build farm, and improve the throughput of Hydra.

Description of changes
Things done
  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandbox = true set in nix.conf? (See Nix manual)
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 22.11 Release Notes (or backporting 22.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
    • (Release notes changes) Ran nixos/doc/manual/md-to-db.sh to update generated release notes
  • Fits CONTRIBUTING.md.

@github-actions bot added the labels 6.topic: python, 6.topic: stdenv, and 6.topic: TeX on Sep 22, 2022
@lheckemann (Member)

(previously: #174473)

@vcunat (Member) commented Sep 22, 2022

I'm not a fan of this, but OK I hope. It does solve a real problem with how Hydra.nixos.org builders are set up.

It probably makes things worse for people who want(ed) to combine small and big builds quickly on the same machine with many cores (--max-jobs $(nproc) --cores $(nproc)).

After merging this, we could try to rethink some other approach, possibly based on PR #184886. Also, we might see more real-life feedback in the meantime.

@grahamc (Member, Author) commented Sep 22, 2022

More robust designs for load limiting are probably a good thing to explore. Let's go ahead and merge this once ofborg is green, and if it causes problems we can revert -- no sweat.

@jonringer (Contributor) left a comment

I think the proper path forward would be to introduce a NIX_MAX_CORES or some similar option which would default to $JOBS * $CORES. There are many cases where you want to run many jobs, as most builds have long single-threaded sections, but it can be really detrimental to have many jobs suddenly spike in thread + RAM usage.
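
A hypothetical sketch of those semantics (NIX_MAX_CORES is not an existing Nix option; all names and numbers here are illustrative):

# Global thread budget defaults to jobs * cores; each running build draws a share.
JOBS=8; CORES=4
NIX_MAX_CORES=${NIX_MAX_CORES:-$((JOBS * CORES))}                      # default: 32
ACTIVE_BUILDS=3
echo "this build may use $((NIX_MAX_CORES / ACTIVE_BUILDS)) threads"   # 10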

@grahamc merged commit 1379da1 into NixOS:staging on Sep 22, 2022
@grahamc deleted the drop-l branch on September 22, 2022 23:06
@grahamc (Member, Author) commented Sep 22, 2022

If this causes problems, let's consider a revert and see what happens :).

@vcunat (Member) commented Sep 24, 2022

This was pushed with what is IMO a wrong description about the system load getting limited to 2, but it's not worth changing history now.

Anyway, I materialized my ideas about an imperfect solution into PR #192799 (doing better than limiting by load seems hard).

@jonringer (Contributor)

I created issue NixOS/nix#7091 for getting a longer-term solution in Nix.

@vcunat (Member) commented Oct 2, 2022

My suggestion isn't getting support so far, if I understand it correctly. But note that on NixOS, the default nix configuration for the local machine effectively sets both values to $(nproc), and thus the overall limit is quadratic, so the current combination on nixpkgs master seems rather bad for large machines.

Details: the default nix.conf will contain

cores = 0
max-jobs = auto
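
Concretely (a worked example, not a measurement): cores = 0 lets every build use all cores, and max-jobs = auto allows up to $(nproc) concurrent builds, so without -l the worst-case process count grows quadratically with core count:

cores=64                    # e.g. a 64-core machine, where nproc = 64
echo $(( cores * cores ))   # up to 4096 concurrent compiler processes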

@ElvishJerricco (Contributor)

Much of what I do with NixOS involves frequently performing fairly massive rebuilds on my workstation, and this change has made that substantially more unpleasant. I don't know what the solution is but I wanted to chime in to say that this is a significant problem for my personal workflow.

@lheckemann (Member)

FWIW @edolstra has some work in progress on cgroup support in Nix. This may help with this kind of resource control problem. In the meantime, @ElvishJerricco maybe setting cores and max-jobs to the square root of the number of cores you have, or a little more, is a reasonable compromise?
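
A sketch of that compromise (the ceil-style rounding is my own choice):

# Set cores and max-jobs to ~sqrt(nproc); e.g. nproc = 32 gives 6 and 6:
n=$(nproc)
s=$(awk -v n="$n" 'BEGIN { printf "%d", sqrt(n) + 0.999 }')
nix-build '<nixpkgs>' -A hello --max-jobs "$s" --cores "$s"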

@ElvishJerricco (Contributor) commented Nov 18, 2022

@lheckemann I do not think that would be a reasonable utilization of my hardware (EDIT: for reference, 16c/32t, so high enough that I can get a lot of parallelism out of it, though not nearly as high as many build servers). For instance, one thing I find myself doing often is rebuilding NixOS with a new variation of systemd. Such builds come with long periods where a single derivation could be using all my cores, and long periods where there are many derivations building at the same time. There is no single cores and max-jobs combination that will allow me to fully utilize my hardware for these builds without make -l $N.

Similarly, cgroups would not help, as I understand it. They can't stop make from spawning new processes, and each of those uses memory; enough that yesterday my 64GB desktop OOM'd and killed my builds multiple times (while my desktop was going completely unresponsive). Cgroups could be used to limit the memory usage of a build, but that will result in failures instead of builds simply choosing to use less memory by spawning fewer processes.
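
For reference, the cgroup-level distinction behind this point (illustrative values; neither knob feeds back into make's -j decisions, which is exactly the problem described):

# Hard cap: breaching MemoryMax gets the process group OOM-killed:
systemd-run --scope -p MemoryMax=32G -- make -j"$(nproc)"
# Soft cap: breaching MemoryHigh only triggers reclaim/throttling,
# but make still spawns just as many jobs:
systemd-run --scope -p MemoryHigh=32G -- make -j"$(nproc)"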

@SuperSandro2000 (Member)

I think the current build system with cores and max-jobs is just not smart enough to let you always utilize most of your hardware. Right now you need to optimize for either cores or max-jobs.

@ck3d (Contributor) commented Nov 18, 2022

We could control make and ninja with a central jobserver, as the following working proof of concept shows:
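
A minimal sketch of the idea (my own illustration, not ck3d's proof of concept; it assumes GNU make >= 4.4, whose named-pipe jobserver can be shared by path between otherwise unrelated builds):

# One machine-wide token pool for all builds:
mkfifo /tmp/global-jobserver
exec 3<>/tmp/global-jobserver
printf '+%.0s' $(seq 2 "$(nproc)") >&3    # nproc - 1 tokens
# Start each build as a jobserver client instead of giving it a private -j:
MAKEFLAGS="-j --jobserver-auth=fifo:/tmp/global-jobserver" make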

@ElvishJerricco (Contributor)

@SuperSandro2000 All I can say is that the workloads I'm talking about with nixpkgs did utilize my hardware very well before this change, and now it is not possible to reach a balance. I do not know what the solution is; I just know it's now a lot worse for me.

@trofi (Contributor) commented Nov 18, 2022

I found the opposite: the previous -l behaviour frequently throttled process spawning too aggressively, because loadavg is a laggy metric for the system's actual load. One second I had 90 processes running, the next I had 8. I have a 16-CPU machine and am using --max-jobs 4 --cores 16 to get reasonable utilisation. Ideally I want 16 processes running in parallel, but I can live with 64 in the worst case.

I find system end-to-end builds faster without -l.

@ElvishJerricco (Contributor) commented Nov 18, 2022

@trofi I've measured it and it is slightly faster without -l... if it doesn't crash. I get a lot of crashes now though, even with 64GB of RAM. And that's not to speak of how unresponsive the much higher load makes my desktop environment.

@trofi (Contributor) commented Nov 18, 2022

Yeah, that makes sense. My RAM/core ratio is 8GB/core (+6GB/core of zram just in case). I would guess 4GB/core should be fine for most packages provided /tmp is not taking RAM.

@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/nix-build-ate-my-ram/35752/8

@emilazy (Member) commented Jul 21, 2024

Linking #328677 for those subscribed to this.
