Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Why is jemalloc linked to rustc by default *only* via CI? #56812

Closed
pnkfelix opened this issue Dec 14, 2018 · 12 comments
Closed

Why is jemalloc linked to rustc by default *only* via CI? #56812

pnkfelix opened this issue Dec 14, 2018 · 12 comments
Labels
A-allocators Area: Custom and system allocators P-medium Medium priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue.

Comments

@pnkfelix
Copy link
Member

pnkfelix commented Dec 14, 2018

Spawned off of #56736 (comment)

Executive summary: I want the out of the box experience for a locally-built rustc (assuming one has optimizations on and debug off) to be as close as possible to what you get from the CI-built distribution of rustc.

With respect to jemalloc, the above goals means we should either: 1. turn rustc.jemalloc on by default in the build config, or 2. turn rustc.jemalloc off in the CI (after perhaps evaluating whether the CI can use a newer glibc, which I hear may have performance more comparable to what jemalloc achieves).

However, it may be that others disagree with my overall goal (about out-of-the-box experience), and that there is never any substitute for finding out what the actual configure arguments are for one's platform. (In which case I think we should revise the rustc-guide to more prominently feature how one inspects the .yml files to find the configure arguments for a platform. Yuck.)

More explanation/discussion follows below.


As of PR #55238 (resolving issue #36963), builds of rustc stopped linking in jemalloc by default; however, if I am correctly reading the documentation and commit messages of that PR, the rustc built via CI opts back into having jemalloc linked to rustc (on Linux or Mac OS X). (and thus the nightly you get via rustup or otherwise downloading CI-built executables will link to jemalloc).

Its a pretty confusing situation, in my opinion, since attempts to locally replicate the behavior described here via a local build of rustc would need to turn that flag back on.

(Also: the CI's opting back into using jemalloc affects not just the nightly builds but also the beta and stable ones......?)

On the aforementioned comment thread, @gnzlbg added:

IIUC the intent was for rustc to always depend on jemalloc by default, since that was the status-quo before that change, but to allow people to build it without jemalloc, e.g., if they want to use it in a system where jemalloc is not available. It might be that this did not fully materialize.


So at first my question was: Why do we not just make rustc.jemalloc true by default?

But then I reviewed PR #55238, and I think I might have found the answer to that question:

  • We definitely do not want to link in jemalloc on platforms other than Linux/OSX by default.
  • I had originally thought that the semantics of rustc.jemalloc was that it had no effect unless you were on Linux/OSX.
  • But after reviewing the PR, I am now thinking that while the correctness of code of code like this is dependent on being on either Linux or Mac OS X, that does not mean such code behaves as a no-op on other platforms.
  • Therefore, the way things currently stand, achieving my goal of parity-with-CI-by-default will not accomplished by simply making rustc.jemalloc true by default. I think we would have to also "fix" things so that that flag also behaves as a no-op on platforms other than Linux or Mac OS X.

Having said all that: I want the out of the box experience for a locally-built rustc (assuming one has optimizations on and debug off) to be as close as possible to what the CI gives you. I assume that others see the value in that objective...?

@pnkfelix
Copy link
Member Author

cc @alexcrichton , who I suspect can provide much enlightenment about the engineering concerns here (in terms of whether rustc.jemalloc in fact is a no-op on platforms other than Linux+Mac, and also on what they see as the best path forward to maximize parity between CI and local builds).

@pnkfelix pnkfelix added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. A-allocators Area: Custom and system allocators I-nominated labels Dec 14, 2018
@pnkfelix
Copy link
Member Author

nominating for discussion at next T-compiler meeting.

@nikic
Copy link
Contributor

nikic commented Dec 14, 2018

My 2c is that the current defaults are reasonable. On CI we want to produce the fastest binaries we can (within our time budget). That means doing stuff like enabling jemalloc, using one codegen unit where possible, building with cross-language lto, etc.

On the other hand, for a local build you can trade of performance of the final artifact for other factors, such as build time, number of required dependencies or the debugging experience. The use of jemalloc in particular precludes the use of valgrind and the associated toolset (such as callgrind, massif or dhat). If those are tools that a rustc developer is likely to want to use, then it's good if they work out of the box (rather than producing a rather impenetrable segfault).

In order to reproduce the exact conditions used by CI, one can run one of the docker images locally, without having to figure out what the exact differences in configuration may be.

@gnzlbg
Copy link
Contributor

gnzlbg commented Dec 14, 2018

One solution could be to make the jemalloc config an "on/off" toggle, and to encode here the default behavior if the toggle is not present in the configuration.

When developing chances are that one is using a custom configuration file anyways, toggling jemalloc off should be easily possible there if your platform happens to enable it by default.


The use of jemalloc in particular precludes the use of valgrind

valgrind is tested in both jemalloc's and jemalloc-sys CI so that should at least work - you might need a recent valgrind version though, but if this doesn't work, please open a bug in jemalloc-sys !

@pnkfelix
Copy link
Member Author

pnkfelix commented Dec 14, 2018

@gnzlbg and @pnkfelix had a brief discussion of this on zulip

I would not be opposed to making rustc.jemalloc effectively a ternary value (which is analogous to what @gnzlbg suggests: Present(true), Present(false), and Absent).

@nikic makes a number of good points in their comment, but in general I still think its a good thing for the default out-of-box experience building rustc to match that of the CI (or at least, any deviation from the CI in the out-of-box experience deserves prominent documentation in the config.toml.example...)

@alexcrichton
Copy link
Member

Oh sorry for the confusion here!

My intention with the jemalloc switcheroo was to turn it off by default in as many locations as possible, as having it on-by-default has just been an unending headache for us. Turning rust.jemalloc on by default as-is would probably break a whole bunch of platforms one way or another, and we'd likely have to encode all sorts of logic to "get it working on all platforms by default".

I think it's a laudable goal to get binaries locally that match CI, but in reality jemalloc I think is the least of our concerns. We've had historical bugs in Rust that stem from glibc differences, binutils differences, etc. Really the only way to match CI is to actually do what CI does, run the docker scripts which pass in all the same configuration and match all tools used during the build.

I would personally prefer to keep jemalloc turned off by default if possible and only enable it on CI myself. Eventually I'd love to remove jemalloc entirely if we can find that glibc is fast enough!

@pnkfelix
Copy link
Member Author

Discussed at T-compiler meeting.

We (or at least I) concluded that it would probably be satisfactory to have more prominent documentation as to the differences between the CI environment vs the out-of-box defaults from config.toml.example.

  • @nikomatsakis did provide an interesting option of perhaps providing multiple config.tomls that correspond to the setup for CI?

In any case, if docs are the answer here, these are what I would like to see:

  1. How to reproducing the process of exactly replicating a CI build (e.g. via docker: but treat the audience as someone who has never used docker, perhaps),
  2. surveying the differences between CI build env and the defaults of config.toml.example, and
  3. documenting the things from 2 in both the rustc guide and the config.toml.example?

@pnkfelix pnkfelix added T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue. and removed I-nominated labels Dec 20, 2018
@pnkfelix
Copy link
Member Author

Now that I have put up the above wishlist for the docs, I'll add that it was pointed out during the T-compiler meeting that perhaps T-infra would be a better team to tackle steps 1 and 2 above. Or at least step 2, in my opinion.

@pnkfelix
Copy link
Member Author

renominating for discussion amongst T-infra members.

@ehuss
Copy link
Contributor

ehuss commented Dec 20, 2018

A while ago I started (but did not finish) documentation for running CI images locally at https://rust-lang.github.io/rustc-guide/tests/intro.html#testing-with-docker-images. I've learned a little more since then, and could fill it out some more, along with some docker basics. But I'm by no means an expert on this, and so any input/help from others would be beneficial.

I think more detail in config.toml.example about the defaults would be good for any setting (like jemalloc) that has non-obvious defaults in standard builds.

There is also a bigger issue of using rustc-the-library, where there are differences between rustc-the-binary. In particular, things like jemalloc (changing in #56986), and various settings (like windows stack sizes — which were curiously not bumped to 32mb like other platforms) are difficult for users (like rls/rustdoc, etc.) to keep in-sync, and is a bit of insider, magical knowledge.

@pietroalbini
Copy link
Member

Discussed this at the infra meeting.

Documenting the changes between CI and config.toml.example would be great, but we don't have a lot of time to do that in the near future though. We'll revisit this in a few months, and it would be awesome if someone wants to write the documentation sooner!

@pietroalbini pietroalbini added the P-medium Medium priority label Mar 12, 2019
@aidanhs
Copy link
Member

aidanhs commented Sep 3, 2019

Discussed in the infra meeting.

Realistically this isn't going to bubble to the top of the priority list and (as pointed out by @Mark-Simulacrum) is likely to drift out of sync with CI anyway - the reference should be the CI scripts.

As such, I'm going to close this issue - sorry!

(we wouldn't object to mentoring someone to make progress here, but I want to be clear that we're probably not going to be pushing it)

@aidanhs aidanhs closed this as completed Sep 3, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-allocators Area: Custom and system allocators P-medium Medium priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

7 participants