Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use a two step Fortran and C++ dependency scanner #12539

Open
wants to merge 13 commits into
base: master
Choose a base branch
from

Conversation

dcbaker
Copy link
Member

@dcbaker dcbaker commented Nov 22, 2023

This series does a bit of cleanup and optimization as well, but the main goal is to have a scanner that outputs P1689r5 compatible JSON scanning information, and then a second step that reads in multiple JSON files and produces a dyndep.

This has a couple of advantages:

  1. the JSON format is what both MSVC and Clang (through clang-scan-deps) produce for C++ modules, which means that we can re-use the same accumulator for our internal scanner and for using MSVC and clang-scan-deps
  2. by splitting the steps up we provide the JSON to dependees, allowing for more accurate dependency information than we can easily generate at configure time, this allows for greater parallelism

This is really about getting to the point of having reliable C++ module support for both MSVC and Clang using their provided tools (GCC has gone down a very strange path which is different than MSVC, Clang, and Gfortran), but as a clean first step

Copy link
Member

@eli-schwartz eli-schwartz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a few "add missing type annotations" commits in a row which I think make sense to be folded into one commit.

Have not reviewed the full patch series yet...

mesonbuild/backend/ninjabackend.py Outdated Show resolved Hide resolved
mesonbuild/backend/ninjabackend.py Outdated Show resolved Hide resolved
@dcbaker dcbaker force-pushed the submit/two-step-dep-scanner branch 2 times, most recently from a468b08 to 501ba9e Compare December 21, 2023 21:15
@jpakkane
Copy link
Member

jpakkane commented Jan 1, 2024

The downside of two phase scanning is that you need to invoke two processes per each source file just to do the scanning. This is ... unfortunate. I'm working on a blog post/project that I hope to cause at least some sort of a tooling change. I'd like to get that out before thinking about merging this. Hopefully it only takes a few days.

@jpakkane
Copy link
Member

jpakkane commented Jan 2, 2024

Here it is.

@dcbaker dcbaker force-pushed the submit/two-step-dep-scanner branch from 501ba9e to 8698451 Compare January 3, 2024 00:35
@dcbaker
Copy link
Member Author

dcbaker commented Jan 3, 2024

I've ready both of your blog posts, both this one and the one one about the dynamic command lines. Clang does have the option you want, it's called -fprebuilt-module-paths. I started working on using that with the clean-scan-deps work I'm doing.

Second, Meson's scanner is actually broken, since it relies on hacks that are emitted at configure time (although it looks like the hack hasn't been implemented for C++ yet, and can be easily demonstrated by hacking the module test to put src9.cxx in a static library, then attempting to build gcc/modtest.p/src8.cxx, with my patches here it correctly attempts to generate src9.cxx first for it's ifc file (although with gcc 12 it runs into the "inputs can't also have inputs" problem.)) This is done by making each target linked with a orderonly dependency. I've removed the fortran hack as well in this series. But thinking about it, I'm not sure that this is even sufficient, since you could add another module to an existing source file and we wouldn't properly wait on that, so it actually needs to a full dependency.

Third, To accurately get this information and have the order be correct you must read the module outputs of all of the dependencies of a target and account for those, so the bmi (I'm going to call it that from now on) is up to date. You also need to recalculate your dynamic outputs in the even that. This is also necessary if a new module name is added. You can do this in one step, but the dyndep doesn't actually have enough information for this, so you'd have have a sideband (ie, generate json and dyndep in one step).

Having dep information on a per tu basis has advantages compared to a per target basis, at the cost of simplicity. It means that we can start more parallel compilation, and do less work for incremental builds.

So, we're basically making tradeoffs:

  • with the current one step approach we must make each target have a full dependency on the previous target (currently its order-only, but that isn't sufficient, and I can write a test to show that). This could create a significant bottleneck
  • With the per tu approach in my series we'd run more processes, but we only scan source files when they've actually changed, only accumulate the json data into a dd when the json data has changed, and get accurate, per tu information which allows for more work to be done in parallel. We also do less work on incremental builds, since we only need to scan and rebuild targets that have actually changed.

@dcbaker dcbaker force-pushed the submit/two-step-dep-scanner branch from 8698451 to e3a107e Compare January 3, 2024 00:41
@dcbaker
Copy link
Member Author

dcbaker commented Jan 3, 2024

I've added the necessary orderdep -> full dep changes, but I haven't gotten the tests yet. They'll require a python unittest because we're going to have to compile -> change the source -> recompile a specific target to prove that the bug is fixed, and I'm out of time for today

@dcbaker dcbaker force-pushed the submit/two-step-dep-scanner branch from e3a107e to 3960b72 Compare January 3, 2024 00:53
@dcbaker
Copy link
Member Author

dcbaker commented Jan 3, 2024

And I just realized that there's another case that isn't handled here that has to be handled, which is recursive dependencies, and after a quick test I found that that too is broken.

@jpakkane
Copy link
Member

jpakkane commented Jan 3, 2024

Clang does have the option you want, it's called -fprebuilt-module-paths. I started working on using that with the clean-scan-deps work I'm doing.

Yes, but does it have a compiler flag for "write module files, whatever their name, in directory X"? All the examples that I see do not have that and use explicit output file it is needed to avoid having to generate command line arguments during compilation. That is the biggest problem FWICT, the others we can fix and/or work around.

Second, Meson's scanner is actually broken,

I would have been amazed if it were not. :D I only implemented enough of it to get things going. Sadly they have been stagnant for a few years.

with the current one step approach we must make each target have a full dependency on the previous target

Assuming target A that depends on B, we only need to add a dependency of type "all compilations of A must wait until all compilations of B are done". You'd need a pseudo target for this but it's doable. The delay would be at most the longest compilation in A on average a fraction of that. Is that too much? Maybe. Maybe not. I'm hesitant to throw around performance estimates without actual measurements.

@dcbaker
Copy link
Member Author

dcbaker commented Jan 3, 2024

Yes, but does it have a compiler flag for "write module files, whatever their name, in directory X"? All the examples that I see do not have that and use explicit output file it is needed to avoid having to generate command line arguments during compilation. That is the biggest problem FWICT, the others we can fix and/or work around.

I just did

echo "export module speech; export const char * say() { return "Hello, World!"l; }" > "speech.cppm"
mkdir priv
clang++ -std=c++20 -fmodule-output -c speech.cppm -o priv
ls priv

and it wrote out a test.o and a speech.pcm. Which is the sanest implementation of the big 3 (with GCC's plan being absolutely bonkers). We could pretty easily have a rule of "if the name of your file == the name of your exported module you're good to go, otherwise you must pass the cpp_module_name parameter and Meson will map that to -fmodule-output=priv/<cpp_module_name>.pcm, which is not perfect, but makes the common case obvious (foo.cpp has the module foo) and isn't too bad in the less common case.

This is much closer to what we actually want than GCC's server plan, and MSVC which requires you to declare whether a source file exports an interface or a partition. AFAICT clang might have the only implementation that we can easily get non-trivial examples working with without either scanning sources at configure time, or dynamically generating command line options at compile time.

@jpakkane
Copy link
Member

jpakkane commented Jan 3, 2024

We could pretty easily have a rule of "if the name of your file == the name of your exported module you're good to go, otherwise you must pass the cpp_module_name parameter and Meson will map that to -fmodule-output=priv/<cpp_module_name>.pcm, which is not perfect, but makes the common case obvious (foo.cpp has the module foo) and isn't too bad in the less common case.

But in that case cpp_module_name would need to be per-file, right? One would imagine it to be pretty common that one target has many source files, each of which provides one module with most of them being internal modules that are not exported (though, obviously, there are going to be libs that export multiple module files).

@dcbaker
Copy link
Member Author

dcbaker commented Jan 3, 2024

Right, that would need to be per file rather than per target. Which, I admit is not ideal, but at least it's closer than what GCC and MSVC have, and the consumer side is what we'd like

@dcbaker dcbaker force-pushed the submit/two-step-dep-scanner branch from 3960b72 to f2f9c08 Compare March 7, 2024 00:27
@dcbaker dcbaker force-pushed the submit/two-step-dep-scanner branch 4 times, most recently from 8a2365f to fed1aa3 Compare March 8, 2024 00:08
@@ -12,7 +12,7 @@
import pickle
import zipfile, tarfile
import sys
from unittest import mock, SkipTest, skipIf, skipUnless
from unittest import mock, SkipTest, skipIf, skipUnless, expectedFailure

Check notice

Code scanning / CodeQL

Unused import Note

Import of 'expectedFailure' is not used.
@dcbaker dcbaker force-pushed the submit/two-step-dep-scanner branch from fed1aa3 to 49ce96e Compare March 8, 2024 17:41
mesonbuild/scripts/depaccumulate.py Outdated Show resolved Hide resolved
mesonbuild/scripts/depaccumulate.py Outdated Show resolved Hide resolved
mesonbuild/scripts/depaccumulate.py Outdated Show resolved Hide resolved
mesonbuild/scripts/depaccumulate.py Outdated Show resolved Hide resolved
@dcbaker dcbaker force-pushed the submit/two-step-dep-scanner branch 5 times, most recently from 4fc923e to c1a7e8b Compare August 26, 2024 20:42
But we really expect it to be a string once the initializer is done.
Therefore, let's set it to `''` just like we do with the scratchdir in
the case that it's set to None in the initializer.
Because we don't set the appropriate dependencies
Because we use this dependency to ensure that any binary module files
are generated, we must have a full dependency. Otherwise, if a new
module is added to a dependent target, and used in our target, we race
the generation of the binary module definition.
And fix some internal annotations
Otherwise only shared libraries get installed, which isn't correct.
Otherwise we again miss out on half of the targets we need
Otherwise they won't be able to find their module outputs.
…ned sources

It is not sufficient to have an order dependency on generated sources.
If any of those sources change we need to rescan, otherwise we may not
notice changes to modules.
This requires that every Fortran target that uses modules have a full
dependency on the scan target of it's dependencies. This means that for
a three step target `A -> B -> C`, we cannot start compiling any of B
until all of A is linked, and cannot start compiling any of C until
all of A and B is linked.

This fixes various kinds of races, but it serializes the build and makes
it slow. This is the best we can do though, since we don't have any sort
of portable format for telling C what is in A and B, so C can't know
what sources to wait on for it's modules to be fulfilled.
This splits the scanner into two discrete steps, one that scans the
source files, and one that that reads in the dependency information and
produces a dyndep.

The scanner uses the JSON format from
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p1689r5.html,
which is the same format the MSVC and Clang use for C++ modules
scanning. This will allow us to more easily move to using MSVC and
clang-scan-deps when possible.

As an added bonus, this correctly tracks dependencies across TU and
Target boundaries, unlike the previous implementation, which assumed
that if it couldn't find a provider that everything was good, but could
run into issues. Because of that limitation Fortran code had to fully
depend on all of it's dependencies, transitive or not. Now, when using
the dep scanner, we can remove that restriction, allowing more
parallelism.
We already test for custom_targets and configure_file, but we should
test a generator too
Since the comment saying we need a generic way to do this is a little
outdated.
@dcbaker dcbaker force-pushed the submit/two-step-dep-scanner branch from c1a7e8b to 6877a72 Compare August 27, 2024 15:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
language:c++ Issues specific to c++ language:fortran
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants