Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pidlock cache file precompilation #49052

Merged

Conversation

IanButterworth
Copy link
Sponsor Member

@IanButterworth IanButterworth commented Mar 19, 2023

This avoids multiple processes entering precompile cache file generation for the same package at the same time, allowing the first to enter and the others will wait to use the cache file that the first generated.

Terminal 1

julia> @time using Revise
[ Info: Precompiling Revise [295af30f-e4ad-537b-8983-00126c2a3abe]
 18.165747 seconds (321.97 k allocations: 19.684 MiB)

Terminal 2 (started just after above)

julia> @time using Revise
[ Info: Waiting for another process to precompile Revise [295af30f-e4ad-537b-8983-00126c2a3abe]
 17.033804 seconds (325.99 k allocations: 19.775 MiB, 0.07% compilation time: 100% of which was recompilation)

julia> 

Perhaps the 2nd process should inform the user that it's waiting for another process to finish. Implemented

Making this specific to the package source was relatively straightforward. Making it specific to the resulting cache file path seems basically impossible because of the way the cache file name is generated.

@IanButterworth IanButterworth force-pushed the ib/precompille_pidlock branch 3 times, most recently from cca8c44 to 26aeddb Compare March 19, 2023 16:20
@IanButterworth IanButterworth marked this pull request as ready for review March 19, 2023 16:20
@IanButterworth IanButterworth force-pushed the ib/precompille_pidlock branch 2 times, most recently from 15bc157 to f3b7f5d Compare March 19, 2023 16:39
@KristofferC
Copy link
Sponsor Member

Perhaps the 2nd process should inform the user that it's waiting for another process to finish.

Was going to say this.

@IanButterworth
Copy link
Sponsor Member Author

Added

julia> using Revise
[ Info: Waiting for another process to precompile Revise [295af30f-e4ad-537b-8983-00126c2a3abe]

julia> 

@@ -38,6 +42,7 @@ Optional keyword arguments:
- `refresh`: Keeps a lock from becoming stale by updating the mtime every interval of time that passes.
By default, this is set to `stale_age/2`, which is the recommended value.
- `wait`: If true, block until we get the lock, if false, raise error if lock fails.
- `wait_msg`: Provide a string for it to be `@info` logged when waiting for the lock
Copy link
Sponsor Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this should accept a function to print however wanted, as showing a @info log might be too opinionated?

@IanButterworth IanButterworth added the compiler:precompilation Precompilation of modules label Mar 20, 2023
@IanButterworth
Copy link
Sponsor Member Author

IanButterworth commented Mar 20, 2023

I was hoping this would mean one could

julia> @async Pkg.precompile(io=devnull)

julia> using Foo

where the environment contains packages bigger than Foo, for instance, but because Pkg.precompile precompiles depth first the using Foo could be first to the pidlock. Achieving that would require Pkg.precompile to take the pidlock early

@fredrikekre
Copy link
Member

Would it not make sense to have a trymkpidlock instead, and handle the calling of f_waited on the outside?

@IanButterworth
Copy link
Sponsor Member Author

I think that would require knowing which file you need to wait for. The path of the cachefile isn't determinable before spawning off the child process, AFAICT

@IanButterworth
Copy link
Sponsor Member Author

IanButterworth commented Mar 20, 2023

Oh, I see what you mean

function cachefile_lock(f_first::Function, f_waited::Function, pkg::Base.PkgId, path::String)
    pidfile = string(path, ".pidlock")
    if trymkpidlock(f_first, pidfile) == false
        mkpidlock(f_waited, pidfile; wait_msg = "Waiting for another process to precompile $pkg")
    end
end

@IanButterworth
Copy link
Sponsor Member Author

No urgency but I'm going to wait for Jameson's review/opinion before working more on this

@vtjnash
Copy link
Sponsor Member

vtjnash commented Mar 23, 2023

This seems like a great direction! Should we schedule a call to review details?

@IanButterworth
Copy link
Sponsor Member Author

IanButterworth commented May 20, 2023

@vtjnash I've tried to rewrite this to follow what we discussed

base/loading.jl Outdated Show resolved Hide resolved
base/loading.jl Outdated Show resolved Hide resolved
base/loading.jl Outdated Show resolved Hide resolved
@IanButterworth IanButterworth force-pushed the ib/precompille_pidlock branch 2 times, most recently from 8425af6 to 07ca9dc Compare June 2, 2023 02:37
base/loading.jl Outdated Show resolved Hide resolved
@IanButterworth
Copy link
Sponsor Member Author

@vtjnash this seems to be in a working state now.

For aligning Pkg.precompile I'm thinking it would use Base.maybe_cachefile_lock and two options come to mind:

  1. Allow Pkg.precompile to enter the parallel phase and lock at the dependency level. Maybe even add some text beside the package with the spinner indicating it's being precompiled elsewhere
  2. Prevent entering the parallel phase until all dependencies are not being precompiled elsewhere (needs some thought to not be racy)

@IanButterworth
Copy link
Sponsor Member Author

Bump @vtjnash

Copy link
Sponsor Member

@vtjnash vtjnash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love the trymkpidlock API of return Union{Bool,Any} or the naming of PidlockedError, but overall seems fine

@@ -1884,8 +1884,17 @@ function _require(pkg::PkgId, env=nothing)
@goto load_from_cache
end
# spawn off a new incremental pre-compile task for recursive `require` calls
cachefile = compilecache(pkg, path)
if isa(cachefile, Exception)
cachefile_or_module = maybe_cachefile_lock(pkg, path) do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So in JuliaLang/Pkg.jl#3458 we found it to be possible for a fork-bomb to occur.
That makes me wonder how this will handle accidental recursion. Will it now just hang since the parent obtains the pid-lock, then shells out, and the child hangs?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could maybe even try to detect this specific deadlock scenario by calling getppid https://linux.die.net/man/3/getppid

The other question is how robust is this against cancellation, what happens on a killall -9 julia

@IanButterworth
Copy link
Sponsor Member Author

I'll add to NEWS later as this just manages non-interactive code load precompilation currently, needs a Pkg.precompile update

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:precompilation Precompilation of modules
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants