Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose constrained memory limits and have the GC use them #46796

Merged
merged 3 commits into from
Sep 20, 2022
Merged

Conversation

maleadt
Copy link
Member

@maleadt maleadt commented Sep 16, 2022

This includes libuv/libuv#3744, which should make it possible to fix #24617. I still need to verify this, so marking this PR draft until I do.

According to @vtjnash, it should also be safe to backport this to JuliaLang/libuv#24 (comment).

@maleadt maleadt added domain:io Involving the I/O subsystem: libuv, read, write, etc. backport 1.8 Change should be backported to release-1.8 labels Sep 16, 2022
@maleadt
Copy link
Member Author

maleadt commented Sep 16, 2022

This works as expected:

❯ ./julia -e '@show Base.format_bytes(ccall(:uv_get_total_memory, UInt64, ())); @show Base.format_bytes(ccall(:uv_get_constrained_memory, UInt64, ()))'
Base.format_bytes(ccall(:uv_get_total_memory, UInt64, ())) = "62.222 GiB"
Base.format_bytes(ccall(:uv_get_constrained_memory, UInt64, ())) = "16384.000 PiB"

❯ systemd-run --user --scope -p MemoryLimit=3G ./julia -e '@show Base.format_bytes(ccall(:uv_get_total_memory, UInt64, ())); @show Base.format_bytes(ccall(:uv_get_constrained_memory, UInt64, ()))'
Running scope as unit: run-r06bcd75bb4bd4648be798311727f5fb5.scope
Base.format_bytes(ccall(:uv_get_total_memory, UInt64, ())) = "62.222 GiB"
Base.format_bytes(ccall(:uv_get_constrained_memory, UInt64, ())) = "3.000 GiB"

The GC should automatically use this, but in addition I guess we want to expose this to the Julia user. Anybody any thoughts on an interface? The easiest solution would be for Sys.total_memory to keep on returning the total memory amount and add a Sys.constrained_memory to return the constrained version. Although it might be simpler/user-friendlier to just change Sys.total_memory to return the effective amount (i.e. the constrained one if there's a limit).

@gbaraldi
Copy link
Member

I would be in favor of making total_memory reflect the constrained memory. Not sure how useful it is to know how much memory the whole machine has if one can't access it.

@KristofferC
Copy link
Sponsor Member

Not sure how useful it is to know how much memory the whole machine has if one can't access it.

And that could always be provided with a Sys.unconstrained_memory

@maleadt
Copy link
Member Author

maleadt commented Sep 16, 2022

Yeah I agree. I pushed a change to Sys.total_memory(), and added Sys.physical_memory() (bikeshed away).

We should probably make free_memory cgroup aware too, though, since it currently doesn't reflect that:

❯ ./julia -e '@show Base.format_bytes(ccall(:uv_get_total_memory, UInt64, ())); @show Base.format_bytes(ccall(:uv_get_constrained_memory, UInt64, ())); @show Base.format_bytes(ccall(:uv_get_free_memory, UInt64, ()))'
Base.format_bytes(ccall(:uv_get_total_memory, UInt64, ())) = "62.222 GiB"
Base.format_bytes(ccall(:uv_get_constrained_memory, UInt64, ())) = "16384.000 PiB"
Base.format_bytes(ccall(:uv_get_free_memory, UInt64, ())) = "56.181 GiB"

❯ systemd-run --user --scope -p MemoryLimit=3G ./julia -e '@show Base.format_bytes(ccall(:uv_get_total_memory, UInt64, ())); @show Base.format_bytes(ccall(:uv_get_constrained_memory, UInt64, ())); @show Base.format_bytes(ccall(:uv_get_free_memory, UInt64, ()))'
Running scope as unit: run-r369d2f0373524fef8f0cab15ec87f903.scope
Base.format_bytes(ccall(:uv_get_total_memory, UInt64, ())) = "62.222 GiB"
Base.format_bytes(ccall(:uv_get_constrained_memory, UInt64, ())) = "3.000 GiB"
Base.format_bytes(ccall(:uv_get_free_memory, UInt64, ())) = "56.123 GiB"

The GC also uses uv_get_free_memory, so this means there is more work to do.

@gbaraldi
Copy link
Member

It would require adding something like uv_get_free_constrained_memory or can we just do or own calculations?

@vtjnash
Copy link
Sponsor Member

vtjnash commented Sep 16, 2022

This may conflict with #46663, which includes #46086

@KristofferC
Copy link
Sponsor Member

Well I just added that now to make this backport more similar to the one on master. But in what way would it conflict?

@vtjnash
Copy link
Sponsor Member

vtjnash commented Sep 16, 2022

Sorry, I read it backwards. Right, so this will conflict less with the backports, once that is merged.

@maleadt
Copy link
Member Author

maleadt commented Sep 16, 2022

It would require adding something like uv_get_free_constrained_memory or can we just do or own calculations?

We need to query cgroup's memory.current, so ideally we add that to libuv. I've pinged the maintainer about that.

@ViralBShah
Copy link
Member

It would be great if 1.8.2 could include this.

@KristofferC
Copy link
Sponsor Member

It would be great if 1.8.2 could include this.

We want to release 1.8.2 quite soon so the question here is what the time frame in #46796 (comment):

so this means there is more work to do.

Upstream libuv is moving quite slow and I don't want 1.8.2 to have to wait a significant time for them to review something. So the options I see are:

  • Merge this as is and get it into 1.8.2 (even though there is still a small part left).
  • Get lucky and get a fast review of the remaining task so we can quickly put it in 1.8.2.
  • Leave this out of 1.8.2.

@KristofferC
Copy link
Sponsor Member

After speaking to @maleadt, it seems reasonable to put this behind a feature flag in 1.8.2 since changing the default GC behavior in a patch release is probably a bit too intrusive.

@ViralBShah
Copy link
Member

ViralBShah commented Sep 19, 2022

I believe the memory utilization is way higher in many workloads compared to 1.7. This should be the default, I feel.

Is it because of the small part left? And would be default on 1.8.3?

@maleadt maleadt marked this pull request as ready for review September 19, 2022 20:38
@maleadt maleadt changed the title Bump libuv to expose constrained memory limits Expose constrained memory limits and have the GC use them Sep 19, 2022
@maleadt
Copy link
Member Author

maleadt commented Sep 19, 2022

This PR should be complete. I'll do some more testing tomorrow and create a backport version for 1.8.

The only user-visible change is the addition of Sys.free_physical_memory and Sys.total_physical_memory as counterparts to Sys.free_memory and Sys.total_memory which now respect and memory constraints. As we're kind-of rushing this change (I'd like to merge this tomorrow), now is the change to bikeshed these names!

@maleadt maleadt merged commit 89c4a2a into master Sep 20, 2022
@maleadt maleadt deleted the tb/libuv branch September 20, 2022 07:05
maleadt added a commit that referenced this pull request Sep 20, 2022
@simonbyrne
Copy link
Contributor

I'm trying this on the current nightly (f927d25) on our Slurm cluster which enforces memory limits via cgroups, and it doesn't seem to match with what I see:

julia> Sys.total_physical_memory() / 2^20 # physical mem (MB)
257693.1796875

julia> Sys.total_memory() / 2^20  # should be memory limit (MB)
257693.1796875

julia> mem_cgroup = match(r".*:memory:(.*)", read("/proc/$(getpid())/cgroup", String)).captures[1] # current memory cgroup
"/slurm/uid_5184/job_28896711/step_0"

julia> parse(Int, readchomp("/sys/fs/cgroup/memory/$mem_cgroup/memory.limit_in_bytes")) / 2^20 # cgroup memory limit (MB)
2048.0

@simonbyrne
Copy link
Contributor

For reference:

julia> ccall(:uv_get_constrained_memory, UInt64, ())
0x0000000000000000

@simonbyrne
Copy link
Contributor

Ah, okay, it looks like we're not using cgroups2. 😦

@gbaraldi
Copy link
Member

gbaraldi commented Oct 7, 2022

Just as a test. Could you use Hwloc.jl to see what it prints out?

@simonbyrne
Copy link
Contributor

what command in particular?

julia> Hwloc.topology()
Machine (251.65 GB)
    Package L#0 P#0 (125.66 GB)
        NUMANode (125.66 GB)
        L3 (35.0 MB) + L2 (256.0 kB) + L1 (32.0 kB) + Core L#0 P#0
            PU L#0 P#0
    Package L#1 P#1 (126.0 GB)
        NUMANode (126.0 GB)

@gbaraldi
Copy link
Member

gbaraldi commented Oct 7, 2022

That was the one I was thinking of. I'm not sure they have others. But it seems it doesn't see the limit as well.

@simonbyrne
Copy link
Contributor

I have a proposed fix at JuliaLang/libuv#27, I'm not sure if it's correct though

@KristofferC KristofferC removed the backport 1.8 Change should be backported to release-1.8 label Oct 27, 2022
@maleadt maleadt mentioned this pull request Nov 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain:io Involving the I/O subsystem: libuv, read, write, etc.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

cgroup aware Base.Sys.cpu_info and Base.Sys.total_memory
6 participants