
Remove hard_limit_for_bookkeeping #77480

Merged · 5 commits · Dec 12, 2022

Conversation

PeterSolMS (Contributor)

As it turns out, it's not so easy to set aside a hard limit for the GC's bookkeeping, because some regions will be partially committed and so will need proportionally more bookkeeping information.

So this removes the hard_limit_for_bookkeeping and associated fields, and instead commits memory for the mark array eagerly, even if BGC is not in progress.
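For illustration, here is a minimal, self-contained sketch of the "commit the mark array eagerly" idea; the real change lives in gc.cpp and uses its existing helpers, so the stand-in types and the commit_mark_array_for_region helper below are hypothetical.

```cpp
// Sketch only: stand-in types; the actual implementation is in gc.cpp.
#include <cstdint>

struct heap_segment
{
    uint32_t flags = 0;
};

const uint32_t heap_segment_flags_ma_committed = 0x1;

// Hypothetical helper: commits the slice of the mark array covering this
// region; returns false if the commit fails (e.g. the hard limit is hit).
bool commit_mark_array_for_region (heap_segment* region);

bool on_new_region (heap_segment* region)
{
    // Previously the mark array slice was only committed when background GC
    // needed it; with this change it is committed as soon as the region is
    // created, even if no BGC is in progress.
    if ((region->flags & heap_segment_flags_ma_committed) == 0)
    {
        if (!commit_mark_array_for_region (region))
            return false;   // surfaces as out-of-memory to the caller
        region->flags |= heap_segment_flags_ma_committed;
    }
    return true;
}
```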

@ghost commented Oct 26, 2022

Tagging subscribers to this area: @dotnet/gc
See info in area-owners.md if you want to be subscribed.

Issue Details

As it turns out, it's not so easy to set aside a hard limit for the GC's bookkeeping, because some regions will be partially committed and so will need proportionally more bookkeeping information.

So this removes the hard_limit_for_bookkeeping and associated fields, and instead commits memory for the mark array eagerly, even if BGC is not in progress.

Author: PeterSolMS
Assignees: PeterSolMS
Labels: area-GC-coreclr

Milestone: -

@cshung (Member) left a comment


I liked this idea more than trying to set aside memory for bookkeeping. The one thing I am not sure about is that now we are committing the mark array for every region. If background GC is happening anyway, it might be fine because we will be committing them anyway, but what if we don't?

src/coreclr/gc/gcpriv.h (outdated review thread, resolved)
```cpp
dprintf (GC_TABLE_LOG, ("new seg %Ix, mark_array is %Ix",
                        heap_segment_mem (region), mark_array));
if (((region->flags & heap_segment_flags_ma_committed) == 0) &&
    !commit_mark_array_new_seg (__this, region))
```
Member

Shall we guard this under gc_can_use_concurrent?
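A sketch of what that guard might look like, applied to the snippet above (whether it is actually needed is what the next reply addresses, and the surrounding code in gc.cpp may differ):

```cpp
// Fragment only: same variables as the snippet above, with the suggested
// gc_can_use_concurrent guard added in front.
if (gc_can_use_concurrent &&
    ((region->flags & heap_segment_flags_ma_committed) == 0) &&
    !commit_mark_array_new_seg (__this, region))
{
    // commit of the mark array failed; handled as before
}
```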

Member

commit_mark_array_new_seg shouldn't do anything if concurrent is disabled. However, looking at the code, we are not initializing background_saved_highest_address in init_gc_heap, which we should. @PeterSolMS could you please add this (i.e., init background_saved_highest_address and background_saved_lowest_address to 0 in init_gc_heap like we do with WKS GC)? We could still guard this with gc_can_use_concurrent just to indicate our intention, but it's not essential.
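For reference, the requested initialization would be something along these lines inside init_gc_heap (fragment only; the exact placement and surrounding code differ):

```cpp
// Mirror what WKS GC does: start with no saved background GC range.
background_saved_lowest_address  = 0;
background_saved_highest_address = 0;
```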

PeterSolMS (Contributor, Author)

I just pushed changes with the initialization of background_saved_lowest_address and background_saved_highest_address, and the restoration of current_total_committed_bookkeeping.

@Maoni0 (Member) commented Oct 26, 2022

> If background GC is happening anyway, it might be fine because we will be committing them anyway, but what if we don't?

That is the drawback with doing this. But if concurrent GC is enabled, it's very likely we'll need to do a BGC at some point, so it's not too bad.

…ments back.

Initialize background_saved_lowest_address and background_saved_highest_address.

Removed unrelated bug fix which is addressed by a separate PR.
@PeterSolMS (Contributor, Author)

I think I found a hole in our logic - clear_region_info (called by return_free_region) will decommit the mark array under high memory load. When get_free_region tries to commit it again, we'll fail and get the same AV as before. Perhaps this means we'll have to keep the mark array committed, i.e. disable the logic that decommits the mark array under high memory load?

…info.

The issue is that the GC needs at least one free region, and it needs the mark array committed for that region. So when it tries to get a region at the end and we cannot commit the mark array, we cannot actually get a free region. Currently we AV in this case, but there really isn't a good option to recover at this point. So it's better to keep the mark array committed and perhaps fail with an OOM exception.
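A minimal sketch contrasting the two behaviors described here; all names below are illustrative stand-ins, not the actual gc.cpp code.

```cpp
// Sketch: returning a free region with and without decommitting its
// slice of the mark array.
struct region_info
{
    bool mark_array_committed = true;
};

// Old behavior (simplified): under high memory load, clear_region_info
// also decommitted the mark array; a later get_free_region then had to
// re-commit it, and if that commit failed we hit an access violation.
void return_free_region_old (region_info& r, bool high_memory_load)
{
    if (high_memory_load)
        r.mark_array_committed = false;
}

// New behavior (simplified): leave the mark array committed so the region
// stays immediately usable; memory exhaustion then surfaces as an OOM
// exception instead of an AV.
void return_free_region_new (region_info& /* r */)
{
    // mark array commit state is intentionally left untouched
}
```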
@PeterSolMS (Contributor, Author)

With this change, we get much closer to actually exhausting the available memory in hard limit cases than before.

In a test case with the hard limit set to 471,859,200 bytes, we only get to 0x13c1d000 = 331,468,800 committed bytes before running out of memory. The reason is that the memory set aside for bookkeeping is not sufficient, so we run into a situation where the mark array for a region cannot be committed.

With this change, we get to 471,842,816 committed bytes before running out. "!dumpheap -stat" reports 462,803,767 bytes in objects, of which 10,684,992 bytes are in free objects, so we have a total of 452,118,775 bytes in useful objects. 6,078,464 bytes are used for bookkeeping purposes (mainly for the mark array).
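(From the figures above: before the change we committed 331,468,800 / 471,859,200 ≈ 70% of the hard limit before failing; after the change we committed 471,842,816 / 471,859,200 ≈ 99.997% of it.)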

@frankbuckley (Contributor)

> With this change, we get much closer to actually exhausting the available memory in hard limit cases than before.
>
> In a test case with the hard limit set to 471,859,200 bytes, we only get to 0x13c1d000 = 331,468,800 committed bytes before running out of memory...
>
> With this change, we get to 471,842,816 committed bytes before running out...

Will this be backported to 7.0?

@ghost locked as resolved and limited conversation to collaborators on Jan 12, 2023
@mangod9 (Member) commented Apr 24, 2023

/backport to release/7.0-staging

@github-actions bot unlocked this conversation on Apr 24, 2023
@github-actions (Contributor)

Started backporting to release/7.0-staging: https://github.com/dotnet/runtime/actions/runs/4789151761

@github-actions (Contributor)

@mangod9 backporting to release/7.0-staging failed, the patch most likely resulted in conflicts:

$ git am --3way --ignore-whitespace --keep-non-patch changes.patch

Applying: As it turns out, it's not so easy to set aside a hard limit for the GC's bookkeeping, because some regions will be partially comitted and so will need more bookkeeping information as a percentage.
Using index info to reconstruct a base tree...
M	src/coreclr/gc/gc.cpp
M	src/coreclr/gc/gcpriv.h
Falling back to patching base and 3-way merge...
Auto-merging src/coreclr/gc/gcpriv.h
Auto-merging src/coreclr/gc/gc.cpp
CONFLICT (content): Merge conflict in src/coreclr/gc/gc.cpp
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 As it turns out, it's not so easy to set aside a hard limit for the GC's bookkeeping, because some regions will be partially comitted and so will need more bookkeeping information as a percentage.
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".
Error: The process '/usr/bin/git' failed with exit code 128

Please backport manually!

@github-actions (Contributor)

@mangod9 an error occurred while backporting to release/7.0-staging, please check the run log for details!

Error: git am failed, most likely due to a merge conflict.

@github-actions bot locked as resolved and limited conversation to collaborators on Apr 24, 2023

Successfully merging this pull request may close these issues.

Fail to get_free_region in thread_final_regions due to low memory for mark array might lead to AV.