Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MC] Allocate MCFragment with a bump allocator #96402

Merged
merged 4 commits into from
Jun 22, 2024

Conversation

MaskRay
Copy link
Member

@MaskRay MaskRay commented Jun 22, 2024

#95197 and 7500646 eliminated all raw
new MCXXXFragment. We can now place fragments in a bump allocator.
In addition, remove the dead Kind == FragmentType(~0) condition.

~CodeViewContext may call StrTabFragment->destroy() and need to be
reset before FragmentAllocator.Reset().
Tested by llvm/test/MC/COFF/cv-compiler-info.ll using asan.

Created using spr 1.3.5-bogner
@MaskRay MaskRay requested a review from aengelke June 22, 2024 17:50
@llvmbot llvmbot added the mc Machine (object) code label Jun 22, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Jun 22, 2024

@llvm/pr-subscribers-mc

Author: Fangrui Song (MaskRay)

Changes

#95197 and 7500646 eliminated all raw
new MCXXXFragment. We can now place fragments in a bump allocator.


Full diff: https://github.com/llvm/llvm-project/pull/96402.diff

3 Files Affected:

  • (modified) llvm/include/llvm/MC/MCContext.h (+5-1)
  • (modified) llvm/lib/MC/MCCodeView.cpp (+2-2)
  • (modified) llvm/lib/MC/MCFragment.cpp (+17-19)
diff --git a/llvm/include/llvm/MC/MCContext.h b/llvm/include/llvm/MC/MCContext.h
index 6c977a5bc0f1f..0f18ce5df1701 100644
--- a/llvm/include/llvm/MC/MCContext.h
+++ b/llvm/include/llvm/MC/MCContext.h
@@ -136,6 +136,9 @@ class MCContext {
   /// objects.
   BumpPtrAllocator Allocator;
 
+  /// For MCFragment instances.
+  BumpPtrAllocator FragmentAllocator;
+
   SpecificBumpPtrAllocator<MCSectionCOFF> COFFAllocator;
   SpecificBumpPtrAllocator<MCSectionDXContainer> DXCAllocator;
   SpecificBumpPtrAllocator<MCSectionELF> ELFAllocator;
@@ -432,7 +435,8 @@ class MCContext {
   MCInst *createMCInst();
 
   template <typename F, typename... Args> F *allocFragment(Args &&...args) {
-    return new F(std::forward<Args>(args)...);
+    return new (FragmentAllocator.Allocate(sizeof(F), alignof(F)))
+        F(std::forward<Args>(args)...);
   }
 
   /// \name Symbol Management
diff --git a/llvm/lib/MC/MCCodeView.cpp b/llvm/lib/MC/MCCodeView.cpp
index 713ae07c3a730..89b28b4da575f 100644
--- a/llvm/lib/MC/MCCodeView.cpp
+++ b/llvm/lib/MC/MCCodeView.cpp
@@ -29,8 +29,8 @@ using namespace llvm::codeview;
 CodeViewContext::~CodeViewContext() {
   // If someone inserted strings into the string table but never actually
   // emitted them somewhere, clean up the fragment.
-  if (!InsertedStrTabFragment)
-    delete StrTabFragment;
+  if (!InsertedStrTabFragment && StrTabFragment)
+    StrTabFragment->destroy();
 }
 
 /// This is a valid number for use with .cv_loc if we've already seen a .cv_file
diff --git a/llvm/lib/MC/MCFragment.cpp b/llvm/lib/MC/MCFragment.cpp
index d90fd34815bda..5347d99e2a4f3 100644
--- a/llvm/lib/MC/MCFragment.cpp
+++ b/llvm/lib/MC/MCFragment.cpp
@@ -203,59 +203,57 @@ MCFragment::MCFragment(FragmentType Kind, bool HasInstructions)
 
 void MCFragment::destroy() {
   // First check if we are the sentinel.
-  if (Kind == FragmentType(~0)) {
-    delete this;
+  if (Kind == FragmentType(~0))
     return;
-  }
 
   switch (Kind) {
     case FT_Align:
-      delete cast<MCAlignFragment>(this);
+      cast<MCAlignFragment>(this)->~MCAlignFragment();
       return;
     case FT_Data:
-      delete cast<MCDataFragment>(this);
+      cast<MCDataFragment>(this)->~MCDataFragment();
       return;
     case FT_CompactEncodedInst:
-      delete cast<MCCompactEncodedInstFragment>(this);
+      cast<MCCompactEncodedInstFragment>(this)->~MCCompactEncodedInstFragment();
       return;
     case FT_Fill:
-      delete cast<MCFillFragment>(this);
+      cast<MCFillFragment>(this)->~MCFillFragment();
       return;
     case FT_Nops:
-      delete cast<MCNopsFragment>(this);
+      cast<MCNopsFragment>(this)->~MCNopsFragment();
       return;
     case FT_Relaxable:
-      delete cast<MCRelaxableFragment>(this);
+      cast<MCRelaxableFragment>(this)->~MCRelaxableFragment();
       return;
     case FT_Org:
-      delete cast<MCOrgFragment>(this);
+      cast<MCOrgFragment>(this)->~MCOrgFragment();
       return;
     case FT_Dwarf:
-      delete cast<MCDwarfLineAddrFragment>(this);
+      cast<MCDwarfLineAddrFragment>(this)->~MCDwarfLineAddrFragment();
       return;
     case FT_DwarfFrame:
-      delete cast<MCDwarfCallFrameFragment>(this);
+      cast<MCDwarfCallFrameFragment>(this)->~MCDwarfCallFrameFragment();
       return;
     case FT_LEB:
-      delete cast<MCLEBFragment>(this);
+      cast<MCLEBFragment>(this)->~MCLEBFragment();
       return;
     case FT_BoundaryAlign:
-      delete cast<MCBoundaryAlignFragment>(this);
+      cast<MCBoundaryAlignFragment>(this)->~MCBoundaryAlignFragment();
       return;
     case FT_SymbolId:
-      delete cast<MCSymbolIdFragment>(this);
+      cast<MCSymbolIdFragment>(this)->~MCSymbolIdFragment();
       return;
     case FT_CVInlineLines:
-      delete cast<MCCVInlineLineTableFragment>(this);
+      cast<MCCVInlineLineTableFragment>(this)->~MCCVInlineLineTableFragment();
       return;
     case FT_CVDefRange:
-      delete cast<MCCVDefRangeFragment>(this);
+      cast<MCCVDefRangeFragment>(this)->~MCCVDefRangeFragment();
       return;
     case FT_PseudoProbe:
-      delete cast<MCPseudoProbeAddrFragment>(this);
+      cast<MCPseudoProbeAddrFragment>(this)->~MCPseudoProbeAddrFragment();
       return;
     case FT_Dummy:
-      delete cast<MCDummyFragment>(this);
+      cast<MCDummyFragment>(this)->~MCDummyFragment();
       return;
   }
 }

@MaskRay MaskRay changed the title [MC] MCFragment in bump allocator [MC] Allocate MCFragment with a bump allocator Jun 22, 2024
Copy link
Contributor

@aengelke aengelke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, with two nits/questions?

Created using spr 1.3.5-bogner
Created using spr 1.3.5-bogner
Created using spr 1.3.5-bogner
@MaskRay MaskRay merged commit 8cb6e58 into main Jun 22, 2024
3 of 4 checks passed
@MaskRay MaskRay deleted the users/MaskRay/spr/mc-mcfragment-in-bump-allocator branch June 22, 2024 19:54
@nikic
Copy link
Contributor

nikic commented Jun 22, 2024

It looks like this change has massively increased memory usage, see https://llvm-compile-time-tracker.com/compare.php?from=fc23564c44f3eff1847462253d43c08b85489148&to=8cb6e587fd40b97983e445ee3f17f4c6d7190df7&stat=max-rss&linkStats=on. ReleaseLTO-g linking max-rss goes up by up to 7%.

@MaskRay
Copy link
Member Author

MaskRay commented Jun 22, 2024

Interesting, so there is tiny instruction:u decrease with some max-rss increase, significant for stage1-ReleaseLTO-g (link only).

This increase likely stems from LTO optimizations allocating objects on the heap that overlap significantly with newly created MCDataFragment instances.
The bump allocator eliminates this overlap and increases max-rss.
This can be mitigated if we can identify the heap objects and switch the optimization pass(es) to bump allocators.

This bump allocator change opens doors for further performance (e.g. more elegant -disable-free for fragments) and max-rss improvement, so I want to keep it despite the ReleaseLTO max-rss regression...
We can view this as using the previous saved credit from max-rss optimizations to unlock future gains...

@nikic
Copy link
Contributor

nikic commented Jun 23, 2024

This increase likely stems from LTO optimizations allocating objects on the heap that overlap significantly with newly created MCDataFragment instances. The bump allocator eliminates this overlap and increases max-rss. This can be mitigated if we can identify the heap objects and switch the optimization pass(es) to bump allocators.

So you're saying that there is no increase in memory usage per se, and the max-rss increase is because the allocator cannot reuse already mapped memory (e.g. because it is reserved for smaller type classes, or whatever)? Have you verified that this is the case with massif?

Note that the max-rss increase is not limited to LTO only, it is also seen at -O0, to a lesser degree. Backend memory usage increases are just always easiest to see in the LTO configuration, as backend tends to dominate max-rss there.

This bump allocator change opens doors for further performance (e.g. more elegant -disable-free for fragments) and max-rss improvement, so I want to keep it despite the ReleaseLTO max-rss regression...

I think this regression may be too large to keep it for speculative future gains. LTO memory usage is a pretty big problem, and the increase here is not small.

Regarding -disable-free, that would not work for frontends that codegen on multiple threads, right? Including rustc and lld with ThinLTO. If so, I'm not sure this is a useful optimization.

@aengelke
Copy link
Contributor

I'm admittedly quite surprised about the max-rss increase -- that shouldn't happen (at least not to such an extent), because fragments are still freed when sections are freed, which is done by MCContext::reset(), which will free the allocated memory for fragments. Sections are freed before fragments (at least, if DoAutoReset is set, which it looks like it is? If it's not, that might be problematic...), so fragment destructors should get called and free associated contents/fixups. I guess we are overlooking something. Will see if I find time to look at this tomorrow.

@aengelke
Copy link
Contributor

I had a brief look at this now. In some cases, I could reproduce a slight max-rss increase in some cases, but I couldn't track down an actual increase in heap memory usage (massif) -- in fact, it shows a slight reduction. My current hypothesis would be memory fragmentation in the glibc heap (small holes are no longer re-used for fragments). But I wasn't successful in tracking down the cause. I won't have time for further investigation over the next days.

AlexisPerry pushed a commit to llvm-project-tlp/llvm-project that referenced this pull request Jul 9, 2024
llvm#95197 and 7500646 eliminated all raw
`new MCXXXFragment`. We can now place fragments in a bump allocator.
In addition, remove the dead `Kind == FragmentType(~0)` condition.

~CodeViewContext may call `StrTabFragment->destroy()` and need to be
reset before `FragmentAllocator.Reset()`.
Tested by llvm/test/MC/COFF/cv-compiler-info.ll using asan.

Pull Request: llvm#96402
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
mc Machine (object) code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants