-
Notifications
You must be signed in to change notification settings - Fork 97
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
#254, fix space leak on collisions #258
#254, fix space leak on collisions #258
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many, many thanks for the fix and the amazing regression tests! :)
I'm quite out of my depth here, so I have a lot of questions. Please don't put too much effort into addressing them, though. @treeowl's review will be much more important.
-> A.Array (Leaf k v) | ||
updateOrSnocWith f = updateOrSnocWithKey (const f) | ||
{-# INLINABLE updateOrSnocWith #-} | ||
|
||
updateOrSnocWithKey :: Eq k => (k -> v -> v -> v) -> k -> v -> A.Array (Leaf k v) | ||
updateOrSnocWithKey :: Eq k => (k -> v -> v -> (# v #)) -> k -> v -> A.Array (Leaf k v) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a reference to David's explanation of the unboxed-tuple trick here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use the unboxed tuple trick in a bunch of places in HashMap, so it seems "expected knowledge" for working with this package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we should raise a general doc issue for this trick then. It's uncommon knowledge I think 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a blog post explaining the trick would be great. I wouldn't document it in the context of unordered-containers, but try and make it a general technique, and just link to it from this project.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, a blog post would be great. I have opened #262 as a reminder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would call this updateOrSnocWithKey#
, and define updateOrSnocWithKey
to use it in the boring way. As it is, this function can only be used for the lazy operations. Should we extend it to let it work for strict ones too? The most semantically pleasing version would look like this:
updateOrSnocWith#
:: Eq k
=> ((# (# #) | v #) -> (# v #))
-> k
-> A.Array (Leaf k v)
-> A.Array (Leaf k v)
-> A.Array (Leaf k v)
This takes a function that we pass either (# (# #) | #)
(meaning the key was not present) or (# | v #)
(the value tied to the key in the map) and produces an unboxed unary tuple for the new value.
Unfortunately, we support GHC versions that don't have unboxed sums. We can work around that in one of two ways:
updateOrSnocWith#
:: Eq k
=> (Maybe v -> (# v #))
-> k
-> A.Array (Leaf k v)
-> A.Array (Leaf k v)
-> A.Array (Leaf k v)
updateOrSnocWith#
:: Eq k
=> (v -> (# v #))
-> k
-> ((# #) -> (# v #))
-> A.Array (Leaf k v)
-> A.Array (Leaf k v)
-> A.Array (Leaf k v)
Another issue is evaluating the whole v -> v
vs. v -> v -> v
sort of approach, which has to do with what closures we might have to create in different contexts. This will come down to poring over Core and doing some benchmarking. This is actually true whether we try to unify strict and lazy or not. Personally, I'd love to be able to unify as much as we can, if we can get enough of the right things to inline the way we want to avoid hurting performance.
Final note: if we're careful enough about inlining, we might well be able to just use boxed everything and still not pay for it. That means making sure that the underlying implementation inlines into the Strict
or Lazy
version as soon as the wrapped-up function is passed in, while that Strict
or Lazy
version itself remains (at least) INLINABLE
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it useful to add both updateOrSnocWith#
and updateOrSnocWith
? For user code, where we expose it, I agree that having a "friendly" wrapper is valuable. But given we have updateOrSnocWith
and updateOrSnocWithKey
that seems like a lot of intermediate helper functions that don't get used a whole lot.
My inclination would be to keep this patch (which fixes a space leak) a small self-contained space leak fix, and then follow up with the larger unboxed sums and looking at the Core after. It feels like a subsequent refactor, and one that I'm not well placed to make.
Lots more documentation about the test based on the review by @sjakobi. I also made the lazy test use undefined values to insert, checking that you don't have the space leak but still aren't strict. |
Will look today.
…On Mon, Jun 1, 2020, 7:30 AM Neil Mitchell ***@***.***> wrote:
Lots more documentation about the test based on the review by @sjakobi
<https://github.com/sjakobi>. I also made the lazy test use undefined
values to insert, checking that you don't have the space leak but still
aren't strict.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#258 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAOOF7ORSB7XYZGPLFURN4DRUOGMPANCNFSM4NPLCURQ>
.
|
So, I'm 👍 on this - I can't find anything in particular to gripe about. If @treeowl sees something to fix, i think he should be the final approval here. |
Data/HashMap/Base.hs
Outdated
@@ -1949,7 +1949,7 @@ updateOrSnocWithKey f k0 v0 ary0 = go k0 v0 ary0 0 (A.length ary0) | |||
A.write mary n (L k v) | |||
return mary | |||
| otherwise = case A.index ary i of | |||
(L kx y) | k == kx -> A.update ary i (L k (f k v y)) | |||
(L kx y) | k == kx, (# v2 #) <- f k v y -> A.update ary i (L k v2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still a bit mystified how the fix works, but this must be the crucial line.
By forcing the unboxed tuple, we can get rid of the reference to the old value y
, so it can be GC'd. Is that right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line does the forcing, so it's half the story. Line 760 constructs \a _ -> (# a #)
which says how far to evaluate (at least as far as that the _
has disappeared). Together they get rid of the old value, that previously hung out in the _
, and is now not in the (# #)
value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Speaking of this line, that case-plus guard can be rewritten
| L kx y <- A.index ary i
= if
| k == kx
, (# v2 #) <- f k v y
-> A.update ary i (L k v2)
| otherwise
-> go k v ary (i+1) n
Personally, I think this seems a little easier to follow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I refactored to morally follow that pattern - the nested if
is not actually needed as the kx
and y
bound by A.index
aren't used (and importantly, shouldn't be used) in the go
recursive call - so I hoisted it all up a bit.
tests/Regressions.hs
Outdated
-- | ||
-- To make the test robust, it's important that oldV isn't hoisted up to the top or shared. To do that, | ||
-- we use NOINLINE, make oldV dependent on an unseen argument, and insert _ <- return () to ensure oldV | ||
-- is under a lambda. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yuck yuck yuck. I don't like this approach to avoiding floating. It seems much too fragile. Fortunately, I think we can do something both simpler and more robust, taking advantage of the surprising fact that type Assertion = IO ()
.
issue254Lazy :: Assertion
issue254Lazy = do
i :: Int <- randomIO
let oldV = error $ "...." ++ show i
....
Now GHC has no way to float out the old value because it's genuinely unknown until the assertion is run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's way better. I've done that.
tests/Regressions.hs
Outdated
let oldV = error $ "Should not be evaluated: " ++ show i | ||
weakV <- mkWeakPtr oldV Nothing -- test whether oldV is alive | ||
let mp = HML.insert (KC 1) (error "Should not be evaluated") $ HML.fromList [(KC 0, "1"), (KC 1, oldV)] | ||
_ <- evaluate mp -- force the insert to happen |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably be written
mp <- evaluate $ HML.insert (KC 1) (error "Should not be evaluated") $ HML.fromList [(KC 0, "1"), (KC 1, oldV)]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
I have no idea why |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more nit. Apart from that, LGTM! 👍
Co-authored-by: Simon Jakobi <simon.jakobi@gmail.com>
Thanks! |
I want to add extra special thanks for the testing work. That was all sorts of tricky and subtle. |
Fixes #254. That benchmark used to leak 1Gb/s, now it doesn't leak and runs in 5Mb.
I added a test case. The invariants to ensure GHC doesn't hoist the memory out (and thus keep it alive) are quite subtle, but the result should be fairly robust. I confirm the test fails before, and passes after. Unfortunately, it's not as simple as testing for strictness, since the functions have the right strictness behaviour, it's just old strict values get retained longer than is necessary.