Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanup #13

Draft
wants to merge 41 commits into
base: develop
Choose a base branch
from
Draft

Cleanup #13

wants to merge 41 commits into from

Conversation

liss-h
Copy link
Contributor

@liss-h liss-h commented Aug 2, 2023

No description provided.

There is an issue with the implementation of potentially modifying methods:
accessing or reassigning an existing value (means no insertion) may trigger
a hash table rehash. This is because load factor thresholds are checked
unconfitionally in the common insertion routine.

Such behaviour leads to a violation of the following iterator invalidation
contract:

> - insert, emplace, emplace_hint, operator[]: if there is an effective
> insert, invalidate the iterators.

User code, being unaware of this bug, may suffer from use-after-free.

Fix the issue by moving threshold checks inside the insertion subroutine.
@liss-h liss-h requested a review from bigerl August 7, 2023 09:39
Copy link
Member

@bigerl bigerl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partial Review sparse_hash.hpp (1)

cmake-build-type: Release
}
name: ${{matrix.config.name}}
# These get queued but never actually run
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When merged, we should add issues with tag help-wanted to add windows and macos support to the CI.

* Maximum that size_ can reach before a rehash occurs automatically
* to grow the hash table.
*/
size_type load_threshold_rehash_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this (and the value below) really make a difference? Is it measurable slower if we calculate it on the fly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes a little bit

Cached as member:
Bildschirmfoto vom 2023-08-08 11-58-18

Calculated on the fly:
Bildschirmfoto vom 2023-08-08 11-55-55

include/dice/sparse_map/internal/sparse_hash.hpp Outdated Show resolved Hide resolved
include/dice/sparse_map/internal/sparse_hash.hpp Outdated Show resolved Hide resolved
hasher const &hash,
key_equal const &equal,
allocator_type const &alloc) : h_{hash},
keq_{equal},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does no_unique_address + this result in a no-op?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is optimizer dependent. In theory the ctor could still do something, even if it isn't initializing the object.

return erase(mutable_iterator(pos));
}

iterator erase(const_iterator first, const_iterator last) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole function doesn't make much sense to me. Erase can trigger a rehash, right? So we cannot even be sure that the range of entries that the user could observe before the call between first and last is even correctly erase.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I was wondering the same thing. I honestly don't know if erasing ranges from hash maps ever makes sense, I mean you don't even know whats in the range in the first place as its effectively random.

Just remove?

Copy link
Member

@bigerl bigerl Aug 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would keep the function but deactivate it with an requires false + an explanation why it is not supported ATM.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually no, it cannot trigger a rehash, because you couldn't do erase loops if it did.
But the function is still pretty useless I would say.

}

template<typename K>
size_type erase(K const &key) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: continue review here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants