Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bevy_reflect: Replace HashTable with HashMap #15149

Closed

Conversation

MarcGuiselin
Copy link

Objective

This is preparatory work for removing hashbrown in favor of either std::collections or rustc_hash (yet undecided) as discussed in #11478

std::collections::{HashSet, HashMap} and rustc_hash::{FxHashSet, FxHashMap} are pretty much drop in replacements for hashbrown::{HashSet, HashMap}, so no issues there, but crates\bevy_reflect\src\set.rs uses hashbrown::HashTable which has no std/rustc_hash equivalent.

Solution

This pr replaces HashTable in DynamicSet with a HashMap. This is implemented similar to DynamicMap, which itself uses a HashMap internally.

Testing

Testing with cargo test --package bevy_reflect passes.

I haven't done any performance tests.

This change might consist of a small performance downgrade, since the old implementation used the u64 from PartialReflect::reflect_hash directly as a hash for the table (instead of using a hash of the u64). But since DynamicMap doesn't make any considerations to optimize things, I think this regression is fine.

Note: this is preparatory work for removing hashbrown in favor of std::collections
Copy link
Contributor

Welcome, new contributor!

Please make sure you've read our contributing guide and we look forward to reviewing your pull request shortly ✨

@MarcGuiselin MarcGuiselin changed the title Replace hashbrown::HashTable with hashbrown::HashMap bevy_reflect: Replace HashTable with HashMap Sep 10, 2024
@alice-i-cecile alice-i-cecile added C-Code-Quality A section of code that is hard to understand or change A-Reflection Runtime information about types S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels Sep 10, 2024
@SkiFire13
Copy link
Contributor

SkiFire13 commented Sep 10, 2024

This changes the semantics. It will consider two elements with the same hash as being the same, resulting in one overwriting the other when inserted or its lookup succeeding even if the set only contains the other.

Since you mention DynamicMap, it also seems to be buggy and should be switched to use HashTable.

AFAIK the std HashMap doesn't have the flexibility to use .reflect_hash() or .reflect_partial_eq() to properly implement these types.

@MarcGuiselin
Copy link
Author

Since you mention DynamicMap, it also seems to be buggy and should be switched to use HashTable.

Well that makes a LOT more sense.

So we expect there to be hash collisions. Am I correct in assuming reflect_hash differs between different types, but not between different values of the same type? Or is it more nuanced than that?

I also think there might be an opportunity to combine DynamicMap and DynamicSet (or at least their internals) into a unified implementation, since a Set is just a map with empty values.

AFAIK the std HashMap doesn't have the flexibility to use .reflect_hash() or .reflect_partial_eq() to properly implement these types.

I began using a HashMap<u64, Vec<Box<dyn PartialReflect>>> to copy the implementation outright, but I switched due to assuming the extra complexity was unnecessary. I think that would work (but it makes iterating a bit difficult).

DynamicMap might be fixed by using a HashMap<u64, Vec<usize>> instead of the HashMap<u64, usize>

@SkiFire13
Copy link
Contributor

SkiFire13 commented Sep 10, 2024

So we expect there to be hash collisions. Am I correct in assuming reflect_hash differs between different types, but not between different values of the same type? Or is it more nuanced than that?

reflect_hash tries to differ between different values of the same type, but by the nature of hashing collisions cannot be prevented. So for correctness you have to compare with reflect_partial_eq in order to determine whether an entry is the same or not.

I also think there might be an opportunity to combine DynamicMap and DynamicSet (or at least their internals) into a unified implementation, since a Set is just a map with empty values.

I totally agree with this. There are also some weird differences between Map and Set that should be fixed, for example DynamicMap's implementation is heavely influence by the fact that Map supports a get_at method (which btw is not supported efficiently by neither HashMap nor BTreeMap, for which it is O(n)), while Set doesn't support it at all.

I began using a HashMap<u64, Vec<Box<dyn PartialReflect>>> to copy the implementation outright, but I switched due to assuming the extra complexity was unnecessary. I think that would work (but it makes iterating a bit difficult).

That could work, but looks really ugly and likely also less efficient than what it can be. It feels a waste to allocate all those Vec even though in practice collisions should be pretty rare.

The stdlib's HashMap is unfortunately very limited for this kind of things. It has an unstable raw_entry API that could work here, but it's being removed with hashbrown's HashTable as the suggested replacement.

Edit: a simplier alternative may be writing a wrapper over Box<dyn PartialReflect> that implements Hash and Eq deferring to reflect_hash and reflect_partial_eq. Then DynamicSet could simply store a HashSet<Wrapper>.

@MarcGuiselin
Copy link
Author

Thanks for the explanation! It cleared up many of my misunderstandings.

I think it's getting a little off topic for the original purpose of this pr.

Since hashbrown will remain a peer dependency for the forseeable future (making this pr a bit of a fool's errand haha), and this implementation is incorrect, I'll close this pr.

Regarding the incorrectness of DynamicMap and potential improvements on that end a new issue should be opened. It's getting very late for me, so I might create one tomorrow. Or feel free to create one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-Reflection Runtime information about types C-Code-Quality A section of code that is hard to understand or change S-Needs-Review Needs reviewer attention (from anyone!) to move forward
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants