Prepare for 0.3 release (#39)
* Clean up various iterator types

**Description**
 - Add a default PREFIX_LEN to all iterator types with a value of 16
 - Remove most instances of `*Keys`, `*Values`, `*ValuesMut`
   iterators
 - Rename the tree iterators to match the `std::collections::BTreeMap`
   ones
 - Add `#[non_exhaustive]` to the well-formed visitor error type

**Motivation**
 - Adding a default `PREFIX_LEN` biases towards making it easy to
   provide a type for the iterators in the common case, without the
   user ever needing to learn about the prefix or what the default
   length is.
 - I don't really like all the extra iterators, and I'm only keeping
   the remaining ones because `BTreeMap` has them
 - I also wanted the tree iterator names to match those in the
   stdlib, making it easier to use this crate as a drop-in
   replacement
 - Adding `#[non_exhaustive]` makes future additions to the enum
   non-breaking changes
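
To illustrate the first point, here is a minimal sketch of how a defaulted const generic parameter lets callers name an iterator type without mentioning the prefix length. The `Iter` and `Holder` types below are hypothetical stand-ins backed by a slice, not the crate's actual definitions; only the `DEFAULT_PREFIX_LEN` value of 16 comes from this change.

```rust
// Hypothetical stand-in types for illustration only.
const DEFAULT_PREFIX_LEN: usize = 16;

/// A toy iterator over key-value pairs with a defaulted const parameter.
pub struct Iter<'a, K, V, const PREFIX_LEN: usize = DEFAULT_PREFIX_LEN> {
    items: std::slice::Iter<'a, (K, V)>,
}

impl<'a, K, V, const PREFIX_LEN: usize> Iter<'a, K, V, PREFIX_LEN> {
    pub fn new(items: &'a [(K, V)]) -> Self {
        Self { items: items.iter() }
    }

    /// Reports the prefix length baked into the type.
    pub fn prefix_len(&self) -> usize {
        PREFIX_LEN
    }
}

impl<'a, K, V, const PREFIX_LEN: usize> Iterator for Iter<'a, K, V, PREFIX_LEN> {
    type Item = (&'a K, &'a V);

    fn next(&mut self) -> Option<Self::Item> {
        // Split the `&(K, V)` into a pair of references.
        self.items.next().map(|(k, v)| (k, v))
    }
}

/// The field type omits `PREFIX_LEN`, so the default of 16 applies.
pub struct Holder<'a> {
    pub iter: Iter<'a, u8, char>,
}
```

A caller who needs a different prefix length can still write it out, for example `Iter<'_, u8, char, 4>`. Note that the default only applies where the type is named, which is why the doc examples in this crate annotate `let mut map: TreeMap<_, _> = TreeMap::new();`.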

**Testing Done**
`./scripts/full-test.sh nightly`

* Ensure all iterators and map-related types are Send/Sync

**Description**
Add tests and unsafe impls to ensure that the `TreeMap` and related
types are `Send` and `Sync`.

**Motivation**
This is needed to match the interface of the `BTreeMap` and to make
sure we don't regress on this behavior between releases.

The `unsafe impl`s are not my favorite way to implement this. I would
have preferred that the building block types were naturally `Send` or
`Sync`, but that wasn't easily done. Removing them may be a future
improvement.
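
For readers unfamiliar with the pattern, here is a minimal sketch of why the `unsafe impl`s are needed and how a compile-time check guards against regressions. `RawBox` is a hypothetical type invented for this example; it only shares the property that it holds a raw pointer, which makes a type neither `Send` nor `Sync` by default.

```rust
use std::ptr::NonNull;

/// Hypothetical owner of a heap allocation through a raw pointer.
/// `NonNull<T>` is neither `Send` nor `Sync`, so neither is `RawBox<T>`
/// unless we opt in manually.
pub struct RawBox<T> {
    ptr: NonNull<T>,
}

impl<T> RawBox<T> {
    pub fn new(value: T) -> Self {
        // Leak a `Box` and keep only the raw pointer to the allocation.
        let ptr = NonNull::from(Box::leak(Box::new(value)));
        Self { ptr }
    }

    pub fn get(&self) -> &T {
        // SAFETY: `ptr` came from a live `Box` and is only freed in `drop`.
        unsafe { self.ptr.as_ref() }
    }
}

impl<T> Drop for RawBox<T> {
    fn drop(&mut self) {
        // SAFETY: `ptr` was produced by `Box::leak` in `new` and is freed once.
        unsafe { drop(Box::from_raw(self.ptr.as_ptr())) }
    }
}

// SAFETY: `RawBox<T>` uniquely owns its `T`, so it is `Send` when `T` is.
unsafe impl<T: Send> Send for RawBox<T> {}

// SAFETY: `RawBox<T>` has no interior mutability; `&RawBox<T>` only hands
// out `&T`, so it is `Sync` when `T` is.
unsafe impl<T: Sync> Sync for RawBox<T> {}

// These helpers only compile when the bound holds for `T`.
fn assert_send<T: Send>() {}
fn assert_sync<T: Sync>() {}

/// Fails to compile if either `unsafe impl` above is removed.
pub fn raw_box_is_send_sync() {
    assert_send::<RawBox<usize>>();
    assert_sync::<RawBox<usize>>();
}
```

The check costs nothing at runtime: if a later refactor drops one of the impls, the function stops compiling, which is the regression protection described above.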

**Testing Done**
`./scripts/full-test.sh nightly`

* Fix formatting for new nightly

* Update CHANGELOG

* Fix release configuration
declanvk committed Sep 21, 2024
1 parent d709e3d commit bbc4f24
Showing 15 changed files with 621 additions and 172 deletions.
16 changes: 16 additions & 0 deletions CHANGELOG.md
@@ -9,6 +9,22 @@ and this project adheres to [Semantic Versioning](http://semver.org/).

## [Unreleased] - ReleaseDate

### Added

- Added `TreeMap::{range, range_mut}` iterators. These iterators allow for querying sub-sections of the trie, using the natural keys as bounds and even the Rust range syntax.

### Changed

- Marked `MalformedTreeError` as `#[non_exhaustive]` and added some new variants relating to a linked list of leaf nodes.
- Modified the leaf nodes to have a doubly-linked list for fast iteration. This added some extra work in the insert and delete operations, but made it a lot cheaper to iterate. See [#33](https://github.com/declanvk/blart/pull/33) for more details and benchmarks.
- Modified the existing `TreeMap::{iter, iter_mut, prefix, prefix_mut, into_iter, into_keys, into_values, keys, values, values_mut}` to use this new linked list to speed up iteration.
- Optimize lookup (and "find delete point" and range iteration) by only reading prefix bytes from the inner node header. Previously, when a lookup encountered a prefix that was longer than the fixed number of bytes in the header, the procedure would go look up those missing bytes from a descendant leaf node.
- Now these functions will track whether or not these implicit bytes are used and just perform a final comparison with the leaf node key to make sure there were no mismatches. This "optimistic" behavior gets the lookup to the leaf node quicker, but could hit false positives in some cases.

### Removed

- Removed the `TreeMap::{fuzzy_keys, fuzzy_values, fuzzy_values_mut, prefix_keys, prefix_values, prefix_values_mut}` and the associated iterator types. These functions didn't provide a lot of added value versus just appending a `.map(...)` combinator on one of the iterators that return the key-value tuples.
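
As a rough illustration of the `.map(...)` replacement mentioned above, the sketch below projects keys or values out of an iterator of key-value tuples. It uses `std::collections::BTreeMap` as a stand-in, on the assumption that the remaining `TreeMap` iterators yield `(&K, &V)` tuples in the same way; the function name is made up for this example.

```rust
use std::collections::BTreeMap;

/// Collects keys and values separately by mapping over the tuple iterator,
/// instead of relying on dedicated `*_keys` / `*_values` iterators.
pub fn keys_and_values(map: &BTreeMap<u32, char>) -> (Vec<u32>, Vec<char>) {
    // Keep only the key from each `(&K, &V)` pair.
    let keys = map.iter().map(|(k, _)| *k).collect();
    // Keep only the value from each `(&K, &V)` pair.
    let values = map.iter().map(|(_, v)| *v).collect();
    (keys, values)
}
```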

## [0.2.0] - 2024-08-18

The 0.2.0 has been entirely (99%) contributed by @Gab-Menezes, thank you for all the new features!
2 changes: 1 addition & 1 deletion release.toml
@@ -3,5 +3,5 @@ pre-release-replacements = [
{ file = "CHANGELOG.md", search = "\\.\\.\\.HEAD", replace = "...{{tag_name}}", exactly = 1 },
{ file = "CHANGELOG.md", search = "ReleaseDate", replace = "{{date}}" },
{ file = "CHANGELOG.md", search = "<!-- next-header -->", replace = "<!-- next-header -->\n\n## [Unreleased] - ReleaseDate", exactly = 1 },
{ file = "CHANGELOG.md", search = "<!-- next-url -->", replace = "<!-- next-url -->\n[Unreleased]: https://github.com/declanvk/wall-a/compare/{{tag_name}}...HEAD", exactly = 1 },
{ file = "CHANGELOG.md", search = "<!-- next-url -->", replace = "<!-- next-url -->\n[Unreleased]: https://github.com/declanvk/blart/compare/{{tag_name}}...HEAD", exactly = 1 },
]
129 changes: 26 additions & 103 deletions src/collections/map.rs
@@ -26,8 +26,10 @@ pub use entry::*;
pub use entry_ref::*;
pub use iterators::*;

const DEFAULT_PREFIX_LEN: usize = 16;

/// An ordered map based on an adaptive radix tree.
pub struct TreeMap<K, V, const PREFIX_LEN: usize = 16> {
pub struct TreeMap<K, V, const PREFIX_LEN: usize = DEFAULT_PREFIX_LEN> {
/// The number of entries present in the tree.
num_entries: usize,
/// A pointer to the tree root, if present.
@@ -379,102 +381,6 @@ impl<K, V, const PREFIX_LEN: usize> TreeMap<K, V, PREFIX_LEN> {
FuzzyMut::new(self, key.as_bytes(), max_edit_dist)
}

/// Makes a fuzzy search in the tree by `key`,
/// returning all keys and values that are
/// less than or equal to `max_edit_dist`
///
/// This is done by using Levenshtein distance
///
/// # Examples
///
/// ```rust
/// use blart::TreeMap;
///
/// let mut map: TreeMap<_, _> = TreeMap::new();
///
/// map.insert(c"abc", 0);
/// map.insert(c"abd", 1);
/// map.insert(c"abdefg", 2);
///
/// let fuzzy: Vec<_> = map.fuzzy_keys(c"ab", 2).collect();
/// assert_eq!(fuzzy, vec![&c"abd", &c"abc"]);
/// ```
pub fn fuzzy_keys<'a, 'b, Q>(
&'a self,
key: &'b Q,
max_edit_dist: usize,
) -> FuzzyKeys<'a, 'b, K, V, PREFIX_LEN>
where
K: Borrow<Q> + AsBytes,
Q: AsBytes + ?Sized,
{
FuzzyKeys::new(self, key.as_bytes(), max_edit_dist)
}

/// Makes a fuzzy search in the tree by `key`,
/// returning all keys and values that are
/// less than or equal to `max_edit_dist`.
///
/// This is done by using Levenshtein distance
///
/// # Examples
///
/// ```rust
/// use blart::TreeMap;
///
/// let mut map: TreeMap<_, _> = TreeMap::new();
///
/// map.insert(c"abc", 0);
/// map.insert(c"abd", 1);
/// map.insert(c"abdefg", 2);
///
/// let fuzzy: Vec<_> = map.fuzzy_values(c"ab", 2).collect();
/// assert_eq!(fuzzy, vec![&1, &0]);
/// ```
pub fn fuzzy_values<'a, 'b, Q>(
&'a self,
key: &'b Q,
max_edit_dist: usize,
) -> FuzzyValues<'a, 'b, K, V, PREFIX_LEN>
where
K: Borrow<Q> + AsBytes,
Q: AsBytes + ?Sized,
{
FuzzyValues::new(self, key.as_bytes(), max_edit_dist)
}

/// Makes a fuzzy search in the tree by `key`,
/// returning all keys and values that are
/// less than or equal to `max_edit_dist`
///
/// This is done by using Levenshtein distance
///
/// # Examples
///
/// ```rust
/// use blart::TreeMap;
///
/// let mut map: TreeMap<_, _> = TreeMap::new();
///
/// map.insert(c"abc", 0);
/// map.insert(c"abd", 1);
/// map.insert(c"abdefg", 2);
///
/// let fuzzy: Vec<_> = map.fuzzy_values(c"ab", 2).collect();
/// assert_eq!(fuzzy, vec![&mut 1, &mut 0]);
/// ```
pub fn fuzzy_values_mut<'a, 'b, Q>(
&'a mut self,
key: &'b Q,
max_edit_dist: usize,
) -> FuzzyValuesMut<'a, 'b, K, V, PREFIX_LEN>
where
K: Borrow<Q> + AsBytes,
Q: AsBytes + ?Sized,
{
FuzzyValuesMut::new(self, key.as_bytes(), max_edit_dist)
}

/// Returns true if the map contains a value for the specified key.
///
/// # Examples
@@ -1152,8 +1058,8 @@ impl<K, V, const PREFIX_LEN: usize> TreeMap<K, V, PREFIX_LEN> {
/// assert_eq!(iter.next().unwrap(), (&4, &'z'));
/// assert_eq!(iter.next(), None);
/// ```
pub fn iter(&self) -> TreeIterator<'_, K, V, PREFIX_LEN> {
TreeIterator::new(self)
pub fn iter(&self) -> Iter<'_, K, V, PREFIX_LEN> {
Iter::new(self)
}

/// Gets a mutable iterator over the entries of the map, sorted by key.
@@ -1177,8 +1083,8 @@ impl<K, V, const PREFIX_LEN: usize> TreeMap<K, V, PREFIX_LEN> {
/// assert_eq!(map[&3], 'A');
/// assert_eq!(map[&4], 'Z');
/// ```
pub fn iter_mut(&mut self) -> TreeIteratorMut<'_, K, V, PREFIX_LEN> {
TreeIteratorMut::new(self)
pub fn iter_mut(&mut self) -> IterMut<'_, K, V, PREFIX_LEN> {
IterMut::new(self)
}

/// Gets an iterator over the keys of the map, in sorted order.
@@ -1568,7 +1474,7 @@ where
}

impl<'a, K, V, const PREFIX_LEN: usize> IntoIterator for &'a TreeMap<K, V, PREFIX_LEN> {
type IntoIter = TreeIterator<'a, K, V, PREFIX_LEN>;
type IntoIter = Iter<'a, K, V, PREFIX_LEN>;
type Item = (&'a K, &'a V);

fn into_iter(self) -> Self::IntoIter {
@@ -1577,7 +1483,7 @@ impl<'a, K, V, const PREFIX_LEN: usize> IntoIterator for &'a TreeMap<K, V, PREFI
}

impl<'a, K, V, const PREFIX_LEN: usize> IntoIterator for &'a mut TreeMap<K, V, PREFIX_LEN> {
type IntoIter = TreeIteratorMut<'a, K, V, PREFIX_LEN>;
type IntoIter = IterMut<'a, K, V, PREFIX_LEN>;
type Item = (&'a K, &'a mut V);

fn into_iter(self) -> Self::IntoIter {
@@ -1665,6 +1571,23 @@ mod tests {

use super::*;

#[test]
fn map_is_send_sync() {
fn is_send<T: Send>() {}
fn is_sync<T: Sync>() {}

fn map_is_send<K: Send, V: Send>() {
is_send::<TreeMap<K, V>>();
}

fn map_is_sync<K: Sync, V: Sync>() {
is_sync::<TreeMap<K, V>>();
}

map_is_send::<[u8; 3], usize>();
map_is_sync::<[u8; 3], usize>();
}

#[test]
fn tree_map_create_empty() {
let map = TreeMap::<Box<[u8]>, ()>::new();
65 changes: 50 additions & 15 deletions src/collections/map/entry.rs
@@ -2,12 +2,11 @@ use std::mem::replace;

use crate::{AsBytes, DeletePoint, InsertPoint, LeafNode, NodePtr, OpaqueNodePtr, TreeMap};

use super::DEFAULT_PREFIX_LEN;

/// A view into an occupied entry in a [`TreeMap`]. It is part of the [`Entry`]
/// enum.
pub struct OccupiedEntry<'a, K, V, const PREFIX_LEN: usize>
where
K: AsBytes,
{
pub struct OccupiedEntry<'a, K, V, const PREFIX_LEN: usize = DEFAULT_PREFIX_LEN> {
pub(crate) leaf_node_ptr: NodePtr<PREFIX_LEN, LeafNode<K, V, PREFIX_LEN>>,

/// Used for the removal
@@ -18,6 +17,20 @@ where
pub(crate) parent_ptr_and_child_key_byte: Option<(OpaqueNodePtr<K, V, PREFIX_LEN>, u8)>,
}

// SAFETY: This struct contains a `&mut TreeMap<K, V>` which means `K` and `V`
// must be `Send` for the struct to be `Send`.
unsafe impl<'a, K: Send, V: Send, const PREFIX_LEN: usize> Send
for OccupiedEntry<'a, K, V, PREFIX_LEN>
{
}

// SAFETY: This type has no interior mutability, and requires all internally
// referenced types to be `Sync` for the whole thing to be `Sync`.
unsafe impl<'a, K: Sync, V: Sync, const PREFIX_LEN: usize> Sync
for OccupiedEntry<'a, K, V, PREFIX_LEN>
{
}

impl<'a, K, V, const PREFIX_LEN: usize> OccupiedEntry<'a, K, V, PREFIX_LEN>
where
K: AsBytes,
@@ -87,19 +100,27 @@ where

/// A view into a vacant entry in a [`TreeMap`]. It is part of the [`Entry`]
/// enum.
pub struct VacantEntry<'a, K, V, const PREFIX_LEN: usize>
where
K: AsBytes,
{
pub struct VacantEntry<'a, K, V, const PREFIX_LEN: usize = DEFAULT_PREFIX_LEN> {
pub(crate) map: &'a mut TreeMap<K, V, PREFIX_LEN>,
pub(crate) key: K,
pub(crate) insert_point: Option<InsertPoint<K, V, PREFIX_LEN>>,
}

impl<'a, K, V, const PREFIX_LEN: usize> VacantEntry<'a, K, V, PREFIX_LEN>
where
K: AsBytes,
// SAFETY: This struct contains a `&mut TreeMap<K, V>` which means `K` and `V`
// must be `Send` for the struct to be `Send`.
unsafe impl<'a, K: Send, V: Send, const PREFIX_LEN: usize> Send
for VacantEntry<'a, K, V, PREFIX_LEN>
{
}

// SAFETY: This type has no interior mutability, and requires all internally
// referenced types to be `Sync` for the whole thing to be `Sync`.
unsafe impl<'a, K: Sync, V: Sync, const PREFIX_LEN: usize> Sync
for VacantEntry<'a, K, V, PREFIX_LEN>
{
}

impl<'a, K: AsBytes, V, const PREFIX_LEN: usize> VacantEntry<'a, K, V, PREFIX_LEN> {
/// Sets the value of the entry with the [`VacantEntry`]’s key, and returns
/// a mutable reference to it.
pub fn insert(self, value: V) -> &'a mut V {
@@ -148,10 +169,7 @@ where
/// A view into a single entry in a map, which may either be vacant or occupied.
///
/// This enum is constructed from the [`TreeMap::entry`].
pub enum Entry<'a, K, V, const PREFIX_LEN: usize>
where
K: AsBytes,
{
pub enum Entry<'a, K, V, const PREFIX_LEN: usize = DEFAULT_PREFIX_LEN> {
/// A view into an occupied entry in a [`TreeMap`].
Occupied(OccupiedEntry<'a, K, V, PREFIX_LEN>),
/// A view into a vacant entry in a [`TreeMap`].
@@ -305,6 +323,23 @@ mod tests {

use super::*;

#[test]
fn iterators_are_send_sync() {
fn is_send<T: Send>() {}
fn is_sync<T: Sync>() {}

fn entry_is_send<'a, K: Send + 'a, V: Send + 'a>() {
is_send::<Entry<'a, K, V>>();
}

fn entry_is_sync<'a, K: Sync + 'a, V: Sync + 'a>() {
is_sync::<Entry<'a, K, V>>();
}

entry_is_send::<[u8; 3], usize>();
entry_is_sync::<[u8; 3], usize>();
}

#[test]
fn and_modify() {
let mut tree: TreeMap<_, _> = TreeMap::new();