Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce bounds checks in IndexVec #109

Open
overlookmotel opened this issue Sep 10, 2024 · 2 comments
Open

Reduce bounds checks in IndexVec #109

overlookmotel opened this issue Sep 10, 2024 · 2 comments

Comments

@overlookmotel
Copy link

overlookmotel commented Sep 10, 2024

oxc_index::IndexVec contains the following implementation of push:

pub fn push(&mut self, d: T) -> I {
    let idx = I::from_usize(self.len());
    self.raw.push(d);
    idx
}

AstNodeId, ScopeId, SymbolId and ReferenceId are used as indexes on IndexVecs. All are wrappers around NonMaxU32. NonMaxU32::from_usize bounds checks self.len() and panics if it's >= u32::MAX.

IndexVec::push is used extensively, so this extra panicking branch:

  1. Is on a hot path.
  2. Increases the size of push function, which makes it less likely to be inlined.

Vec::push already checks len vs capacity, and has a cold branch where the Vec needs to grow.

We can remove the extra bounds check in IndexVec::push by moving the bounds check into this cold path. As follows:

1. Extend Idx trait

trait Idx {
    const MAX: usize;

    unsafe fn from_usize_unchecked(idx: usize) -> Self;

    fn index(self) -> usize;

    fn from_usize(idx: usize) -> Self {
        assert!(idx <= Self::MAX);
        // SAFETY: We checked `idx` is valid
        unsafe { Self::from_usize_unchecked(idx) }
    }
}
struct AstNodeId(NonMaxU32);

impl Idx for AstNodeId {
    const MAX: usize = NonMaxU32::MAX.get() as usize;

    unsafe fn from_usize_unchecked(idx: usize) -> Self {
        Self(unsafe { NonMaxU32::new_unchecked(idx as u32) })
    }

    fn index(self) -> usize {
        self.0.get() as usize
    }
}

2. Reimplement push

impl<I: Idx, T> IndexVec<I, T> {
    // `std::vec::Vec`'s minimum
    const MIN_CAPACITY: usize = match std::mem::size_of::<T>() {
        1 => 8,
        s if s <= 1024 => 4,
        _ => 1,
    };
    // Smaller of `I::MAX` items or `isize::MAX` bytes
    const MAX_CAPACITY: usize = min(
        I::MAX,
        isize::MAX as usize / max(std::mem::size_of::<T>(), 1),
    );

    #[inline]
    pub fn push(&mut self, value: T) -> I {
        let len = self.raw.len();
        // This branch is rarely taken
        if len == self.raw.capacity() {
            self.grow_one();
        }
        // Manually add `value` to the `Vec` (code copied from `std::vec::Vec`)
        unsafe {
            let end = self.raw.as_mut_ptr().add(len);
            ptr::write(end, value);
            self.raw.set_len(len + 1);
        }
        // SAFETY: `len` cannot exceed `capacity`, and `capacity` is always `<= I::MAX`
        unsafe { I::from_usize_unchecked(len) }
    }

    #[cold]
    #[inline(never)]
    fn grow_one(&mut self) {
        let current_capacity = self.raw.capacity();
        let new_capacity = (current_capacity * 2).max(Self::MIN_CAPACITY).min(Self::MAX_CAPACITY);
        let additional = new_capacity - current_capacity;

        // Bounds check moved here, in cold branch
        assert!(additional > 0);

        // Avoid `reserve_exact()` doing bounds checks again
        let len = self.raw.len();
        unsafe {
            assert_unchecked!(len == current_capacity);
            assert_unchecked!(len.checked_add(additional).is_some());
        }

        self.raw.reserve_exact(additional);
    }
}

const fn min(n1: usize, n2: usize) -> usize {
    if n1 < n2 { n1 } else { n2 }
}
const fn max(n1: usize, n2: usize) -> usize {
    if n1 > n2 { n1 } else { n2 }
}

3. Prevent capacity ever exceeding I::MAX

Amend all other methods to ensure that capacity can never exceed MAX_CAPACITY.

Note: It's capacity which must not exceed MAX_CAPACITY, not len. This allows if len == self.0.capacity() in push to do double-duty of checking if Vec needs to grow, and ensuring indexes are in bounds, with a single check.

@overlookmotel
Copy link
Author

If we restrict capacity (and therefore also len) to max I::MAX - 1, then we can also remove the bounds check from IndexVec::next_idx, which can then use I::from_usize_unchecked instead of I::from_usize.

@overlookmotel
Copy link
Author

overlookmotel commented Sep 12, 2024

Other methods call I::from_usize and the bounds checks could be removed:

  • IndexVec::iter_enumerated and IndexVec::indices - bounds-check on each iteration
  • IndexVec::first and IndexVec::first_mut (but index is hard-coded 0, so probably bounds-check is optimized out)
  • IndexVec::last and IndexVec::last_mut

(first and last are covered by oxc-project/oxc#5726)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant