Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: Use for_each in Vec::extend #68046

Closed
wants to merge 8 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
85 changes: 81 additions & 4 deletions src/liballoc/benches/vec.rs
Original file line number Diff line number Diff line change
@@ -1,4 +1,7 @@
use std::iter::{repeat, FromIterator};
use std::{
hint::black_box,
iter::{repeat, FromIterator},
};
use test::Bencher;

#[bench]
Expand Down Expand Up @@ -165,14 +168,53 @@ fn bench_from_iter_1000(b: &mut Bencher) {
}

fn do_bench_extend(b: &mut Bencher, dst_len: usize, src_len: usize) {
let dst: Vec<_> = FromIterator::from_iter(0..dst_len);
let src: Vec<_> = FromIterator::from_iter(dst_len..dst_len + src_len);
do_bench_extend_iter(b, dst_len, src_len, src)
}

fn do_bench_extend_chain(b: &mut Bencher, dst_len: usize, src_len: usize) {
let src = Vec::from_iter(dst_len..dst_len + src_len);
let x1;
let x2;
let x3;
let x4;
let x5;

if src.len() / 5 == 0 {
x1 = &[][..];
x2 = &[][..];
x3 = &[][..];
x4 = &[][..];
x5 = &[][..];
} else {
let mut iter = src.chunks_exact(src.len() / 5);
x1 = iter.next().unwrap_or(&[][..]);
x2 = iter.next().unwrap_or(&[][..]);
x3 = iter.next().unwrap_or(&[][..]);
x4 = iter.next().unwrap_or(&[][..]);
x5 = iter.next().unwrap_or(&[][..]);
}
let src =
x1.iter().cloned().chain(x2.iter().cloned().chain(x3.iter().cloned())).chain(
Some(()).into_iter().flat_map(|()| x4.iter().cloned().chain(x5.iter().cloned())),
);
do_bench_extend_iter(b, dst_len, src_len, src)
}

#[inline(never)]
fn do_bench_extend_iter(
b: &mut Bencher,
dst_len: usize,
src_len: usize,
src: impl IntoIterator<Item = usize> + Clone,
) {
let dst: Vec<_> = FromIterator::from_iter(0..dst_len);

b.bytes = src_len as u64;

b.iter(|| {
let mut dst = dst.clone();
dst.extend(src.clone());
let mut dst = black_box(dst.clone());
dst.extend(black_box(src.clone()));
assert_eq!(dst.len(), dst_len + src_len);
assert!(dst.iter().enumerate().all(|(i, x)| i == *x));
});
Expand Down Expand Up @@ -213,6 +255,41 @@ fn bench_extend_1000_1000(b: &mut Bencher) {
do_bench_extend(b, 1000, 1000)
}

#[bench]
fn bench_extend_chain_0000_0000(b: &mut Bencher) {
do_bench_extend_chain(b, 0, 0)
}

#[bench]
fn bench_extend_chain_0000_0010(b: &mut Bencher) {
do_bench_extend_chain(b, 0, 10)
}

#[bench]
fn bench_extend_chain_0000_0100(b: &mut Bencher) {
do_bench_extend_chain(b, 0, 100)
}

#[bench]
fn bench_extend_chain_0000_1000(b: &mut Bencher) {
do_bench_extend_chain(b, 0, 1000)
}

#[bench]
fn bench_extend_chain_0010_0010(b: &mut Bencher) {
do_bench_extend_chain(b, 10, 10)
}

#[bench]
fn bench_extend_chain_0100_0100(b: &mut Bencher) {
do_bench_extend_chain(b, 100, 100)
}

#[bench]
fn bench_extend_chain_1000_1000(b: &mut Bencher) {
do_bench_extend_chain(b, 1000, 1000)
}

fn do_bench_push_all(b: &mut Bencher, dst_len: usize, src_len: usize) {
let dst: Vec<_> = FromIterator::from_iter(0..dst_len);
let src: Vec<_> = FromIterator::from_iter(dst_len..dst_len + src_len);
Expand Down
49 changes: 36 additions & 13 deletions src/liballoc/vec.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1559,8 +1559,8 @@ impl<T: Default> Vec<T> {
#[unstable(feature = "vec_resize_default", issue = "41758")]
#[rustc_deprecated(
reason = "This is moving towards being removed in favor \
of `.resize_with(Default::default)`. If you disagree, please comment \
in the tracking issue.",
of `.resize_with(Default::default)`. If you disagree, please comment \
in the tracking issue.",
since = "1.33.0"
)]
pub fn resize_default(&mut self, new_len: usize) {
Expand Down Expand Up @@ -2121,7 +2121,37 @@ where
}
}

fn extend_fold<T>(self_: &mut Vec<T>) -> impl FnMut((), T) -> Result<(), T> + '_ {
move |(), element| {
if self_.len() < self_.capacity() {
unsafe {
self_.push_unchecked(element);
}
Ok(())
} else {
Err(element)
}
}
}

impl<T> Vec<T> {
unsafe fn push_unchecked(&mut self, value: T) {
let end = self.as_mut_ptr().add(self.len);
ptr::write(end, value);
// NB can't overflow since we would have had to alloc the address space
self.len += 1;
}

#[cold]
#[inline(never)]
fn extend_desugared_fallback<I: Iterator<Item = T>>(&mut self, iterator: &I, element: T) {
let (lower, _) = iterator.size_hint();
self.reserve(lower.saturating_add(1));
unsafe {
self.push_unchecked(element);
}
}

fn extend_desugared<I: Iterator<Item = T>>(&mut self, mut iterator: I) {
// This is the case for a general iterator.
//
Expand All @@ -2130,17 +2160,10 @@ impl<T> Vec<T> {
// for item in iterator {
// self.push(item);
// }
while let Some(element) = iterator.next() {
let len = self.len();
if len == self.capacity() {
let (lower, _) = iterator.size_hint();
self.reserve(lower.saturating_add(1));
}
unsafe {
ptr::write(self.get_unchecked_mut(len), element);
// NB can't overflow since we would have had to alloc the address space
self.set_len(len + 1);
}
let (lower, _) = iterator.size_hint();
self.reserve(lower);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still has the issue of trying to reserve before advancing the iterator. taking an element and then checking the hint and capacity is strictly better because the hint will be more accurate and if you take a None you can skip the hint calculation and reserve attempt.

Copy link
Contributor Author

@Marwes Marwes Jan 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The benefit is that it avoids calling next at all. While the hint should be more accurate by calling next it would also be even more accurate if we called next twice so I don't quite buy that argument. Calling next once would give a fast path for the empty iterator at least which I find a more convincing argument, however it is only a benefit if size_hint(); reserve(0) is slow enough to warrant this fast path.

Pushed another variant which does next first which should optimize better for the empty iterator. Haven't got time to benchmark it yet.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was more concerned about doing an suboptimal resize if we got an incorrect lower bound on the first attempt, but you're right, if it's just 0 it hopefully would be cheap enough and the next reserve will do it properly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have gone a bit back and forth but I think the of reserving eagerly has a better tradeoff.

On size_hint == 0 it does call reserve unnecessarily but it is only one more branch in an already fast case. If size_hint > 0 then most of the time we will reserve either way and most of the time with the same value (size_hint_minus_removed_value + 1 vs size_hint).

The only case where I'd expect it to be faster to call next first is if size_hint is expensive, despite the iterator being empty and next being faster to call than size_hint which seems unlikely (happy to be corrected however!).

while let Err(element) = iterator.try_fold((), extend_fold(self)) {
self.extend_desugared_fallback(&iterator, element);
}
}

Expand Down