Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

drain_filter_take #3299

Open
SOF3 opened this issue Aug 7, 2022 · 15 comments
Open

drain_filter_take #3299

SOF3 opened this issue Aug 7, 2022 · 15 comments
Labels
T-libs-api Relevant to the library API team, which will review and decide on the RFC.

Comments

@SOF3
Copy link

SOF3 commented Aug 7, 2022

Currently we have the unstable Vec::drain_filter:

pub fn drain_filter<F>(&mut self, filter: F) -> DrainFilter<'_, T, F, A>
where
    F: FnMut(&mut T) -> bool;

My use case is that I want to perform some side effects in filter, where the side effects would determine whether the item should be drained, but performing the side effects requires moving out the T if the element is to be filtered.

I propose a new method drain_filter_take that takes the form

pub fn drain_filter_take<F>(&mut self, filter: F) // does not return an iterator
where
    F: FnMut(T) -> Option<T>;

(I am not sure what a good name would be that clearly indicates it does not return an iterator)

If Some is returned, it replaces the original value; if None is returned, the value is drained. This allows the filter to take ownership of T and only return it if not drained.

@Johan-Mi
Copy link

Johan-Mi commented Aug 12, 2022

I think it would also be useful to be able to drain a vector by mapping a function that takes each item by value and decides whether to yield a transformed version of it or put it back, kind of like filter_map:

pub fn drain_map_ok<U, F>(&mut self, filter: F) -> DrainMapOk<'_, U, T, F, A>
where
    F: FnMut(T) -> Result<U, T>;

So all of the Oks would be iterated while all the Errs would end up in the original vector.

My specific use case for this is extracting all of the literals from a list of expressions in the optimization pass of my compiler.

@SOF3
Copy link
Author

SOF3 commented Aug 12, 2022

@Johan-Mi originally I wanted to call it a drain_filter_map, but turns out that removed items are already moved out in the closure, so it doesn't make sense to return an iterator here.

@SOF3
Copy link
Author

SOF3 commented Aug 12, 2022

oh, I see you're talking about a Result not an Option. But it'd be weird to have it Ok/Err, isn't it? I get your motivation of that mapping is the Ok variant and retaining is the Err variant, but it sounds unnecessarily complex to use.

@Johan-Mi
Copy link

You're right, perhaps it would be clearer with a different return type for the callback, with enum variants named something like Take or PutBack. It is unfortunate to have to deal with putting the values back if you don't want to keep them but Rust doesn't really provide any good way to optionally consume a value.

@the8472
Copy link
Member

the8472 commented Aug 13, 2022

Why not use vec.into_iter().filter_map(...).collect()?

@Johan-Mi
Copy link

@the8472 That consumes the entire original vector instead of only removing certain elements.

@the8472
Copy link
Member

the8472 commented Aug 13, 2022

the collect output contains the remaining elements which you can then continue to use

@Johan-Mi
Copy link

If you choose to collect the remaining elements then you'll lose the other elements—or you could do it the other way around. Either way you lose one half of the input, making it useful only for side effects.

@the8472
Copy link
Member

the8472 commented Aug 13, 2022

The proposed method pub fn drain_filter_take<F>(&mut self, filter: F) // does not return an iterator does discard half of the elements, which is why I suggested the collect approach.

@Johan-Mi
Copy link

I suppose I should turn my proposal into a separate issue, I don't have much experience with RFCs and only added it here since it seemed somewhat related. To clarify, what I'm suggesting is a different method that does return an iterator.

@SOF3
Copy link
Author

SOF3 commented Aug 14, 2022

@the8472 doesn't that reallocate the whole vector in a new buffer?

@nielsle
Copy link

nielsle commented Aug 14, 2022

Hmm... rust vectors are not linked lists, so if you want to take ownership of an element in the middle of a vector an "drain" it out if the vector, then you have to describe how the memory structure if the vector looks afterwards. One idea is to do something like swap_remove.

@Johan-Mi
Copy link

The actual drain internals use some unsafe trickery where they decrease the length of the vector before creating the iterator and then use some raw reads, which seem to be guaranteed to be safe since the iterator has a mutable reference to the vector. I'm not too familiar with the details but it seems like the methods proposed by @SOF3 and I should both be doable with zero allocations.

@the8472
Copy link
Member

the8472 commented Aug 14, 2022

@the8472 doesn't that reallocate the whole vector in a new buffer?

It's not guaranteed, but currently the reallocation is optimized away.

@fogti
Copy link

fogti commented Sep 12, 2022

oh, I see you're talking about a Result not an Option. But it'd be weird to have it Ok/Err, isn't it? I get your motivation of that mapping is the Ok variant and retaining is the Err variant, but it sounds unnecessarily complex to use.

A simple alternative would be just creating an enum with appropriate cases, e.g.

enum FilterMapInnerResult<T, U> {
    /// element gets put back into the vector
    Keep(T),
    /// element gets yielded (and maybe transformed),
    /// and efficiently removed from the vector
    Yield(U),
}

@workingjubilee workingjubilee added the T-libs-api Relevant to the library API team, which will review and decide on the RFC. label Sep 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-libs-api Relevant to the library API team, which will review and decide on the RFC.
Projects
None yet
Development

No branches or pull requests

6 participants