Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RepeatN: use MaybeUninit #130145

Merged
merged 1 commit into from
Sep 17, 2024
Merged

RepeatN: use MaybeUninit #130145

merged 1 commit into from
Sep 17, 2024

Conversation

fee1-dead
Copy link
Member

@fee1-dead fee1-dead commented Sep 9, 2024

Closes #130140. Closes #130141.

Use MaybeUninit instead of ManuallyDrop for soundness.

@rustbot
Copy link
Collaborator

rustbot commented Sep 9, 2024

r? @Mark-Simulacrum

rustbot has assigned @Mark-Simulacrum.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Sep 9, 2024
@fee1-dead
Copy link
Member Author

Note that this changes the Debug output, the element field will now be wrapped inside a Some if count is at least 1, and will display as None otherwise. Previously it just printed whatever is in there (which is unsound)


#[test]
fn test_repeat_n_soundness() {
let x = std::iter::repeat_n(String::from("use after free"), 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also test clone please 🤔

r=me afterwards

@rust-log-analyzer

This comment has been minimized.

@lcnr
Copy link
Contributor

lcnr commented Sep 9, 2024

@bors r+ rollup=never

@bors
Copy link
Contributor

bors commented Sep 9, 2024

📌 Commit 40b6f96 has been approved by lcnr

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 9, 2024
@lcnr
Copy link
Contributor

lcnr commented Sep 9, 2024

cc @rust-lang/libs

@rust-log-analyzer

This comment has been minimized.

let mut y = std::iter::repeat_n(Box::new(0), 1);
let x = y.next().unwrap();
let _z = y;
assert_eq(0, *x);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assert_eq(0, *x);
assert_eq!(0, *x);

@fee1-dead
Copy link
Member Author

oops

@fee1-dead
Copy link
Member Author

@bors r-

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Sep 9, 2024
Copy link
Member

@WaffleLapkin WaffleLapkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the use of ManuallyDrop was correct here (ignoring the UAF issue obviously), however before rust-lang/unsafe-code-guidelines#245 is fixed, it makes sense to use ManuallyDrop instead.

Can you leave a FIXME to replace it back with ManuallyDrop, once it doesn't cause issues?

@antonilol
Copy link
Contributor

This iterator could be implemented using minimal to no unsafe code (I used none in this example) like this:

use core::num::NonZeroUsize;

#[derive(Debug, Clone)]
pub struct RepeatN<T>(Option<(T, NonZeroUsize)>);

impl<T> RepeatN<T> {
    pub fn new(value: T, count: usize) -> Self {
        Self(NonZeroUsize::new(count).map(|count| (value, count)))
    }

    // for use in last and maybe other methods
    fn take_element(&mut self) -> Option<T> {
        self.0.take().map(|(value, _count)| value)
    }
}

impl<T: Clone> Iterator for RepeatN<T> {
    type Item = T;

    fn next(&mut self) -> Option<Self::Item> {
        let (value, count) = self.0.as_mut()?;
        if let Some(new_count) = NonZeroUsize::new(count.get() - 1) {
            *count = new_count;
            Some(value.clone())
        } else {
            // this could be unwrap_unchecked
            Some(self.0.take().unwrap().0)
        }
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        let len = self.0.as_ref().map_or(0, |(_value, count)| count.get());
        (len, Some(len))
    }

    ...
}

@jwong101
Copy link
Contributor

jwong101 commented Sep 9, 2024

That also has the advantage of interacting with dropck a bit better(although you could also use may_dangle):

fn src() -> String {
    // fails today, but shouldn't if you use the option approach
    let x = "foo".to_owned();
    let mut y = std::iter::repeat_n(&*x, 1);
    // use y, but don't consume it entirely
    x
}

@antonilol
Copy link
Contributor

Does this also preserve niches from T? MaybeUninit for sure not.

#[stable(feature = "iter_repeat_n", since = "1.82.0")]
pub struct RepeatN<A> {
count: usize,
// Invariant: has been dropped iff count == 0.
element: ManuallyDrop<A>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, it's intentional that this was ManuallyDrop and not MaybeUninit, because there's no case in which this is constructed without a value, and thus we can preserve niches for it.

Does this actually need to change to MaybeUninit? Feels to me like the fix is to have Clone do a bitwise copy if count == 0 -- which will have the nice behaviour of eliding the branch for T: Copy.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, ManuallyDrop currently doesn't work here, see #130141 and #130145 (review).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does that mean that ManuallyDrop<Box<_>> is just never sound? Do we need #[lang = "manually_drop"] to remove that requirement?

Copy link
Member

@WaffleLapkin WaffleLapkin Sep 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does that mean that ManuallyDrop<Box<_>> is just never sound?

It's sound if you don't touch it in any way after running Box's drop I think (i.e. not even move). But generally yes.

To fix this we need to implement MaybeDangling (#118166) which would be a lang item, yes (and then wrap the only field of ManuallyDrop with MaybeDangling).

@scottmcm
Copy link
Member

scottmcm commented Sep 9, 2024

To @antonilol's point, the only reason this was using unsafe code was to preserve niches, and thus to allow the implementation for Copy types to always just read the payload without caring about the count.

If this is going to use MaybeUninit -- and thus not get noundef in LLVM and such -- then it should just use the safe implementation. The safe implementation is just as efficient as the MaybeUninit version.

@cuviper
Copy link
Member

cuviper commented Sep 9, 2024

Option<(T, NonZeroUsize)> would have slightly less niche space than (ManuallyDrop<T>, usize), because the current layout implementation only propagates the largest niche.

  • If T has no niche values, then the single niche in NonZeroUsize will be used by Option::None. This is the same as before with no niches available outside.
  • If T has one niche value, then None may use that or the NonZeroUsize again, but the other will not be available. So this is a lost niche compared to before.
  • If T has multiple niche values, then None will use one of those, and the rest will be available outside. The NonZeroUsize niche will not be available. This is one less niche compared to before.

@jwong101
Copy link
Contributor

jwong101 commented Sep 9, 2024

Tbh, it's probably best if the NonZero field is used as the niche for the option, as that allows just loading that field for stuff like size_hint.

use std::num::NonZero;

type T = Option<(String, NonZero<usize>)>;

#[no_mangle]
pub fn src(x: &T) -> usize {
    x.as_ref().map_or(0, |a| a.1.get())
}

type U = Option<(NonZero<usize>, String)>;

#[no_mangle]
pub fn dst(x: &U) -> usize {
    x.as_ref().map_or(0, |a| a.0.get())
}

#[no_mangle]
pub fn good(x: &Option<(NonZero<usize>, std::os::fd::OwnedFd)>) -> usize {
    x.as_ref().map_or(0, |a| a.0.get())
}

#[no_mangle]
pub fn bad(x: &Option<(std::os::fd::OwnedFd, NonZero<usize>)>) -> usize {
    x.as_ref().map_or(0, |a| a.1.get())
}
src:                                    # @src
# %bb.0:
	xor	eax, eax
	cmp	rax, qword ptr [rdi]
	jo	.LBB0_2
# %bb.1:
	mov	rax, qword ptr [rdi + 24]

.LBB0_2:
	ret
                                        # -- End function

dst:                                    # @dst
# %bb.0:
	xor	eax, eax
	cmp	rax, qword ptr [rdi + 8]
	jo	.LBB1_2
# %bb.1:
	mov	rax, qword ptr [rdi]

.LBB1_2:
	ret
                                        # -- End function

good:                                   # @good
# %bb.0:
	mov	rax, qword ptr [rdi]
	ret
                                        # -- End function

bad:                                    # @bad
# %bb.0:
	cmp	dword ptr [rdi], -1
	je	.LBB3_1
# %bb.2:
	mov	rax, qword ptr [rdi + 8]
	ret

.LBB3_1:
	xor	eax, eax
	ret

The loss of the other niches mainly hurts the size of it being wrapped in iterator adapters that would wrap it in an option like chain. Though I think the main regression would be losing noundef, and Rust potentially choosing a suboptimal niche that doesn't overlap with the NonZero<size>.

@antonilol
Copy link
Contributor

Tbh, it's probably best if the NonZero field is used as the niche for the option, as that allows just loading that field for stuff like size_hint.

If the compiler doesn't know the original size, that could result in this extra branch, but if it does know, it is smart enough to figure it out (first 3 examples, compiler explorer link)

https://rust.godbolt.org/z/W8hqncT81

(How does this even produce 10x more code for 'len_5' compared to 'len_4'? That feels very counterintuitive.)

Comment on lines +100 to +103
let element = mem::replace(&mut self.element, MaybeUninit::uninit());
// SAFETY: We just set count to zero so it won't be dropped again,
// and it used to be non-zero so it hasn't already been dropped.
unsafe { Some(ManuallyDrop::take(&mut self.element)) }
unsafe { Some(element.assume_init()) }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the same as returning Some(self.element.assume_init_read())?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. not too sure about this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is, but there could be a subtle difference between the two methods.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Writing uninit() into self.element will likely be elided in release mode, but explicitly writing uninit() into self.element would let sanitizers (e.g. Miri) explicitly catch if the field is later read as initialized, even if T: Copy.

Compare these two functions: https://rust.godbolt.org/z/cK48E76z3 At -Copt-level=1, they are almost the same (notably the mem::replace(&mut self.element, uninit()) does not actually write anything to self.element), and at -Copt-level=2 or above, the two functions are compiled to the same code.

@fee1-dead
Copy link
Member Author

I'd like a definite answer on what approach I should take (e.g. fixing the use-after-free first by not changing the field type and writing manual impls but keeping the miri reported UB due to using ManuallyDrop?)

@fee1-dead fee1-dead added S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Sep 17, 2024
@Noratrieb
Copy link
Member

@fee1-dead fee1-dead added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). labels Sep 17, 2024
@workingjubilee
Copy link
Member

It's fun to fuss over an unsoundness issue that is also an optimization problem, but it's better to actually ship the fix. This code seems adequate to quell the soundness issue, we can fuss over improvements later.

@bors r=lcnr,workingjubilee

@bors
Copy link
Contributor

bors commented Sep 17, 2024

📌 Commit 4c8b84a has been approved by lcnr,workingjubilee

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 17, 2024
@bors
Copy link
Contributor

bors commented Sep 17, 2024

⌛ Testing commit 4c8b84a with merge 2e367d9...

@bors
Copy link
Contributor

bors commented Sep 17, 2024

☀️ Test successful - checks-actions
Approved by: lcnr,workingjubilee
Pushing 2e367d9 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 17, 2024
@bors bors merged commit 2e367d9 into rust-lang:master Sep 17, 2024
7 checks passed
@rustbot rustbot added this to the 1.83.0 milestone Sep 17, 2024
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (2e367d9): comparison URL.

Overall result: ❌ regressions - no action needed

@rustbot label: -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
0.3% [0.3%, 0.3%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) - - 0

Max RSS (memory usage)

This benchmark run did not return any relevant results for this metric.

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 759.355s -> 759.624s (0.04%)
Artifact size: 341.37 MiB -> 341.31 MiB (-0.02%)

@cuviper
Copy link
Member

cuviper commented Sep 18, 2024

Since this is stabilizing in 1.82:

@rustbot label beta-nominated

@rustbot rustbot added the beta-nominated Nominated for backporting to the compiler in the beta channel. label Sep 18, 2024
@Amanieu Amanieu added the beta-accepted Accepted for backporting to the compiler in the beta channel. label Sep 18, 2024
@cuviper cuviper mentioned this pull request Sep 19, 2024
@cuviper cuviper modified the milestones: 1.83.0, 1.82.0 Sep 19, 2024
@cuviper cuviper removed the beta-nominated Nominated for backporting to the compiler in the beta channel. label Sep 19, 2024
bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 20, 2024
[beta] backports

- Don't warn empty branches unreachable for now rust-lang#129103
- Win: Add dbghelp to the list of import libraries rust-lang#130047
- `RepeatN`: use MaybeUninit rust-lang#130145
- Update LLVM to 19 327ca6c rust-lang#130212
- Revert rust-lang#129749 to fix segfault in LLVM rust-lang#130477
- Check params for unsafety in THIR rust-lang#130531

r? cuviper
bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 20, 2024
[beta] backports

- Don't warn empty branches unreachable for now rust-lang#129103
- Win: Add dbghelp to the list of import libraries rust-lang#130047
- `RepeatN`: use MaybeUninit rust-lang#130145
- Update LLVM to 19 327ca6c rust-lang#130212
- Revert rust-lang#129749 to fix segfault in LLVM rust-lang#130477
- Check params for unsafety in THIR rust-lang#130531

r? cuviper
try-job: dist-various-1
bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 21, 2024
[beta] backports

- Don't warn empty branches unreachable for now rust-lang#129103
- Win: Add dbghelp to the list of import libraries rust-lang#130047
- `RepeatN`: use MaybeUninit rust-lang#130145
- Update LLVM to 19 327ca6c rust-lang#130212
- Revert rust-lang#129749 to fix segfault in LLVM rust-lang#130477
- Check params for unsafety in THIR rust-lang#130531

r? cuviper
try-job: dist-various-1
bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 21, 2024
[beta] backports

- Don't warn empty branches unreachable for now rust-lang#129103
- Win: Add dbghelp to the list of import libraries rust-lang#130047
- `RepeatN`: use MaybeUninit rust-lang#130145
- Update LLVM to 19 327ca6c rust-lang#130212
- Revert rust-lang#129749 to fix segfault in LLVM rust-lang#130477
- Check params for unsafety in THIR rust-lang#130531

r? cuviper
try-job: dist-various-1
bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 21, 2024
[beta] backports

- Don't warn empty branches unreachable for now rust-lang#129103
- Win: Add dbghelp to the list of import libraries rust-lang#130047
- `RepeatN`: use MaybeUninit rust-lang#130145
- Update LLVM to 19 327ca6c rust-lang#130212
- Revert rust-lang#129749 to fix segfault in LLVM rust-lang#130477
- Check params for unsafety in THIR rust-lang#130531

r? cuviper
try-job: dist-various-1
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
beta-accepted Accepted for backporting to the compiler in the beta channel. merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

core::iter::repeat_n is unsound with Box<T> Use after free in core::iter::repeat_n