Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mem::swap behaves incorrectly in CTFE (and Miri) #94371

Closed
RalfJung opened this issue Feb 25, 2022 · 11 comments · Fixed by #94527
Closed

mem::swap behaves incorrectly in CTFE (and Miri) #94371

RalfJung opened this issue Feb 25, 2022 · 11 comments · Fixed by #94527

Comments

@RalfJung
Copy link
Member

Since #94212, mem::swap behaves incorrectly during CTFE (where it is unstably accessible) and in Miri. This is demonstrated by the following code failing to build:

#![feature(const_swap)]
#![feature(const_mut_refs)]

use std::borrow::Cow;

pub enum NamePadding {
    PadNone,
    PadOnRight,
}

pub enum TestName {
    StaticTestName(&'static str),
    DynTestName(String),
    AlignedTestName(Cow<'static, str>, NamePadding),
}

pub enum ShouldPanic {
    No,
    Yes,
    YesWithMessage(&'static str),
}

pub enum TestType {
    UnitTest,
    IntegrationTest,
    DocTest,
    Unknown,
}

pub struct TestDesc {
    pub name: TestName,
    pub ignore: bool,
    pub should_panic: ShouldPanic,
    pub compile_fail: bool,
    pub no_run: bool,
    pub test_type: TestType,
}

pub struct Bencher;

pub enum TestFn {
    StaticTestFn(fn()),
    StaticBenchFn(fn(&mut Bencher)),
    DynTestFn(Box<dyn FnOnce() + Send>),
    DynBenchFn(Box<dyn Fn(&mut Bencher) + Send>),
}

pub struct TestDescAndFn {
    pub desc: TestDesc,
    pub testfn: TestFn,
}

pub struct TestId(pub usize);

type T = (TestId, TestDescAndFn);

const C: (T, T) = {
    let mut x = (
        TestId(0),
        TestDescAndFn {
            desc: TestDesc {
                name: TestName::StaticTestName("name"),
                ignore: true,
                should_panic: ShouldPanic::Yes,
                compile_fail: false,
                no_run: false,
                test_type: TestType::UnitTest,
            },
            testfn: TestFn::StaticTestFn(|| {}),
        },
    );
    let mut y = (
        TestId(0),
        TestDescAndFn {
            desc: TestDesc {
                name: TestName::StaticTestName("name"),
                ignore: true,
                should_panic: ShouldPanic::Yes,
                compile_fail: false,
                no_run: false,
                test_type: TestType::UnitTest,
            },
            testfn: TestFn::StaticTestFn(|| {}),
        },
    );
    std::mem::swap(&mut x, &mut y);
    (x, y)
};

fn main() {}

The error is:

error[E0080]: it is undefined behavior to use this value
  --> swap2.rs:57:1
   |
57 | / const C: (T, T) = {
58 | |   let mut x = (
59 | |             TestId(0),
60 | |             TestDescAndFn {
...  |
87 | |   (x, y)
88 | | };
   | |__^ type validation failed at .1.1.desc.name.<enum-tag>: encountered uninitialized bytes, but expected initialized plain (non-pointer) bytes
   |
   = note: The rules on what exactly is undefined behavior aren't clear, so this check might be overzealous. Please open an issue on the rustc repository if you believe it should not be considered undefined behavior.
   = note: the raw bytes of the constant (size: 208, align: 8) {
               0x00 │ 00 00 00 00 00 00 00 00 00 __ __ __ __ __ __ __ │ .........░░░░░░░
               0x10 │ ╾───────alloc14───────╼ 04 00 00 00 00 00 00 00 │ ╾──────╼........
               0x20 │ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ │ ░░░░░░░░░░░░░░░░
               0x30 │ 01 00 00 00 00 00 00 00 __ __ __ __ __ __ __ __ │ ........░░░░░░░░
               0x40 │ __ __ __ __ __ __ __ __ 01 00 00 00 __ __ __ __ │ ░░░░░░░░....░░░░
               0x50 │ 00 00 00 00 00 00 00 00 ╾───────alloc18───────╼ │ ........╾──────╼
               0x60 │ __ __ __ __ __ __ __ __ 00 00 00 00 00 00 00 00 │ ░░░░░░░░........
               0x70 │ __ __ __ __ __ __ __ __ ╾───────alloc4────────╼ │ ░░░░░░░░╾──────╼
               0x80 │ 04 00 00 00 00 00 00 00 __ __ __ __ __ __ __ __ │ ........░░░░░░░░
               0x90 │ __ __ __ __ __ __ __ __ 01 00 00 00 00 00 00 00 │ ░░░░░░░░........
               0xa0 │ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ │ ░░░░░░░░░░░░░░░░
               0xb0 │ __ __ __ __ __ __ __ __ 00 00 00 00 00 00 00 00 │ ░░░░░░░░........
               0xc0 │ ╾───────alloc8────────╼ __ __ __ __ __ __ __ __ │ ╾──────╼░░░░░░░░
           }

The underlying reason for this is #69488: the new swap copies the data in chunks of MaybeUninit<usize>, but this is implemented incorrectly in CTFE and Miri in the sense that if any of the bytes in the source of such a value is uninit, then the entire value becomes uninit after the copy.

This reproduces only with very particular types that hit certain code paths in swap; my attempts at writing a smaller example from scratch failed so instead I extracted this from the test harness where the problem originally occurred in Miri.

Also, even a fix #69488 would not fully resolve the problem: if data contains pointers at unaligned positions, then the usize chunking might attempt to copy a part of a pointer, which CTFE and Miri do not support (that's #87184, and it is a lot harder to fix than #69488). This was in theory already a problem with the previous implementation of swap, but there the chunks were bigger so pointers were less likely to cross the chunk boundary. I don't know if this is ever a problem in practice though. Miri/CTFE will halt when such a partial pointer copy is attempted, so we should know if/when that happens.

@RalfJung
Copy link
Member Author

Cc @rust-lang/wg-const-eval

@scottmcm
Copy link
Member

scottmcm commented Feb 25, 2022

This reproduces only with very particular types that hit certain code paths in swap

Playground miri is on 2-12, so I don't have an easy way to check, but something like this should hit it:

#[repr(C)]
struct Demo(u64, u32, u64, u32, u64, u32, u64);

fn main() {
    let mut x = Demo(1, 2, 3, 4, 5, 6, 7);
    let mut y = Demo(10, 11, 12, 13, 14, 15, 16);
    std::mem::swap(&mut x, &mut y);
}

To get to the code in question, the type needs:

  • size > 4*align
  • align <= sizeof(usize)
  • size is either not a power of two or size > 2*sizeof(usize)
  • align >= alignof(usize)
    • if it doesn't meet this last check it'll get copied byte-by-byte, which I assume would still be a problem for pointers, but not for the undef values check

Basically, it needs to be a type that's big enough to bother, but isn't something special (like an AVX2 vector).


Aside: That error output is beautiful!

@oli-obk
Copy link
Contributor

oli-obk commented Feb 25, 2022

🙈 lots of fun things that can go wrong here. These are not easy to fix, but maybe we can buy us some time by just using const_eval_select to always take the copy_nonoverlapping branch?

@RalfJung
Copy link
Member Author

You mean swap_simple? Yeah if we use that with the original T, we should actually always be good, even with pointers at strange positions. We want this for Miri and CTFE though, so... const_eval_select and cfg(miri) will be needed to do that.

Aside: That error output is beautiful!

Thanks, we spent a lot of work on it. :)

@RalfJung

This comment was marked as resolved.

@RalfJung
Copy link
Member Author

RalfJung commented Feb 25, 2022

Ah, got it:

#![feature(const_swap)]
#![feature(const_mut_refs)]

#[repr(C)]
struct Demo(u64, bool, u64, u32, u64, u64, u64);

const C: (Demo, Demo) = {
    let mut x = Demo(1, true, 3, 4, 5, 6, 7);
    let mut y = Demo(10, false, 12, 13, 14, 15, 16);
    std::mem::swap(&mut x, &mut y);
    (x, y)
};

Looks like my mistake was looking at x after the swap, when it is only y that becomes bugged.

@scottmcm
Copy link
Member

Looks like my mistake was looking at x after the swap, when it is only y that becomes bugged.

Interesting that it's asymmetric. The only asymmetry that exists in the implementation (that I can remember, at least) is

let z = ptr::read(x);
ptr::copy_nonoverlapping(y, x, 1);
ptr::write(y, z);

So does the fact that only one of them errored mean that this might be "fixed" (probably not for the unaligned pointer case) if that was changed from read+copy_nonoverlapping+write to 3×copy_nonoverlapping?

(Though it's probably still better to not use the over-complicated code in CTFE/Miri anyway.)

@RalfJung
Copy link
Member Author

So does the fact that only one of them errored mean that this might be "fixed" (probably not for the unaligned pointer case) if that was changed from read+copy_nonoverlapping+write to 3×copy_nonoverlapping?

Yes. copy_nonoverlaping is an intrinsic and does direct copies of the underlying bytes, without regarding for the type.

@RalfJung
Copy link
Member Author

#94411 indeed fixes the testcase (but the PR still needs more work)

Dylan-DPC added a commit to Dylan-DPC/rust that referenced this issue Feb 27, 2022
…oli-obk

For MIRI, cfg out the swap vectorization logic from 94212

Because of rust-lang#69488 the swap logic from rust-lang#94212 doesn't currently work in MIRI.

Copying in smaller pieces is probably much worse for its performance anyway, so it'd probably rather just use the simple path regardless.

Part of rust-lang#94371, though another PR will be needed for the CTFE aspect.

r? `@oli-obk`
cc `@RalfJung`
bors added a commit to rust-lang-ci/rust that referenced this issue Feb 27, 2022
…i-obk

For MIRI, cfg out the swap vectorization logic from 94212

Because of rust-lang#69488 the swap logic from rust-lang#94212 doesn't currently work in MIRI.

Copying in smaller pieces is probably much worse for its performance anyway, so it'd probably rather just use the simple path regardless.

Part of rust-lang#94371, though another PR will be needed for the CTFE aspect.

r? `@oli-obk`
cc `@RalfJung`
@bors bors closed this as completed in f262ca1 Apr 5, 2022
@RalfJung
Copy link
Member Author

RalfJung commented Apr 5, 2022

With #94527 landed, we might want to revert #94412?
@scottmcm wrote

Copying in smaller pieces is probably much worse for its performance anyway, so it'd probably rather just use the simple path regardless.

but OTOH, it would be valuable to test the actual code path that is used in real code, so without conclusive benchmarks I would default to removing the cfg(miri).

@RalfJung
Copy link
Member Author

RalfJung commented Apr 5, 2022

Oh never mind, Miri would still break on copying data with pointers at 'odd' offsets inside it due to #87184.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants