Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UB + couple of mistakes in the string post #1

Closed
GoldsteinE opened this issue Sep 11, 2023 · 4 comments
Closed

UB + couple of mistakes in the string post #1

GoldsteinE opened this issue Sep 11, 2023 · 4 comments

Comments

@GoldsteinE
Copy link

GoldsteinE commented Sep 11, 2023

Hi! I enjoyed reading your latest post about yarn, so I decided to play around with it a little. There’re a couple problems with code as present in the post (I’m not sure about the code in the actual library).

The main problem is that this function:

pub fn owned(data: Box<str>) -> Self {
    let len = data.len();
    let ptr = data.as_ptr().cast_mut();
    mem::forget(data);

    Self {
      raw: RawYarn::from_raw_parts(HEAP, len, ptr),
      _ph: PhantomData,
    }
}

contains UB. Calling mem::forget(data); asserts unique ownership of data and invalidates ptr. That’s a common pitfall when using mem::forget(). The right way to do it is this:

pub fn owned(data: Box<str>) -> Self {
    let data = ManuallyDrop::new(data);
    let len = data.len();
    let ptr = data.as_ptr().cast_mut();

    Self {
        raw: RawYarn::from_raw_parts(HEAP, len, ptr),
        _ph: PhantomData,
    }
}

I’ve also noticed (and fixed) a couple of typos and minor mistakes that prevented the code from compiling. Here’s the playground, with all the relevant changes marked with (!) for easy search: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=bb572789e965be9ce194889684bf2afa. You can run it under Miri to check that UB is present in the first version and absent in the second.

I’d submit a PR, but I’m not sure where the sources for the posts are.

@mcy
Copy link
Owner

mcy commented Sep 29, 2023

Thanks for providing those typo fixes. I'll apply them to the source for the article.

Miri thinks this 0 is "UB" for stupid reasons that boil down to materializing a &Box<str> as a subexpression. If you rewrite the code as follows Miri does not complain.

  pub fn owned(mut data: Box<str>) -> Self {
    let len = data.len();
    let ptr = data.as_mut_ptr();
    mem::forget(data);

    Self {
      raw: RawYarn::from_raw_parts(HEAP, len, ptr),
      _ph: PhantomData,
    }
  }

Incidentally, this is a problem unique to Box and the compiler's special knowledge of it (primarily a historical accident). Arguably, Miri is incorrect for flagging this, since it works fine for Vec:

fn main() {
    // Works fine (but triggers miri's leaksan, which is not UB).
    let mut data = vec![1];
    let ptr = data.as_ptr().cast_mut();
    std::mem::forget(data);
    unsafe { &mut *ptr; }
    
    // Spurious tagging error.
    let mut data = vec![1].into_boxed_slice();
    let ptr = data.as_ptr().cast_mut();
    std::mem::forget(data);
    unsafe { &mut *ptr; }   // Tagging error occurs here. 
}

This is the sort of weird gotcha that shouldn't be UB, particularly since it's not justified by corresponding potential compiler optimizations.

@mcy mcy closed this as completed Sep 29, 2023
@Kolsky
Copy link

Kolsky commented Sep 29, 2023

Thanks for providing those typo fixes. I'll apply them to the source for the article.

Miri thinks this 0 is "UB" for stupid reasons that boil down to materializing a &Box<str> as a subexpression. If you rewrite the code as follows Miri does not complain.

  pub fn owned(mut data: Box<str>) -> Self {
    let len = data.len();
    let ptr = data.as_mut_ptr();
    mem::forget(data);

    Self {
      raw: RawYarn::from_raw_parts(HEAP, len, ptr),
      _ph: PhantomData,
    }
  }

Incidentally, this is a problem unique to Box and the compiler's special knowledge of it (primarily a historical accident). Arguably, Miri is incorrect for flagging this, since it works fine for Vec:

fn main() {
    // Works fine (but triggers miri's leaksan, which is not UB).
    let mut data = vec![1];
    let ptr = data.as_ptr().cast_mut();
    std::mem::forget(data);
    unsafe { &mut *ptr; }
    
    // Spurious tagging error.
    let mut data = vec![1].into_boxed_slice();
    let ptr = data.as_ptr().cast_mut();
    std::mem::forget(data);
    unsafe { &mut *ptr; }   // Tagging error occurs here. 
}

This is the sort of weird gotcha that shouldn't be UB, particularly since it's not justified by corresponding potential compiler optimizations.

Stop lying please.
https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=8f96ca9dd2d1df0bb4da73a55d355cab
Miri fails as soon as you try using this pointer. Of course nothing happens if you simply discard it.
The model doesn't work as you described. Box does have noalias attribute which means it behaves approximately like &mut T. Moving the unique reference invalidates all nested/child reborrows of said reference. As far as optimizations are concerned, you don't have data at hand to be sure if this is justified decision or not. As for Vec, the code that you wrote which supposedly works isn't technically greenlit by rust-lang devs – there were even discussions whether changing internals to a boxed slice should be done – but in the end too many crates intuitively relied on permitted aliasing implicitly and the decision was postponed. This was never the case with the Box.

See also: rust-lang/unsafe-code-guidelines#326

@GoldsteinE
Copy link
Author

While I disagree with the tone of the previous message, it’s correct in its factual part: move of the Box<_> does invalidate pointers, and that’s not a bug in Miri.

@mcy
Copy link
Owner

mcy commented Sep 30, 2023

What the hell are you talking about? Please take this toxic line of discussion elsewhere instead of trying to mansplain how LLVM works to me.

Repository owner locked as too heated and limited conversation to collaborators Sep 30, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants