Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MIR-borrowck: locals (and thread locals) can remain borrowed after the function ends #45704

Closed
arielb1 opened this issue Nov 1, 2017 · 11 comments
Labels
A-borrow-checker Area: The borrow checker E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness
Milestone

Comments

@arielb1
Copy link
Contributor

arielb1 commented Nov 1, 2017

e.g. this does not give a MIR borrowck error:

#![feature(thread_local)]

#[thread_local]
static FOO: u8 = 3;

fn assert_static(_t: &'static u8) {}
fn main() {
     assert_static(&FOO);
}

This is a problem because, as #17954 shows, thread locals end when the current thread ends. AST borrowck currently makes it an error when thread locals are borrowed for more than the current function.

@arielb1 arielb1 added A-borrow-checker Area: The borrow checker I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness WG-compiler-nll labels Nov 1, 2017
@arielb1 arielb1 changed the title MIR-borrowck: thread locals are allowed to be borrowed for 'static MIR-borrowck: thread locals are unsoundly allowed to be borrowed for 'static Nov 1, 2017
@arielb1
Copy link
Contributor Author

arielb1 commented Nov 1, 2017

The same is true for function arguments (see regions-addr-of-arg), and for locals in the case of a panic where no destructors run:

fn cplusplus_mode(x: isize) -> &'static isize { &x }

fn cplusplus_mode_exceptionally_unsafe(x: &mut Option<&'static mut isize>) {
    let z = 0;
    *x = Some(&mut z);
    panic!("catch me for a dangling pointer!")
}

I'm having all 3 in a single issue because the fix (preventing outstanding borrows to locals at function exit) should catch all 3.

@arielb1 arielb1 changed the title MIR-borrowck: thread locals are unsoundly allowed to be borrowed for 'static MIR-borrowck: locals (and thread locals) can remain borrowed after the function ends Nov 1, 2017
@nikomatsakis
Copy link
Contributor

@arielb1 hmm, what exact fix do you have in mind?

With respect to local variables, I think the problem there is that we don't have a proper storage-dead on the unwinding path, no? That is, would it error if you removed the panic!? My expectation was that it would report an error when we hit the StorageDead (either on unwind or not).

With respect to thread-locals, we don't have a StorageDead, so I guess we need to have a kind of custom check for "when you exit the function, make sure there are no outstanding borrows that have a shallow prefix to a thread-local static".

I could imagine trying not to add storage-dead onto the unwind path, but it seems like that might go wrong, since you might have nested scopes that could cause trouble? (i.e., inner variables getting freed before outer ones?)

@arielb1
Copy link
Contributor Author

arielb1 commented Nov 13, 2017

@nikomatsakis

We know that all locals have their storage killed at the end of the function, so we don't need to represent that explicitly in MIR. With lexical lifetimes, I suppose you could observe the ordering of endregion/storagedead, but with NLL there wouldn't be a difference.

Unless we are planning to deploy MIR borrowck without NLL, that shouldn't be a problem.

@nikomatsakis
Copy link
Contributor

@arielb1 so, to be clear, is your expectation that we will not add StorageDead onto the unwind path at all, but instead we will rely on some kind of "mass dead" check at function exit?

I'm trying to put a finger on what I think that might permit that we might not want to. Given that the checks on StorageDead are a subset of those that occur on DROP, it seems like very little, unless we are talking about structures without destructors -- but those are permitted to have cycles and data that outlives their destruction anyway. So it may be nothing.

@arielb1
Copy link
Contributor Author

arielb1 commented Nov 13, 2017

@nikomatsakis

I think it will allow you to have "NLL-style" cycles if your function doesn't have a destructor. aka:

use std::cell::Cell;

// no destructor
struct S<'a>(Cell<Option<&'a S<'a>>>);
fn main() {
    #[cfg(break)] let _d = Box::new(0);
    let x = S(Cell::new(None));
    let y = S(Cell::new(None));
    x.0.set(Some(&y));
    y.0.set(Some(&x));
    panic!()
}

This currently creates a storagedead -> endregion -> storagedead sequence, which causes a "borrow does not live long enough" error, while either with NLL, or if we have a "mass storagedead at function exit", it should compile. However, with the "None terminator", if we add a destructor in scope, the code will start issuing a borrowck error and therefore be inconsistent.

I think one good way to deal with it consistently, if we want a stable MIR borrowck without NLL, would be to introduce a bit of CFG-based analysis to MIR borrowck: if we find a sequence of storagedead and endregion instructions (possibly connected by gotos across basic blocks) end all regions first and then kill all the storage.

@nikomatsakis
Copy link
Contributor

if we want a stable MIR borrowck without NLL

I am more and more thinking we will not wind up wanting this -- the timing just doesn't seem to be working out that way.

@nikomatsakis
Copy link
Contributor

nikomatsakis commented Nov 13, 2017

Mentoring instructions

Well, whatever we do, I think we all agree that it makes sense as a minimal starting point to have some code in borrow-check such that when you return (or unwind) from a function, we report an error if there are any outstanding loans. In other words, any of the "returning terminators" (listed below) are treated as if they kill the storage for all locals.

The "returning terminators" are those that have no successor, with the exception of Unreachable:

From the POV of borrow-check, we want to treat those basically the same as if they had done a StorageDead for every local, I believe. StorageDead is the MIR statement that frees the stack space for a local variable. Borrow check currently handles a StorageDead like so:

StatementKind::StorageDead(local) => {
self.access_lvalue(ContextKind::StorageDead.new(location),
(&Lvalue::Local(local), span),
(Shallow(None), Write(WriteKind::StorageDead)),
flow_state);
}

As you can see, it invokes access_lvalue with the context of a 'shallow write'. The "write" means that it invalidates all the data in the variable -- the "shallow" means that it only invalidates the data stored contiguously in memory, and not the stuff that is reached through a pointer indirection (notably: not borrowed data referenced by the variable).

When the borrow check encounters one of our returning terminators, it currently does nothing:

TerminatorKind::Goto { target: _ } |
TerminatorKind::Resume |
TerminatorKind::Return |
TerminatorKind::GeneratorDrop |
TerminatorKind::Unreachable |
TerminatorKind::FalseEdges { .. } => {
// no data used, thus irrelevant to borrowck
}

I think we want to separate out the returning terminators from that match, and then have them iterate over through all the locals in the MIR. The local variables in the MIR are stored in the local_decls field. You can iterate through them with something like this:

for local in self.mir.local_decls.indices() {
    // insert call to access_lvalue() we saw before in StorageDead here
}

Once that's done, you'll have to edit the tests.

Also, we'll want to handle thread-locals, but we'll come to that later. (Note to future self: thread-locals will need 'deep' style access checks.)

@nikomatsakis nikomatsakis added the E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. label Nov 13, 2017
@arielb1
Copy link
Contributor Author

arielb1 commented Nov 13, 2017

@nikomatsakis

You also want to end the borrows for all the local (aka ReScope) regions when performing the check, otherwise you would get errors in code as obvious as

let x = 2;
let y = &x;
panic!()

@nikomatsakis
Copy link
Contributor

Hmm, I left out something important from those instructions. We also want to modify the bit of code that figures out which borrows are in effect. We need to consider the "return" to cancel all the borrows whose duration are local to the function (an ReScope region). This would be modifying the data flow.

But I am now wondering if maybe it's better to iterate over the live borrows and examine them, rather than iterating over all the variables and trying to check if accessing them would be legal.

@nikomatsakis
Copy link
Contributor

Well, be forewarned, this may be one of those things where we try a few different stabs at it before we are happy. =)

@arielb1 arielb1 added this to the NLL prototype milestone Nov 15, 2017
@KiChjang
Copy link
Member

I'm taking a look at this.

bors added a commit that referenced this issue Nov 26, 2017
Kill the storage for all locals on returning terminators

Fixes #45704.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-borrow-checker Area: The borrow checker E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness
Projects
None yet
Development

No branches or pull requests

3 participants