Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add panic-safe slicing methods #1325

Closed
wants to merge 1 commit into from

Conversation

barosl
Copy link
Contributor

@barosl barosl commented Oct 16, 2015

@barosl barosl force-pushed the panic-safe-slicing branch 4 times, most recently from 07936d3 to 5b0e31a Compare October 16, 2015 09:59
@alexcrichton alexcrichton added the T-libs-api Relevant to the library API team, which will review and decide on the RFC. label Oct 16, 2015
@mdinger
Copy link
Contributor

mdinger commented Oct 17, 2015

Is there any reason the originals need panics? Isn't it reasonable to just return an empty slice? Other languages do that. Also, failing when the edges aren't perfect seems silly. I know panic!() or None keeps edge alignment accurate but that last and first statements still seems perfectly valid.

&[..90]  // [1, 2, 3, 4, 5]
&[90..1] // []
&[90..104] // []

@SimonSapin
Copy link
Contributor

Could the existing RangeArgument trait be used instead of Rangeable?

@Gankra
Copy link
Contributor

Gankra commented Oct 19, 2015

@mdinger triggering this behaviour hints at a logic error. Do we have any other cases where the standard library "squashes" inputs in a best-effort way? (nothing off the top of my head)

@mdinger
Copy link
Contributor

mdinger commented Oct 19, 2015

I suppose but it also isn't a strong guarantee at all. These 2 are just as incorrect but only one of each errors:

// Both just as wrong when you want the entire slice but go one too low
&[1..]
&[-1..]
// Both just as wrong when you want up to the last element
&[..len-1]
&[..len+1]

From collections, this isn't an error but returns None:

use std::collections::BTreeMap;
let mut map = BTreeMap::new();
map.insert("foo", 89);

println!("{:?}", map.get("charles"));

So, likewise it'd make sense for a disjoint slice to return a None as well but slice already has a term for that. It's called [] (similarly, (slightly offtopic) it seems indexing off the end should return option: x[inf] return None but x[0] return Some(number) but that'd probably get really painful to work with so...).

I just can't see how panic! could be considered as a valid result for these operations. They're really really easy to recover from.

@mdinger
Copy link
Contributor

mdinger commented Oct 19, 2015

If the slicing operator is supposed to preserve the length of the input in the output, that is &[0..3] -> &[7, 8, 9] but never &[0..3] -> &[7] then the current behaviour would make sense but that is definitely less flexible and weird. I'd almost expect an extra keyword to enforce this rigidity: for example &[0..3].strict().

@seanmonstar
Copy link
Contributor

but never &[0..3] -> &[7] then the current behaviour would make sense but that is definitely less flexible and weird

To add to the subjectivity of what is weird or isn't: I would found it surprising if assert_eq!(&foo[..3].len(), 3) was not true.

@mdinger
Copy link
Contributor

mdinger commented Oct 19, 2015

Okay, then if preserving the size is part of correct slice behaviour then the current behaviour should be fine (I don't have any exact use case anyway) but if the reason for the panics is to strictly to prevent off the end memory access, that seems silly; they should return whatever the valid intersection happens to be.

It's too bad normal slicing can't return option (too unergonomic I guess). This RFC seems like a good idea.

@Gankra
Copy link
Contributor

Gankra commented Oct 20, 2015

This is kind've a fundamental philosophical difference. Some have argued that indexing should behave optimistically and do "modulo len", but I think that's definitely step too far (also incoherent on empty slices). Being strict on all indexing seems consistent to me.

Of course we can't magically detect semantic errors like len-1, but at very least we can catch the instances of "obviously something wrong" and report them to be helpful.

E: within limits. A check must be made, but making extra checks to be "extra helpful" is a harder pill to swallow.

@mdinger
Copy link
Contributor

mdinger commented Oct 20, 2015

Okay. I didn't know it had previously been discussed and it looked kinda like something which had just slipped through unnoticed. Strictness does fit the Rust theme so it isn't too unusual I guess.

@glaebhoerl
Copy link
Contributor

I dunno. For normal indexing where you need to return an &T, there's no reliable way to deal with an out of bounds index, notably on an empty slice, except to panic - you can't just conjure an &T out of thin air. For slicing, returning an &[T], by contrast, there very much is. Whereas with indexing there's nothing else reasonable you could do, here it's just a matter of use cases and expectations: do you expect it to always return a slice of the exact length you requested, and otherwise panic, or do you expect it to take the intersection of the requested slice and the actually-existing one? Both of these seem like perfectly reasonable expectations to me. I think the best way to resolve the question would be to collect data on which behavior is more commonly desired in code-in-the-wild.

@Gankra
Copy link
Contributor

Gankra commented Oct 20, 2015

For all my code, the indices are derived from properties of the to-be-sliced value, so if they're ever messed up it means my code is wrong, and I'd like to know ASAP.

@mdinger
Copy link
Contributor

mdinger commented Oct 20, 2015

Well, indexing returning option would easily handle the out of bounds case (unless it was [] without an inner type but that doesn't seem to be allowed). I don't think it would be easy to measure how desired each style is because it's heavily biased towards the currently implemented style. Also, the time when you'd very much want one over the other may be quite a niche case.

assert_eq!(a.get_range(..3), Some(&a[..3]));
assert_eq!(a.get_range(2..5), Some(&a[2..5]));
assert_eq!(a.get_range(..6), None);
assert_eq!(a.get_range(4..2), None);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative would be:

assert_eq!(a.get_range(1..), &a[1..]);
assert_eq!(a.get_range(..3), &a[..3]);
assert_eq!(a.get_range(2..5), &a[2..5]);
assert_eq!(a.get_range(..6), &[]);
assert_eq!(a.get_range(4..2), &[]);

@SimonSapin
Copy link
Contributor

For what it’s worth, out of bounds slices are clamped when indexing in Python:

>>> [1, 2, 3][:2]
[1, 2]
>>> [1, 2, 3][:10]
[1, 2, 3]

@seanmonstar
Copy link
Contributor

Other data point, from Java, which in some ways its strictness is similar
Rust's:
http://docs.oracle.com/javase/7/docs/api/java/util/Arrays.html#copyOfRange(T[],%20int,%20int)

If start or end are out of bounds, IndexOutOfBoundsException is thrown.

On Wed, Oct 21, 2015, 1:12 AM Simon Sapin notifications@github.com wrote:

For what it’s worth, out of bounds slices are clamped when indexing in
Python:

[1, 2, 3][:2]
[1, 2]>>> [1, 2, 3][:10]
[1, 2, 3]


Reply to this email directly or view it on GitHub
#1325 (comment).

@bluss
Copy link
Member

bluss commented Oct 22, 2015

Fuzzy slice boundaries are even more strange if you apply the thought experiment to &str (char boundaries need to be respected).

I totally support the idea behind the RFC, nonpanicking slicing would be great. Implementation though, I think a trait named for Index would be better.

@ticki
Copy link
Contributor

ticki commented Oct 29, 2015

👍 We had to implement this in the Redox kernel to avoid kernel panics. We see no reason for this method not existing in libcore.

@ghost
Copy link

ghost commented Dec 2, 2015

I also had to deal with this when writing kernel code. 👍

@Kimundi
Copy link
Member

Kimundi commented Jan 7, 2016

Could't we just make get() as generic as the indexing operators, to allow both .get(n) and .get(x..y)?

@nagisa
Copy link
Member

nagisa commented Jan 7, 2016

@Kimundi breaking change, I think?

@Kimundi
Copy link
Member

Kimundi commented Jan 7, 2016

@nagisa: Not really, if the generic method still accepts the current usize as one possible generic type then its backwards compatible in the same way as a number of changes that already landed.

@Gankra
Copy link
Contributor

Gankra commented Jan 12, 2016

Doing as @Kimundi proposes would of course necessitate making get/get_mut into traits (or: adding adhoc overloading).

There are two problems with this:

  • doc regression because traits suck in docs
  • need to add them to prelude to avoid breakage (OR: add inherent trait impls)

I definitely think it'd be reasonable to have a solution here, I'm just wondering if we're lacking the tooling today to make the solution we'd want longterm, at the moment.

@SimonSapin
Copy link
Contributor

Instead of having get/get_mut be trait methods themselves, they could be inherent methods generic over the type of their parameter (with a trait bound on that parameter) like Vec::drain.

@Kimundi
Copy link
Member

Kimundi commented Jan 17, 2016

Yeah, just making them generic methods is what I had in mind.

@briansmith
Copy link

@Ms2ger
Copy link
Contributor

Ms2ger commented Mar 7, 2016

Seems like a useful addition; reusing get() is nicely symmetric with the slicing syntax.

@alexcrichton
Copy link
Member

@barosl would you be interested in updating this RFC to recommend reusing the existing get method? The libs team discussed this RFC during triage and that seemed like an appealing alternative for us.

@alexcrichton
Copy link
Member

The libs team discussed this RFC recently and the conclusion was that we'd like to move forward, but we've been unable to contact the author. @sfackler has opened an updated version of this RFC over at #1679 which we believe represents the consensus here, so I'm going to close in favor of that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-libs-api Relevant to the library API team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.