repr(tag = ...) for type aliases #3659

Open
wants to merge 9 commits into
base: master
37 changes: 24 additions & 13 deletions text/0000-repr-type-aliases.md
@@ -6,7 +6,7 @@
# Summary
[summary]: #summary

Primitive representations on enums now accept type aliases, meaning that in addition to primitives like `#[repr(u32)]`, `#[repr(core::ffi::c_int)]` and `#[repr(self::my_type)]` are now accepted.
Primitive representations on enums now accept type aliases, meaning that in addition to primitives like `#[repr(u32)]`, `#[repr(type = core::ffi::c_int)]` and `#[repr(type = my_type)]` are now accepted.

# Motivation
[motivation]: #motivation
@@ -20,25 +20,25 @@ For the same reasons why type aliases are useful, having type aliases in `repr`
# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

Enums allow `#[repr(type)]` attributes to offer an explicit discriminant type. (`type` can be any primitive integer type, like `u8`, `i32`, or `usize`, but not `char`.) If all variants of the enum are unit variants, this means that the enum will be easily castable to `type` using `as`. Otherwise, the discriminant will still be of the specified type, but unsafe code is required to actually access it.
Enums allow `#[repr(type = ...)]` attributes to offer an explicit discriminant type. (`...` can be any primitive integer type, like `u8`, `i32`, or `usize`, but not `char`.) If all variants of the enum are unit variants, this means that the enum will be easily castable to that type using `as`. Otherwise, the discriminant will still be of the specified type, but unsafe code is required to actually access it.
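
As a rough illustration of the two cases described above, here is a minimal sketch in today's stable Rust. It uses the existing `#[repr(u8)]` form, since `#[repr(type = ...)]` is only proposed here. The enum names `Status` and `Message` and the `discriminant` helper are hypothetical and chosen purely for the example.

```rust
/// Unit-only enum with an explicit discriminant type: castable with `as`.
#[allow(dead_code)]
#[repr(u8)]
enum Status {
    Ok = 0,
    Warn = 1,
    Fail = 7,
}

/// Enum with a field-carrying variant: the discriminant is still a `u8`,
/// but reading it requires unsafe code.
#[allow(dead_code)]
#[repr(u8)]
enum Message {
    Ping,         // discriminant 0
    Text(String), // discriminant 1
}

impl Message {
    /// Hypothetical helper that reads the tag back out.
    fn discriminant(&self) -> u8 {
        // SAFETY: a `#[repr(u8)]` enum is laid out with its `u8` tag as the
        // first field, so reading one `u8` through a pointer to the enum is
        // valid. This is the pattern documented for `std::mem::discriminant`.
        unsafe { *(self as *const Self as *const u8) }
    }
}

fn main() {
    // Unit-only enum: a plain cast is enough.
    println!("Status::Fail as u8 = {}", Status::Fail as u8);

    // Enum with fields: go through the unsafe helper instead.
    let msg = Message::Text(String::from("hello"));
    println!("Message::Text tag = {}", msg.discriminant());
}
```

Under this RFC, the same enums could presumably be declared with an alias in place of `u8`; the casting and unsafe-access rules would stay as they are.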

In addition to the primitive types themselves, you can also use the path to a type alias in the `repr` attribute instead, and it will resolve the primitive type of the type alias. However, to ensure compatibility as new potential representations are added, the path to the alias must contain a double-colon: you can access an alias `Alias` defined in the same module by using `self::Alias`.

For example, `#[repr(core::ffi::c_int)]` is valid because it contains a double-colon, but a `use core::ffi::c_int` followed by `#[repr(c_int)]` is not. If you wanted to `use core::ffi::c_int` first, then you could still do `#[repr(self::c_int)]` to reference the type.
To ensure compatibility, the `#[repr(type = ...)]` form is required if the type is not one of the known primitive types. Note that this form is not necessarily equivalent to using the primitive representations directly, since shadowing is possible; for example, if you did `type u32 = u8` and then `#[repr(type = u32)]`, this would be equivalent to `#[repr(u8)]`, not `#[repr(u32)]`.
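
The shadowing that this paragraph warns about already exists in today's Rust, because primitive type names are not keywords. The sketch below uses hypothetical module names `shadowed` and `plain`, and shows the name `u32` resolving to a one-byte alias in one scope and to the primitive in another. Under this proposal, `#[repr(type = u32)]` inside `shadowed` would presumably follow the alias, while `#[repr(u32)]` would not.

```rust
mod shadowed {
    // Hypothetical alias that reuses a primitive's name. Module-level items
    // take precedence over the built-in primitive types during resolution.
    #[allow(non_camel_case_types)]
    pub type u32 = u8;

    /// Size in bytes of whatever `u32` names in this module.
    pub fn width_of_u32() -> usize {
        core::mem::size_of::<u32>()
    }
}

mod plain {
    /// Size in bytes of the real primitive `u32`.
    pub fn width_of_u32() -> usize {
        core::mem::size_of::<u32>()
    }
}

fn main() {
    println!("shadowed u32: {} byte(s)", shadowed::width_of_u32());
    println!("plain u32:    {} byte(s)", plain::width_of_u32());
}
```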
Member

I think `type u32 = u8` seems needlessly obfuscating; I think it'd be a more readable example to write `type C = u8` and then `#[repr(type = C)]`, which is equivalent to `#[repr(u8)]` rather than `#[repr(C)]`.

Contributor Author

@clarfonthey clarfonthey Jun 19, 2024

I guess that I was intentionally pointing out the obfuscation here because it feels more likely: `#[repr(type = C)]` is obviously going to mean whatever `C` type you have, but `#[repr(type = u32)]` meaning `#[repr(u8)]` is more likely to occur in something like proc macros if someone is doing something nefarious. So, genuinely, there is a preference to do `#[repr(u32)]` over `#[repr(type = u32)]` when you don't necessarily trust the parent scope.

Contributor Author

I've since updated this section a bit to elaborate a bit better, including what a type alias C might look like. Does this feel like it addresses your concerns?

You can use any type alias in the `repr` attribute, but it *must* be an alias to an accepted primitive type like `u8` or `i32`, and cannot be a pointer, reference, struct, etc.

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

The `repr` attribute now accepts arguments containing double-colon tokens, which will be parsed as paths to type aliases to resolve. If those type aliases resolve to a valid primitive type which can be used in the `repr` attribute, that will be used as the actual discriminant representation.
The `repr` attribute now accepts a `type = ...` argument to indicate a resolved path instead of a well-known primitive type. If the path resolves to a type alias for a valid primitive type that can be used in the `repr` attribute, that type will be used as the actual discriminant representation.

An additional, automatically-applicable lint should be added that warns a user if they use `type = ...` for a well-known primitive type, since adding `type = ` instead of using the type directly introduces the possibility of shadowing. (For example, `#[repr(type = u32)]` becomes `#[repr(u32)]`.)

An additional, automatically-applicable lint should be added if a user references a valid type alias in the current scope without including multiple components in the path, recommending to add `self::` to the beginning to ensure forward-compatibility.
Similarly, an automatically-applicable lint should be added that warns a user if a `repr` argument references an in-scope type alias without the `type = ` prefix. (For example, `#[repr(MyType)]` becomes `#[repr(type = MyType)]`.)
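
To make the resolution rule and the two lints concrete, here is a toy model in Rust. It is only a sketch and not how the compiler would implement this. Every name in it (`ReprArg`, `Lint`, `resolve`, `lint`) is hypothetical, scopes are modelled as a flat map from alias name to target, aliases are followed one level only, and it assumes that existing non-integer representations such as `C` are never linted.

```rust
use std::collections::HashMap;

/// The two argument forms discussed above (hypothetical names).
#[derive(Debug, PartialEq)]
enum ReprArg {
    /// `#[repr(u32)]`: a bare name, never looked up in scope.
    Bare(String),
    /// `#[repr(type = path)]`: a path resolved in the enclosing scope.
    Type(String),
}

/// The two automatically-applicable lints, each carrying its suggested fix.
#[derive(Debug, PartialEq)]
enum Lint {
    RedundantTypePrefix { suggestion: String },
    MissingTypePrefix { suggestion: String },
}

const PRIMITIVES: &[&str] = &[
    "u8", "u16", "u32", "u64", "u128", "usize",
    "i8", "i16", "i32", "i64", "i128", "isize",
];
// Assumption: other existing representations are left alone by the lints.
const OTHER_REPRS: &[&str] = &["C", "transparent", "packed"];

fn is_primitive(name: &str) -> bool {
    PRIMITIVES.contains(&name)
}

/// Returns the primitive used as the discriminant type, if any.
fn resolve(arg: &ReprArg, aliases: &HashMap<&str, &str>) -> Option<String> {
    match arg {
        // Bare names always mean the well-known primitive.
        ReprArg::Bare(name) => is_primitive(name).then(|| name.clone()),
        // `type = ...` consults the scope first, so aliases can shadow.
        ReprArg::Type(path) => {
            let target = aliases
                .get(path.as_str())
                .copied()
                .unwrap_or(path.as_str());
            is_primitive(target).then(|| target.to_string())
        }
    }
}

/// Returns the lint that would fire for this argument, if any.
fn lint(arg: &ReprArg, aliases: &HashMap<&str, &str>) -> Option<Lint> {
    match arg {
        ReprArg::Type(path) if is_primitive(path) => Some(Lint::RedundantTypePrefix {
            suggestion: format!("#[repr({path})]"),
        }),
        ReprArg::Bare(name)
            if !is_primitive(name)
                && !OTHER_REPRS.contains(&name.as_str())
                && aliases.contains_key(name.as_str()) =>
        {
            Some(Lint::MissingTypePrefix {
                suggestion: format!("#[repr(type = {name})]"),
            })
        }
        _ => None,
    }
}

fn main() {
    // Hypothetical scope: `type MyType = u16;` and `type u32 = u8;`.
    let aliases = HashMap::from([("MyType", "u16"), ("u32", "u8")]);
    let arg = ReprArg::Type("u32".to_string());
    println!("{:?} -> {:?}, lint: {:?}", arg, resolve(&arg, &aliases), lint(&arg, &aliases));
}
```

In this model, the redundant-prefix lint fires on `type = u32` even when `u32` is shadowed. Applying its suggestion then changes which type is used, and the shadowing discussion above is about exactly that case.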

# Drawbacks
[drawbacks]: #drawbacks

The requirement for `self::` on already-imported types is unfortunate, but it feels like the best way to ensure that adding new representations isn't a breaking change going forward. Even if we were to decide it weren't a "breaking change," it would still break things anyway, being de-facto breaking.
The requirement for `type =` is unfortunate, but it feels like the best way to ensure that adding new representations isn't a breaking change going forward. Even if we were to decide it weren't a "breaking change," it would still break things anyway, being de-facto breaking.

And, of course, this complicates the compiler. But that's about it.

@@ -47,13 +47,24 @@ And, of course, this complicates the compiler. But that's about it.

We could always not do this.

But more realistically, here's an alternative design that would avoid the `self::` change, whose complexity feels worse than the `self::` requirement:
But more realistically, here are some alternative designs that were rejected.

## `self::`

We could, instead of using `type =`, require that all types contain a double-colon to indicate they're a path, effectively preventing collisions with arguments that aren't paths. This would require using `self::` for types that are imported in the local scope, and was actually the first proposal of this RFC, but wasn't very well-received.
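
For context, the `self::` path that this alternative relies on is already valid in ordinary type position today. The sketch below uses a hypothetical module `demo` and shows an imported alias being named through `self::`. Only its use inside `repr` would have been new.

```rust
mod demo {
    // Bring the alias into this module's scope.
    use core::ffi::c_int;

    /// Size of the alias when named through a `self::` path.
    pub fn width_via_self_path() -> usize {
        core::mem::size_of::<self::c_int>()
    }

    /// Size of the alias when named through its full path.
    pub fn width_via_full_path() -> usize {
        core::mem::size_of::<core::ffi::c_int>()
    }
}

fn main() {
    // Both paths name the same type, so the sizes agree.
    assert_eq!(demo::width_via_self_path(), demo::width_via_full_path());
}
```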

## Shadowing attributes

Until a future edition, the current set of valid representations could be solidified as taking precedence over any shadowed identifiers. For example, if someone defines `type transparent = u32`, then `repr(transparent)` still means `repr(transparent)` and not `repr(u32)`.

In future editions, we could either:

* Let type aliases shadow all valid representations. This isn't ideal since there is no way to override the shadowing besides nesting your code in a new module and then re-exporting it outside that module, which is very messy.
* Expand the list of unshadowable representations every edition where necessary.

* Until a future edition, the current set of valid representations is solidified as taking precedence over any shadowed identifiers. For example, if someone defines `type transparent = u32`, then `repr(transparent)` still means `repr(transparent)` and not `repr(u32)`.
* At said future edition, type aliases now shadow all valid representations. So, for example, defining `type transparent = u32` would truly mean `repr(u32)`, not `repr(transparent)`. The only way to actually reference `repr(transparent)` would be to not have a type inside the current scope named `transparent`. There could be a deny-by-default warning when people do this.
* Alternatively, you could continue the dance at every future edition, making each new edition define the unshadowable representations allowed.
## Capital letters

Alternatively, you could require that the types start with capital letters- oh, right, `repr(C)` is a thing. It feels like there's not a good way to solve the shadowing problem besides adding in something that will never be a part of future representation names, that is, a double-colon token representing a path.
You could require that the types start with capital letters- oh, right, `repr(C)` is a thing.

# Prior art
[prior-art]: #prior-art