Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Underscores in vendor's names #212

Closed
kornelski opened this issue Nov 22, 2017 · 7 comments
Closed

Underscores in vendor's names #212

kornelski opened this issue Nov 22, 2017 · 7 comments

Comments

@kornelski
Copy link

kornelski commented Nov 22, 2017

🚲 ⚠️

It strikes me as odd that the vendor SIMD types and functions have retained leading underscores from C, e.g. __m256i instead of m256i.

I assume the underscores are a wart that exist only because C doesn't have namespaces and needs to be extra paranoid to about conflicts with other symbols. Rust does have proper namespacing. Therefore, could Rust use names without the underscore prefixes?

The names with underscores are quite ugly IMHO, especially that it's an inconsistent mix of 0/1/2 underscores existing depending on historical and implementation-specific trivia.

Names without leading underscores are IMHO cleaner and look like something belonging to stdlib rather than a bolted-on hack. Google and DDG are able to find these names without the underscore prefix.

[I see these names are imported from coresimd so they're no striclty stdsimd's problem, but I suggest renaming them in coresimd if needed too]

@gnzlbg
Copy link
Contributor

gnzlbg commented Nov 22, 2017

For x86 (each architecture is its own story), the vendors are the ones who provides the names for the intrinsics and the types, and we follow those:

  • for the intrinsics names: _mm256_add_epi64
  • the "raw" vector types, that is, when the vector type has no other meaning beyond the raw bits, e.g. _m256i.

When the vector types have some meaning beyond raw bits we urrently use types that reflect that in some cases, e.g., i64x4, but not in others (e.g. i64x4::splat(4) < i64x4::splat(5) does not produce a bool64x4).

Then there are other types like _mmask16 that we are not even considering yet I think.

I think that the only thing that is more or less settle is that we want to preserve the intrinsics names that the vendors provide to help with discoverability.

Since the intrinsic signature then shows users the types, we don't need to do that for the types.

I would be open to switch the raw vector types from _m256i and __m128i and so on to bits128, bits256, bits512 etc.

The boolean vector discussion is happening in #185 .

@BurntSushi
Copy link
Member

BurntSushi commented Nov 22, 2017

Therefore, could Rust use names without the underscore prefixes?

Yes. We could also get rid of the mm prefix. We could also fix weird abbreviations like ps and pd to be something like f32 and f64, respectively. We can change things like cvt to convert so that the behavior is clearer. hadd to horizontal_add, loadu to load_unaligned, slli to shift_left, and on and on.

This is a somewhat unfair response to your query, I admit. Primarily, we would like there to be a straight-forward translation from the vendor names in the C API to the vendor names that we expose. The easiest way to accomplish that goal is to use the names given by the vendor exactly. We could make this translation more complex by adding translation rules. All of the ones I mentioned would fall under that category, although admittedly some (such as your suggestion) are simpler and more obvious than others.

That means a line must be drawn somewhere. Some folks, such as yourself, might like to make the names just a little nicer. Some folks might like to keep things as simple as possible and lobby to not make any changes whatsoever. Some folks might like to be more ambitious and try to make the names more generally sensible. Where you fall on this line is likely influenced by a number of things, including how much a leading underscore bothers you. :-) Me? It doesn't bother me one iota.

Here is my position:

  1. The goal of this crate is to expose vendor defined interfaces. The motivation that inspired those interfaces (e.g., "C and its lack of namespaces") is immaterial to that goal. Therefore, I think we should do exactly that: expose the vendor API as is.
  2. Somewhat contradictorily to (1), I think it is nice to provide more specific integer vector types to the vendor intrinsic API. This is just another angle on "where do we draw the line" spectrum.
  3. I detest bikeshedding and likely won't participate in this thread beyond this comment. I think in cases like this someone just needs to make a decision and stick to it. If it were up to me, I'd stick with the naming scheme we have and move on.

[I see these names are imported from coresimd so they're no striclty stdsimd's problem, but I suggest renaming them in coresimd if needed too]

coresimd is in this repo too.

@gnzlbg
Copy link
Contributor

gnzlbg commented Nov 22, 2017

coresimd is in this repo too.

Off topic: the docs have fallen behind here, I should fix this before the next release. I've opened #213 to track this.

@AdamNiederer
Copy link
Contributor

I would much prefer keeping the names exactly as they're defined by the vendor, simply because it makes mapping their documentation to Rust code a million times easier. Browsing the intel intrinsics guide for an intrinsic to use is already mentally straining, and having to translate the name which shows up in the manual to the name exposed in stdsimd would be too much of a burden IMO.

Apologies for the shameless plug, but if you're looking for something "aesthetically pleasing," that's what I'm trying to do with faster

@kornelski
Copy link
Author

kornelski commented Nov 22, 2017

I'd like to emphasize that I'm not proposing any of the numerous arbitrary changes that @BurntSushi has set up as a strawman argument.

I'm proposing just removing the underscores, which does not introduce any ambiguity and does not complicate searching for documentation. I wouldn't call that mapping mentally straining.

On the contrary, I think it's easier to type and autocomplete names without needing to remember which have how many underscores. And it's __easier to _read __names without __unnecessary __prefixes.

I agree that following vendors (Intel's) naming scheme is important for cross-referencing, but I think names without underscores do that fine, and the underscores being just a C quirk are a spacebar heater.

@gnzlbg
Copy link
Contributor

gnzlbg commented Nov 22, 2017

I am against removing underscores for the vendor APIs because I need to look at the vendors manual and legacy code often and copy paste an intrinsic into rust code.

The only reason I am ok-ish to do so for types is that:

  • this still allows me to copy-paste an intrinsic name 1:1,
  • if I mess up the types rustc will just error telling me which types I must use (so discoverability here is not really an issue),
  • doing this does not add any new "incompatibility" to this library because we are already doing this for most types on x86 anyways.

The types I could find are:

/// 128-bit wide signed integer vector type
#[allow(non_camel_case_types)]
pub type __m128i = ::v128::i8x16;
/// 256-bit wide signed integer vector type
#[allow(non_camel_case_types)]
pub type __m256i = ::v256::i8x32;

and I must say that I find the way they are defined weird. We use them only when we mean "raw bits" but we define them as vectors of i8. If anything, they should be i1x128 and i1x256 which AFAIK LLVM supports; their repr(simd) will be a pain to write down though, in particular the AVX-512's one 😆 . If we had those types, we could just use them instead of having to define the __mXYZi aliases.

@BurntSushi
Copy link
Member

@gnzlbg I'm in favor of debating how those aliases are defined (or even whether they're defined at all or whether they shouldn't be aliases).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants