Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bounds check is not eliminated #356

Open
SeeSpring opened this issue Jul 10, 2023 · 5 comments
Open

Bounds check is not eliminated #356

SeeSpring opened this issue Jul 10, 2023 · 5 comments
Labels
C-bug Category: Bug

Comments

@SeeSpring
Copy link

I tried this code:
Godbolt

#![feature(portable_simd)]
use std::simd::*;
pub fn increment(idx: Simd<u32, 8>, arr: &mut [u32]) {
    assert!(
        (arr.len() >= u32::MAX as usize)
            || idx.simd_lt(Simd::splat(arr.len() as u32)).all()
    );
    idx.as_array().iter().for_each(|i| arr[*i as usize] += 1);
}

I expected to see this happen: Bounds checks are eliminated and there is the vpcmpltd instruction

Instead, this happened: Bounds checks are not eliminated and a vpmaxud and vpcmpeqd pair is used

Meta

RUSTC_BOOTSTRAP=1 rustc --version --verbose:

rustc 1.70.0 (90c541806 2023-05-31)
binary: rustc
commit-hash: 90c541806f23a127002de5b4038be731ba1458ca
commit-date: 2023-05-31
host: x86_64-unknown-linux-gnu
release: 1.70.0
LLVM version: 16.0.2
@SeeSpring SeeSpring added the C-bug Category: Bug label Jul 10, 2023
@SeeSpring
Copy link
Author

Also vmovmskps and test should probably just be vtestps

@calebzulawski
Copy link
Member

It appears that on latest Rust, the codegen is as expected, using vcmp + test: godbolt

@SeeSpring
Copy link
Author

There are still the bounds checks in the disassembly

.LBB0_2:
        mov     ecx, dword ptr [rax]
        mov     edi, ecx
        cmp     rdi, rdx
        jae     .LBB0_12
        inc     dword ptr [rsi + 4*rdi]
        mov     edi, dword ptr [rax + 4]
        cmp     rdi, rdx
        jae     .LBB0_12
        inc     dword ptr [rsi + 4*rdi]
        mov     edi, dword ptr [rax + 8]
        cmp     rdi, rdx
        jae     .LBB0_12
        inc     dword ptr [rsi + 4*rdi]
        mov     edi, dword ptr [rax + 12]
        cmp     rdi, rdx
        jae     .LBB0_12
        inc     dword ptr [rsi + 4*rdi]
        mov     edi, dword ptr [rax + 16]
        cmp     rdi, rdx
        jae     .LBB0_12
        inc     dword ptr [rsi + 4*rdi]
        mov     edi, dword ptr [rax + 20]
        cmp     rdi, rdx
        jae     .LBB0_12
        inc     dword ptr [rsi + 4*rdi]
        mov     edi, dword ptr [rax + 24]
        cmp     rdi, rdx
        jae     .LBB0_12
        inc     dword ptr [rsi + 4*rdi]
        mov     edi, dword ptr [rax + 28]
        cmp     rdi, rdx
        jae     .LBB0_12
        inc     dword ptr [rsi + 4*rdi]
        pop     rax
        vzeroupper
        ret

When using unsafe { idx.as_array().iter().for_each(|i| *arr.get_unchecked_mut(*i as usize) += 1); }

.LBB0_3:
        vmovd   eax, xmm0
        inc     dword ptr [rsi + 4*rax]
        vpextrd eax, xmm0, 1
        inc     dword ptr [rsi + 4*rax]
        vpextrd eax, xmm0, 2
        inc     dword ptr [rsi + 4*rax]
        vpextrd eax, xmm0, 3
        inc     dword ptr [rsi + 4*rax]
        vextracti128    xmm0, ymm0, 1
        vmovd   eax, xmm0
        inc     dword ptr [rsi + 4*rax]
        vpextrd eax, xmm0, 1
        inc     dword ptr [rsi + 4*rax]
        vpextrd eax, xmm0, 2
        inc     dword ptr [rsi + 4*rax]
        vpextrd eax, xmm0, 3
        inc     dword ptr [rsi + 4*rax]
        pop     rax
        vzeroupper
        ret

@calebzulawski
Copy link
Member

I don't think that's particularly SIMD-related, that's [u32]'s Index trait. I understand technically that assert proves removing the check is allowed, but that's a relatively complex relationship that I'm not sure LLVM can handle.

@calebzulawski
Copy link
Member

calebzulawski commented Feb 17, 2024

As a comparison, the trivial scalar implementation doesn't remove the bound checks either: godbolt

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: Bug
Projects
None yet
Development

No branches or pull requests

2 participants