No auto-vectorization in such case since 1.54.0 #92193

Open
solotzg opened this issue Dec 22, 2021 · 3 comments
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.


solotzg commented Dec 22, 2021

I tried this code:

const MAX_STEP_DATA_SIZE: usize = 32;

#[inline]
fn check(byte: u8) -> bool {
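    // Reported form: short-circuiting `||` (not vectorized since 1.54.0).
    return (byte <= 0x20) || (byte == 0x22) || (byte == 0x3D) || (byte == 0x5B) || (byte == 0x5D);
    // Alternative with non-short-circuiting `|`, which vectorizes as expected:
    // return (byte <= 0x20) | (byte == 0x22) | (byte == 0x3D) | (byte == 0x5B) | (byte == 0x5D);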
}

#[inline]
fn is_any(s: &[bool]) -> bool {
    let mut res = false;
    for i in 0..s.len() {
        res |= s[i];
    }
    res
}

unsafe fn test_impl(s: *const u8, len: usize) -> bool {
    let mut step_data = [false; MAX_STEP_DATA_SIZE / std::mem::size_of::<u8>()];
    let mut s = s;
    let mut len = len;
    while len >= step_data.len() {
        for i in 0..step_data.len() {
            step_data[i] = check(*s.add(i));
        }

        if is_any(&step_data) {
            return true;
        }

        s = s.add(step_data.len());
        len -= step_data.len();
    }
    for i in 0..len {
        if check(*s.add(i)) {
            return true;
        }
    }
    return false;
}

pub fn test(s: &[u8]) -> bool {
    unsafe { test_impl(s.as_ptr(), s.len()) }
}
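
For readers who want to follow the algorithm without raw pointers, the same chunked scan can be sketched in safe Rust. This is not part of the original report: `STEP` and `test_safe` are hypothetical names, and the sketch makes no claim about whether a given rustc/LLVM version vectorizes it. It only mirrors the logic of `test_impl`: scan full 32-byte steps, OR the per-byte results together, then handle the tail byte by byte.

```rust
// Hypothetical name; plays the role of MAX_STEP_DATA_SIZE in the report.
const STEP: usize = 32;

#[inline]
fn check(byte: u8) -> bool {
    // Same predicate as in the report (short-circuiting form).
    (byte <= 0x20) || (byte == 0x22) || (byte == 0x3D) || (byte == 0x5B) || (byte == 0x5D)
}

/// Safe-Rust sketch of the scan in `test_impl` (hypothetical helper).
pub fn test_safe(s: &[u8]) -> bool {
    let mut chunks = s.chunks_exact(STEP);
    // Full steps: accumulate with `|=` so the inner loop has no early exit.
    for chunk in chunks.by_ref() {
        let mut any = false;
        for &b in chunk {
            any |= check(b);
        }
        if any {
            return true;
        }
    }
    // Tail: fewer than STEP bytes remain, checked one at a time.
    chunks.remainder().iter().any(|&b| check(b))
}
```

Calling `test_safe(b"key=value")` returns `true` because of the `=` byte, while `test_safe(b"plain")` returns `false`.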

With rustc 1.53.0-nightly, building with
RUSTFLAGS="--emit=asm -C target-feature=+avx2" cargo build --release
produces vectorized assembly.

With 1.54.0 or later, rustc no longer vectorizes this code.

This can also be verified on godbolt.org.

Meta

rustc --version --verbose:

rustc 1.59.0-nightly (e100ec5bc 2021-12-21)
binary: rustc
commit-hash: e100ec5bc7cd768ec17d75448b29c9ab4a39272b
commit-date: 2021-12-21
host: x86_64-unknown-linux-gnu
release: 1.59.0-nightly
LLVM version: 13.0.0
@solotzg solotzg added the C-bug Category: This is a bug. label Dec 22, 2021
@nikic nikic added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-slow Issue: Problems and improvements with respect to performance of generated code. labels Dec 22, 2021
AngelicosPhosphoros (Contributor) commented

Duplicate of #83623 probably.

solotzg (Author) commented Dec 24, 2021

> Duplicate of #83623 probably.

Maybe.
It works as expected if I replace || with | in fn check.
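
The workaround described here can be sketched as follows. The names `check_short_circuit`, `check_bitwise`, and `forms_agree` are hypothetical and introduced only for illustration. On `bool`, `|` always evaluates both operands, so the body contains no short-circuit branches; the two forms return the same result for every byte. Whether the bitwise form is actually vectorized depends on the rustc and LLVM versions in use, so the generated assembly should still be inspected.

```rust
/// Short-circuiting form from the original report.
#[inline]
fn check_short_circuit(byte: u8) -> bool {
    (byte <= 0x20) || (byte == 0x22) || (byte == 0x3D) || (byte == 0x5B) || (byte == 0x5D)
}

/// Workaround form: `|` on `bool` evaluates every comparison unconditionally.
#[inline]
fn check_bitwise(byte: u8) -> bool {
    (byte <= 0x20) | (byte == 0x22) | (byte == 0x3D) | (byte == 0x5B) | (byte == 0x5D)
}

/// Returns true if both forms agree on all 256 possible byte values.
fn forms_agree() -> bool {
    (0..=u8::MAX).all(|b| check_short_circuit(b) == check_bitwise(b))
}
```

Because the comparisons have no side effects, swapping `||` for `|` changes only how the expression is evaluated, not its result.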

workingjubilee (Member) commented

Still deeply deoptimized today.

@Noratrieb Noratrieb added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Apr 5, 2023