Dynamic switch at runtime between SSE2 and AVX2 optim of IDWT 5x3 #961

Open
rouault opened this issue Jun 26, 2017 · 0 comments

rouault commented Jun 26, 2017

#957 has a function that can be compiled with either SSE2 or AVX2. It would be desirable to ship binaries that contain both versions of the function and to select one at runtime, depending on which instruction sets the CPU supports.

Caution: remember to call _mm256_zeroupper() after the last use of AVX/AVX2 instructions, to avoid transition penalties when returning to SSE code. See https://software.intel.com/en-us/articles/avoiding-avx-sse-transition-penalties
(This is not needed when the code is compiled entirely with AVX2 enabled, since SSE code then uses VEX-encoded instructions, which avoid the penalty.)
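
A minimal sketch of what such a runtime switch could look like, assuming GCC or Clang on x86. The `demo_add_*` kernel (`dst[i] += src[i]`) is a hypothetical stand-in, not the IDWT 5x3 code from #957, and every `demo_` name is made up for illustration. The per-function `__attribute__((target(...)))` and `__builtin_cpu_supports()` are compiler-specific; other compilers would need a different mechanism, such as separate translation units and their own CPU detection.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#define DEMO_X86_DISPATCH 1
#include <immintrin.h>
#endif

/* Hypothetical stand-in kernel: dst[i] += src[i]. Not the IDWT 5x3 code. */
typedef void (*demo_add_fn)(int32_t *dst, const int32_t *src, size_t n);

/* Portable fallback, also used on non-x86 targets. */
static void demo_add_scalar(int32_t *dst, const int32_t *src, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        dst[i] += src[i];
    }
}

#ifdef DEMO_X86_DISPATCH
/* SSE2 version: 4 x int32 per iteration, scalar loop for the tail. */
__attribute__((target("sse2")))
static void demo_add_sse2(int32_t *dst, const int32_t *src, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128i a = _mm_loadu_si128((const __m128i *)(dst + i));
        __m128i b = _mm_loadu_si128((const __m128i *)(src + i));
        _mm_storeu_si128((__m128i *)(dst + i), _mm_add_epi32(a, b));
    }
    for (; i < n; ++i) {
        dst[i] += src[i];
    }
}

/* AVX2 version: 8 x int32 per iteration, scalar loop for the tail. */
__attribute__((target("avx2")))
static void demo_add_avx2(int32_t *dst, const int32_t *src, size_t n)
{
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m256i a = _mm256_loadu_si256((const __m256i *)(dst + i));
        __m256i b = _mm256_loadu_si256((const __m256i *)(src + i));
        _mm256_storeu_si256((__m256i *)(dst + i), _mm256_add_epi32(a, b));
    }
    for (; i < n; ++i) {
        dst[i] += src[i];
    }
    /* Clear the upper halves of the YMM registers before returning to
     * callers that may have been compiled for legacy (non-VEX) SSE. */
    _mm256_zeroupper();
}
#endif

/* Pick the widest implementation that the running CPU supports. */
static demo_add_fn demo_select_add(void)
{
#ifdef DEMO_X86_DISPATCH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) {
        return demo_add_avx2;
    }
    if (__builtin_cpu_supports("sse2")) {
        return demo_add_sse2;
    }
#endif
    return demo_add_scalar;
}

/* Entry point: resolves the implementation once, then reuses it. */
static void demo_add(int32_t *dst, const int32_t *src, size_t n)
{
    static demo_add_fn impl = NULL;
    assert(dst != NULL && src != NULL);
    if (impl == NULL) {
        impl = demo_select_add();
    }
    impl(dst, src, n);
}
```

Callers would only use `demo_add()`, so the choice of instruction set stays hidden behind a single function pointer. The lazy initialization shown here is not thread-safe; a real implementation would probably resolve the pointer once at library initialization.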
