EVP 1d threading bit-for-bit #681

apcraig · 2022-01-19T18:58:06Z

Testing OpenMP and the addition of omp_suite highlighted an issue in the OpenMP implementation of the evp1d code. The OpenMP in ice_dyn_evp_1d_kernel is not bit-for-bit across different thread counts. The following test groups should be bit-for-bit but they are not:

smoke gx3 4x4x5x29x40 alt04,reprosum,run10day
smoke gx3 8x2x5x29x40 alt04,reprosum,run10day
smoke gx3 24x1x5x29x40 alt04,reprosum,run10day,thread

or more clearly,

smoke gx3 4x4x5x29x40 evp1d,reprosum,run10day
smoke gx3 8x2x5x29x40 evp1d,reprosum,run10day
smoke gx3 24x1x5x29x40 evp1d,reprosum,run10day,thread

#680 comments out the OMP directives in ice_dyn_evp_1d_kernel. This changes answers for alt04/evp1d but makes the implementation bit-for-bit validated.

Finally, I can confirm that different block sizes with MPI only are bit-for-bit with evp1d. The following are all bit-for-bit:

smoke gx3 16x1x5x29x40 alt04,reprosum,run10day,droundrobin
smoke gx3 24x1x5x4x400 alt04,reprosum,run10day,droundrobin
smoke gx3 24x1x5x15x80 alt04,reprosum,run10day,droundrobin

So it's not an issue with the blocks and decompositions; it's really just the OpenMP in ice_dyn_evp_1d_kernel.

TillRasmussen · 2022-01-23T18:37:16Z

Blocks are not used within the 1d solver.
I have been able to recreate the bug. It appears already at the first iteration.

TillRasmussen · 2022-02-02T22:57:00Z

With the Intel compiler, the OMP differences disappear when -no-vec (no vectorization) is added. This has to do with how the arrays fit into memory. It may also indicate that the other OMP loops do not vectorize. @srethmeier please elaborate a bit more.
The "-no-vec" flag could be removed if the arrays are written so that they "fit" memory. For double arrays this would require padding the length to a multiple of 4.
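
As a rough illustration of the padding idea above, rounding an array length up to the next multiple of 4 could look like the following sketch. The helper name `padded_length` and the default of 4 lanes (four 8-byte doubles filling one 256-bit vector register) are assumptions made for illustration; nothing here is taken from the CICE code.

```python
def padded_length(n, lanes=4):
    """Round n up to the next multiple of `lanes`.

    Hypothetical helper: `lanes=4` assumes four 8-byte doubles
    per 256-bit vector register.
    """
    if n < 0 or lanes <= 0:
        raise ValueError("n must be >= 0 and lanes must be > 0")
    return ((n + lanes - 1) // lanes) * lanes


if __name__ == "__main__":
    # Example lengths only; these are not sizes from the model.
    for n in (1160, 1161, 1163):
        print(n, "->", padded_length(n))
```

The extra elements would need to hold harmless values (for example zeros or masked points) so they do not affect the result.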

The results of the tests with -no-vec turned on are:
d9ea2f412e977d8fa0c1c0b3c871ff7f freya_intel_smoke_gx3_24x1x5x29x40_alt04_reprosum_run10day_thread.novecreal/restart/iced.2005-01-11-00000.nc
d9ea2f412e977d8fa0c1c0b3c871ff7f freya_intel_smoke_gx3_8x2x5x29x40_alt04_reprosum_run10day_thread.novecreal/restart/iced.2005-01-11-00000.nc
d9ea2f412e977d8fa0c1c0b3c871ff7f freya_intel_smoke_gx3_4x4x5x29x40_alt04_reprosum_run10day_thread.novecreal/restart/iced.2005-01-11-00000.nc
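
The comparison above amounts to checking that the restart file from each run has the same checksum. A minimal sketch of such a check, assuming MD5 as in the output above; the function names `md5_of_file` and `all_bit_for_bit` are hypothetical and not part of the CICE scripts.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def all_bit_for_bit(paths):
    """True if every file in `paths` has the same MD5 digest."""
    return len({md5_of_file(p) for p in paths}) <= 1
```

Calling `all_bit_for_bit` with the three `iced.2005-01-11-00000.nc` paths listed above would return True when the runs agree bit-for-bit.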

TillRasmussen · 2023-01-11T19:12:12Z

@apcraig, @eclare108213, I don't recall whether this was enough to close this issue?

apcraig · 2023-11-16T21:24:35Z

I think this is fixed in #895. I tested the decomp suite with -s evp1d on cheyenne and it seemed to be OK.

apcraig added Software Engineering Dynamics labels Jan 19, 2022

apcraig assigned apcraig and TillRasmussen Jan 19, 2022

TillRasmussen assigned apcraig and TillRasmussen and unassigned apcraig and TillRasmussen Jan 20, 2022

TillRasmussen mentioned this issue Feb 3, 2022

Update OMP #680

Merged

16 tasks

apcraig closed this as completed Nov 16, 2023
