Cell Reading with Detray Geometry, main branch (2023.12.19.) #509

Merged
merged 6 commits into from
Mar 6, 2024

krasznaa
Member

Made the CPU algorithm sequence use Detray geometries. This is needed for processing data produced for the ODD geometry.

To be able to read those files correctly, taught the CSV cell reading code how to switch from Acts to Detray identifiers in the cell module data on the fly. This makes the simulation files usable with the identifiers generated by Detray.

This is all very much a hack at the moment, in order to try to make the full (host) chain work on the ODD geometry.

@krasznaa krasznaa added feature New feature or request cpu Changes related to CPU code labels Dec 19, 2023
Contributor

@niermann999 niermann999 left a comment

In general looks OK

examples/run/cpu/CMakeLists.txt (outdated, resolved)
examples/run/cpu/seq_example.cpp (outdated, resolved)
@krasznaa krasznaa force-pushed the ODDCellReading-main-20231219 branch 2 times, most recently from f4a7d23 to 48a57f3 Compare December 19, 2023 13:15
@beomki-yeo
Contributor

Looks good to me as well

@krasznaa krasznaa mentioned this pull request Dec 21, 2023
@krasznaa
Member Author

I cherry-picked the 2 commits from #510 to make the ODD geometry work. We'll need to clean things up a bit when we merge these two PRs in. 🤔

@krasznaa krasznaa force-pushed the ODDCellReading-main-20231219 branch from f01de04 to b4903ef Compare March 4, 2024 16:02
@krasznaa
Member Author

krasznaa commented Mar 4, 2024

I de-scoped this PR a bit. It no longer tries to execute track finding and fitting. It just teaches the seq_example executables to run all the algorithms up to track parameter estimation on the ODD geometry/data, while still maintaining the ability to run on the good ol' TML files as well.

This seemingly works correctly on the host. But the SYCL and CUDA codes crash during clusterization on the ODD data. 😦 I'll need some help from @stephenswat for debugging it, as my first guess is that the parallelization logic in the clusterization code does not handle certain ODD events correctly. I see this sort of thing:

[bash][atspot01]:out > ./build/sycl/bin/traccc_seq_example_cuda --input-directory=odd/muon100GeV-geant4/ --detector-file=geometries/odd/odd_geometry_detray.json --digitization-config-file=geometries/odd/odd-digi-geometric-config.json --use-detray-detector --events=1 --skip=0
Running ./build/sycl/bin/traccc_seq_example_cuda geometries/odd/odd_geometry_detray.json odd/muon100GeV-geant4/ 1
WARNING: mask store has empty collection no. 2
WARNING: mask store has empty collection no. 3
WARNING: mask store has empty collection no. 6
WARNING: mask store has empty collection no. 7
WARNING: No material in detector
WARNING: acceleration data structures store has empty collection no. 1
WARNING: acceleration data structures store has empty collection no. 2
WARNING: acceleration data structures store has empty collection no. 3
WARNING: acceleration data structures store has empty collection no. 4
WARNING: No entries in volume finder
Detector check: OK
CUDAassert: an illegal memory access was encountered /home/krasznaa/ATLAS/projects/traccc/traccc/device/cuda/src/utils/stream.cpp 57
[bash][atspot01]:out > ./build/sycl/bin/traccc_seq_example_cuda --input-directory=odd/muon100GeV-geant4/ --detector-file=geometries/odd/odd_geometry_detray.json --digitization-config-file=geometries/odd/odd-digi-geometric-config.json --use-detray-detector --events=1 --skip=1
Running ./build/sycl/bin/traccc_seq_example_cuda geometries/odd/odd_geometry_detray.json odd/muon100GeV-geant4/ 1
WARNING: mask store has empty collection no. 2
WARNING: mask store has empty collection no. 3
WARNING: mask store has empty collection no. 6
WARNING: mask store has empty collection no. 7
WARNING: No material in detector
WARNING: acceleration data structures store has empty collection no. 1
WARNING: acceleration data structures store has empty collection no. 2
WARNING: acceleration data structures store has empty collection no. 3
WARNING: acceleration data structures store has empty collection no. 4
WARNING: No entries in volume finder
Detector check: OK
==> Statistics ... 
- read    81064 cells from 11306 modules
- created (cpu)  0 measurements     
- created (cpu)  0 spacepoints     
- created (cuda) 31923 spacepoints     
- created  (cpu) 0 seeds
- created (cuda) 5829 seeds
==>Elapsed times...
           File reading  (cpu)  209 ms
         Clusterization (cuda)  3 ms
                Seeding (cuda)  7 ms
           Track params (cuda)  1 ms
                     Wall time  222 ms
[bash][atspot01]:out >

With the invalid memory access happening here:

CUDA Exception: Warp Out-of-range Address
The exception was triggered at PC 0x7fffd8e3db30 (ccl_kernel.ipp:67 in traccc::device::fast_sv_1<traccc::cuda::barrier> inlined from ccl_kernel.ipp:243)

Thread 1 "traccc_seq_exam" received signal CUDA_EXCEPTION_5, Warp Out-of-range Address.
[Switching focus to CUDA kernel 0, grid 1, block (50,0,0), thread (49,0,0), device 0, sm 11, warp 6, lane 17]
traccc::cuda::kernels::ccl_kernel<<<(81,1,1),(128,1,1)>>> () at /home/krasznaa/ATLAS/projects/traccc/traccc/device/common/include/traccc/clusterization/device/impl/ccl_kernel.ipp:70 in traccc::device::fast_sv_1<traccc::cuda::barrier> inlined from ccl_kernel.ipp:243
70                          f[f[cid]] = q;
(cuda-gdb) bt
#0  traccc::cuda::kernels::ccl_kernel<<<(81,1,1),(128,1,1)>>> () at /home/krasznaa/ATLAS/projects/traccc/traccc/device/common/include/traccc/clusterization/device/impl/ccl_kernel.ipp:70 in traccc::device::fast_sv_1<traccc::cuda::barrier> inlined from ccl_kernel.ipp:243
(cuda-gdb)
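
As a hedged illustration of what an out-of-range write at `f[f[cid]] = q;` would imply: the store lands outside the array whenever `cid` or the value held in `f[cid]` is not a valid index into `f`. The host-side C++ sketch below shows one way to scan for such an entry and to guard the update while debugging. The element type, the use of `std::vector`, and both function names are assumptions made for this example. This is not traccc code, and it does not establish that a bad index is the actual cause of the crash.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical debugging helper, not part of traccc. Returns the position
// of the first entry of `f` that is not a valid index into `f`, or
// f.size() if every entry is in range.
inline std::size_t first_out_of_range_parent(
    const std::vector<unsigned int>& f) {
    for (std::size_t i = 0; i < f.size(); ++i) {
        if (f[i] >= f.size()) {
            return i;
        }
    }
    return f.size();
}

// Guarded form of the update `f[f[cid]] = q`. Returns false, and writes
// nothing, if either `cid` or `f[cid]` is out of range.
inline bool guarded_hook(std::vector<unsigned int>& f, std::size_t cid,
                         unsigned int q) {
    if (cid >= f.size() || f[cid] >= f.size()) {
        return false;
    }
    f[f[cid]] = q;
    return true;
}
```

Running a check like this on a host copy of the parent array for the failing event (`--skip=0`) could help tell apart a corrupted index from a partition that is larger than the kernel expects.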

@krasznaa krasznaa marked this pull request as ready for review March 5, 2024 09:03
@krasznaa
Member Author

krasznaa commented Mar 5, 2024

After pondering a bit, I now think that we should go ahead with this update. The GPU clusterization code not being able to handle the ODD data is not something for this PR to fix. 🤔

Since reading TML data will continue working after this PR, and ODD reconstruction seems to work on the host, it should make it easier to debug the GPU clusterization code with this included in main. 🤔

This is needed for processing data produced for the ODD geometry.

To be able to read those files correctly, taught the CSV cell
reading code how to switch from Acts to Detray identifiers in the
cell module data on the fly. This makes the simulation files usable
with the identifiers generated by Detray.
The test files consistently contain "Acts geometry identifiers".
While the identifier needs to be remapped to the "Detray identifier"
when looking for the surface transforms, we still need the
"Acts identifier" to look up the digitization parameters of
the modules.

This is a bit hacky way of achieving this, but for now it
should be good enough.
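
For illustration, the dual-identifier bookkeeping described in the commit message above could look roughly like the following C++ sketch. All names here (`acts_id`, `detray_id`, `module_ids`, `remap_module`) are hypothetical and are not taken from the traccc sources; the real code may store and build the mapping quite differently.

```cpp
#include <cstdint>
#include <optional>
#include <unordered_map>

// Hypothetical stand-ins for the two identifier types; not the traccc types.
using acts_id = std::uint64_t;
using detray_id = std::uint64_t;

// Per-module record that keeps both identifiers side by side.
struct module_ids {
    acts_id acts;      // still needed to look up digitization parameters
    detray_id detray;  // needed to look up the surface transform
};

// Translate an Acts identifier read from a CSV file. Returns an empty
// optional if the identifier is not present in the map.
inline std::optional<module_ids> remap_module(
    acts_id id_from_file,
    const std::unordered_map<acts_id, detray_id>& acts_to_detray) {

    const auto it = acts_to_detray.find(id_from_file);
    if (it == acts_to_detray.end()) {
        return std::nullopt;
    }
    return module_ids{id_from_file, it->second};
}
```

Returning an empty optional for unknown identifiers is a choice made for this sketch, so that a mismatch between the simulation files and the geometry file shows up at read time.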
Unfortunately no tracks are being found at the moment. :-(
To be debugged.
…amples.

Unfortunately that code does not yet work correctly, so let's not
add it yet.

Made it possible in all seq_example executables to read either the
"old style" geometry description of the TML layout, or to use
detray::detector to read in a geometry (and accompanying data file)
using Detray.

Unfortunately the SYCL and CUDA examples do not work with the ODD
test data. Something with clusterization is going wrong.
@krasznaa krasznaa force-pushed the ODDCellReading-main-20231219 branch 2 times, most recently from 90a8094 to 16997e0 Compare March 6, 2024 13:19
Contributor

@niermann999 niermann999 left a comment

Looks good to me

io/src/read_geometry.cpp (resolved)
@krasznaa krasznaa force-pushed the ODDCellReading-main-20231219 branch from 16997e0 to a7c9c31 Compare March 6, 2024 14:23
Changed the API of traccc::io::read_geometry a bit for this,
to make it easier to use Detray based geometries in the algorithm
sequence applications.

Updated all users of traccc::io::read_geometry to use the new API.
@krasznaa krasznaa force-pushed the ODDCellReading-main-20231219 branch from a7c9c31 to a3e1145 Compare March 6, 2024 14:25
@krasznaa krasznaa merged commit 9bd0bab into acts-project:main Mar 6, 2024
16 of 18 checks passed