
Cell Reading Rewrite, main branch (2024.04.19.) #547

Merged

Conversation

krasznaa
Member

This ended up being an alternative for #546. 🤔

Here I made sure that proper comparison operators exist for traccc::cell and traccc::cell_module, and then simply relied on STL containers for ordering the cells and modules correctly in memory, instead of the nigh-unreadable current logic of the code.

We are looking at this with @stephenswat right now because of #527. Just like #546, this update also ensures that no duplicate cells are read in from the CSV files, which triggers the same unit test failures.

Even more weirdly, this still doesn't answer the mystery that Stephen and I have been struggling with for the last hour. You see, Stephen prepared some ODD files that exclude the duplicated cells. But something really strange is going on with those files: the host seeding code does not find any seeds when reading them, while it finds plenty with the files containing the duplicate cells.

Stranger still, with this PR's code I see the following using the CUDA test:

[bash][atspot01]:traccc > ./traccc/out/build/sycl/bin/traccc_seq_example_cuda --detector-file=geometries/odd/odd_geometry_detray.json --use-detray-detector --digitization-file=geometries/odd/odd-digi-geometric-config.json --input-directory=odd/muon100GeV-geant4/ --input-events=1 --compare-with-cpu

Running Full Tracking Chain Using CUDA

>>> Detector Options <<<
  Detector file       : geometries/odd/odd_geometry_detray.json
  Material file       : 
  Surface grid file   : 
  Use detray::detector: yes
  Digitization file   : geometries/odd/odd-digi-geometric-config.json
>>> Input Data Options <<<
  Input data format             : csv
  Input directory               : odd/muon100GeV-geant4/
  Number of input events        : 1
  Number of input events to skip: 0
>>> Clusterization Options <<<
  Target cells per partition: 1024
>>> Track Seeding Options <<<
  None
>>> Performance Measurement Options <<<
  Run performance checks: no
>>> Accelerator Options <<<
  Compare with CPU results: yes

WARNING: No material in detector
WARNING: No entries in volume finder
Detector check: OK
WARNING: @traccc::io::csv::read_cells: 4500 duplicate cells found in /home/krasznaa/ATLAS/projects/traccc/traccc/data/odd/muon100GeV-geant4/event000000000-cells.csv
===>>> Event 0 <<<===
Number of spacepoints: 31587 (host), 31587 (device)
  Matching rate(s):
    - 98.167% at 0.01% uncertainty
    - 99.4998% at 0.1% uncertainty
    - 99.4998% at 1% uncertainty
    - 99.4998% at 5% uncertainty
Number of seeds: 5977 (host), 5977 (device)
  Matching rate(s):
    - 54.8268% at 0.01% uncertainty
    - 93.4583% at 0.1% uncertainty
    - 98.6783% at 1% uncertainty
    - 99.1467% at 5% uncertainty
Number of track parameters: 5977 (host), 5977 (device)
  Matching rate(s):
    - 71.1059% at 0.01% uncertainty
    - 96.4029% at 0.1% uncertainty
    - 99.3642% at 1% uncertainty
    - 99.6319% at 5% uncertainty
==> Statistics ... 
- read    77592 cells from 11131 modules
- created (cpu)  31587 measurements     
- created (cpu)  31587 spacepoints     
- created (cuda) 31587 spacepoints     
- created  (cpu) 5977 seeds
- created (cuda) 5977 seeds
==>Elapsed times...
           File reading  (cpu)  140 ms
         Clusterization (cuda)  4 ms
   Spacepoint formation (cuda)  2 ms
         Clusterization  (cpu)  4 ms
   Spacepoint formation  (cpu)  0 ms
                Seeding (cuda)  13 ms
                Seeding  (cpu)  81 ms
           Track params (cuda)  3 ms
           Track params  (cpu)  1 ms
                     Wall time  253 ms
[bash][atspot01]:traccc > ./traccc/out/build/sycl/bin/traccc_seq_example_cuda --detector-file=geometries/odd/odd_geometry_detray.json --use-detray-detector --digitization-file=geometries/odd/odd-digi-geometric-config.json --input-directory=v6/odd/muon100GeV-geant4/ --input-events=1 --compare-with-cpu

Running Full Tracking Chain Using CUDA

>>> Detector Options <<<
  Detector file       : geometries/odd/odd_geometry_detray.json
  Material file       : 
  Surface grid file   : 
  Use detray::detector: yes
  Digitization file   : geometries/odd/odd-digi-geometric-config.json
>>> Input Data Options <<<
  Input data format             : csv
  Input directory               : v6/odd/muon100GeV-geant4/
  Number of input events        : 1
  Number of input events to skip: 0
>>> Clusterization Options <<<
  Target cells per partition: 1024
>>> Track Seeding Options <<<
  None
>>> Performance Measurement Options <<<
  Run performance checks: no
>>> Accelerator Options <<<
  Compare with CPU results: yes

WARNING: No material in detector
WARNING: No entries in volume finder
Detector check: OK
===>>> Event 0 <<<===
Number of spacepoints: 31587 (host), 31587 (device)
  Matching rate(s):
    - 100% at 0.01% uncertainty
    - 100% at 0.1% uncertainty
    - 100% at 1% uncertainty
    - 100% at 5% uncertainty
Number of seeds: 0 (host), 4247 (device)
  Matching rate(s):
    - 0% at 0.01% uncertainty
    - 0% at 0.1% uncertainty
    - 0% at 1% uncertainty
    - 0% at 5% uncertainty
Number of track parameters: 0 (host), 4247 (device)
  Matching rate(s):
    - 0% at 0.01% uncertainty
    - 0% at 0.1% uncertainty
    - 0% at 1% uncertainty
    - 0% at 5% uncertainty
==> Statistics ... 
- read    77592 cells from 11131 modules
- created (cpu)  31587 measurements     
- created (cpu)  31587 spacepoints     
- created (cuda) 31587 spacepoints     
- created  (cpu) 0 seeds
- created (cuda) 4247 seeds
==>Elapsed times...
           File reading  (cpu)  104 ms
         Clusterization (cuda)  8 ms
   Spacepoint formation (cuda)  6 ms
         Clusterization  (cpu)  4 ms
   Spacepoint formation  (cpu)  0 ms
                Seeding (cuda)  26 ms
                Seeding  (cpu)  0 ms
           Track params (cuda)  5 ms
           Track params  (cpu)  0 ms
                     Wall time  157 ms
[bash][atspot01]:traccc > 
  • The original file is reported to have 4500 duplicate cells. (A weirdly round number...)
  • Stephen's updated file reports no duplicates, and the same number of cells that remain from the original after duplicate removal.
  • As I wrote earlier, the host code does not like the updated CSV files, while the CUDA code still finds some seeds with them, although noticeably fewer than with the original files.
  • When using the updated CSV files, we get a 100% agreement between the host and CUDA clusterization. But not with the original files. (Since the comparison is made on the spacepoints, and not the clusters, floating point differences are accepted here.) I have no idea what to make of this. 😕

So, something weird went wrong with the file updates, and I just cannot figure out what. 😕 But at the same time, I think we should go ahead with this deduplication in the code, and make sure that the tests work with the deduplication included. 🤔

@krasznaa krasznaa added cleanup Makes the code all clean and tidy edm Changes to the data model labels Apr 19, 2024
@stephenswat
Member

Very strange results... Will need to think a little after the meeting.

I think that if we do deduplication (and I think that summing up the activation energies is a good idea), one important thing will be to keep the truth data about which measurement a cell belongs to, and that can be more than one measurement. We'll need some way to return this from the read_cells function.

@krasznaa krasznaa marked this pull request as ready for review April 19, 2024 16:09
@krasznaa
Member Author

Well, this has been eye-opening...

As it turns out, our code relies on the exact behaviour of traccc::operator<(traccc::cell, traccc::cell) somewhere. 🤔 After I went back to reading duplicate cells in some of the unit tests, I found that some of them kept failing because I had changed the ordering imposed by that operator. (I had thought that I could use that operator to implement the same ordering that the I/O code currently uses.)

So I reverted the global operator to do what it has been doing so far, and made the helper std::set use a custom comparison functor instead. I'm not all too happy with this, since it makes clear just how fragile our code / tests are, but for now let's be pragmatic... 🤔

@krasznaa krasznaa force-pushed the CellReadingSimplification-main-20240419 branch 2 times, most recently from 165aa58 to 7a673b1 Compare April 20, 2024 06:53
Made sure that proper comparison operators would exist for
traccc::cell and traccc::cell_module, and then just relied on
STL containers for ordering the cells and modules correctly
in memory.
The truth mapping code (currently) relies on possibly having
duplicate cells in the event record, for different particles
depositing energy in the same cell. The truth mapping code,
and some of the I/O unit tests, are now set up to disable the
cell de-duplication during CSV reading.

Reverted to the previous ordering scheme for traccc::cell-s,
and implemented a (different) custom ordering for the cell
collection, since as it turns out, the project implicitly
relies on these two behaving a little differently.
This way it becomes possible to sum up the activations of the
cells, using the value in the map.
@stephenswat stephenswat force-pushed the CellReadingSimplification-main-20240419 branch from 7a673b1 to 1be9b1a Compare April 22, 2024 11:48
@krasznaa krasznaa enabled auto-merge (squash) April 22, 2024 12:01
@krasznaa krasznaa merged commit 2de0911 into acts-project:main Apr 22, 2024
17 of 18 checks passed
@krasznaa krasznaa deleted the CellReadingSimplification-main-20240419 branch April 22, 2024 12:30