Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor Observable_stat framework #3712

Merged
merged 30 commits into from
May 14, 2020
Merged

Conversation

jngrad
Copy link
Member

@jngrad jngrad commented May 13, 2020

Description of changes:

  • convert Observable_stat structs to self-contained classes
    • remove manual memory management when resizing Observable_stat objects
    • replace hazardous raw pointers by Utils::Span
    • hide observable implementation details
      • accumulation of the flat array and MPI reduction are done by the class
      • finding the contribution of specific bonded/non-bonded IAs is done by the class
  • dependency inversions
    • replace manually-curated Observable_stat list in invalidate_obs() by a self-updating callback list
    • decouple observables from the pressure/energy/coulomb/dipolar frameworks
    • remove most statistics.hpp includes (it's now only visible to 9 source files)
  • analysis energy() now returns the lower triangle of the non-bonded IAs, to be consistent with pressure() and stress_tensor()
  • improve Doxygen documentation of the pressure/energy/coulomb/dipolar frameworks

jngrad added 24 commits May 13, 2020 14:14
Use size_t consistently and return coulomb/dipolar sizes by value
instead of using output parameters.
Move the reallocation logic into the observable struct and narrow
function signature.
Moving update_pressure() to pressure.cpp significantly narrows down
the number of exported pressure functions.
Rewrite mpi_gather_stats() with an enum class for the job type and
document the output parameters.
Reduce code duplication and hide some implementation details.
Having two flags helps in reducing code duplication.
Exposing raw pointers to the underlying C-array of the std::vector
is prone to buffer overflows, which will happen if the contributions
from coulomb and dipole are written after the contributions from
virtual sites and external fields when no coulomb and dipole methods
are active. With Spans, a view is returned whose operator[] guards
against out-of-bounds access via assertions.
Make the observables fully self-contained. Remove the manual
reallocation management logic. Adapt the Coulomb/Dipole code
accordingly. Keep track of created observables instead of
manually curating a list of global variables in statistics.cpp.
Remove dependency on statistics.hpp from most of the core.
Reduce the visibility of Observable_stat globals.
Remove the runtime error message side-effect. The inlined versions
are trivial; there is no need for callbacks in the Observable_stat
class.
Resize and zero-out local (aka MPI rank-specific) obsstats. Resize
but don't zero-out total (aka on master node) obsstats, since the
reduction will overwrite old values. Move the `is_initialized` and
`v_comp` flags to the wrapper classes and only invalidate on system
change. There is no need to reallocate on system change, since
invalidation already implies reallocation, thus the reallocation
callbacks can be removed. Reduce code duplication in pressure.cpp.
@jngrad
Copy link
Member Author

jngrad commented May 13, 2020

@fweik @RudolfWeeber the old implementation with raw pointers gave me quite some trouble. The coulomb/dipolar contributions were overflowing a zero into the virtual site and external field contributions when no coulomb/dipolar method was active, because some places in the coulomb/dipolar calculation code forgot to guard against n_coulomb==0 or n_dipolar==0 and were overflowing a single zero into the virtual site or external field. I also encountered overflows when switching between P3M and ELC (this increases the number of coulomb contributions without increasing the size of the array), although it surfaced halfway through the refactor and I can't tell by reading the original pointer implementation if this was an existing bug or a regression from my side. In any case, we didn't run into buffer overflow crashes because the virtual site and external field offered just enough padding to prevent out-of-bounds access. As long as the contributions from coulomb, dipolar, virtual site and external field were written to the array in that order, the overflowing values were eventually replaced by the correct values. The code is now properly guarded and ASAN doesn't complain.

This PR achieved most of the work needed to convert Observable_stat structs into genuine Observable classes. It's essentially a matter of either moving the logic from analysis.pyx to observables.py (and converting it to a Cython file), or registering the Observable_stat classes in the ScriptInterface framework. I haven't done that, because we should discuss it first. There are a few open questions from my side:

  • all dipolar and most coulomb methods don't contribute to the pressure
  • intra- and inter-molecular non-bonded energies are commented out in the python interface and inexistent in the core (edit: removed in 5693cc8)
  • the StressTensor observable seems to short-circuit the ideal gas component of NpT (edit: most likely a mistake, it's been present since the start in aa60df4, removed in 17349d1)

If moving to genuine Observable classes requires too much work, we could defer it to a future PR. This PR already provides significant improvements in terms of code clarity and pointer safety.

@codecov
Copy link

codecov bot commented May 13, 2020

Codecov Report

Merging #3712 into python will decrease coverage by 0%.
The diff coverage is 93%.

Impacted file tree graph

@@          Coverage Diff           @@
##           python   #3712   +/-   ##
======================================
- Coverage      88%     88%   -1%     
======================================
  Files         543     545    +2     
  Lines       24776   24718   -58     
======================================
- Hits        21835   21776   -59     
- Misses       2941    2942    +1     
Impacted Files Coverage Δ
src/core/AtomDecomposition.hpp 100% <ø> (ø)
src/core/actor/ActorList.hpp 100% <ø> (ø)
src/core/communication.hpp 100% <ø> (ø)
src/core/constraints/Constraint.hpp 100% <ø> (ø)
src/core/constraints/ShapeBasedConstraint.hpp 100% <ø> (ø)
src/core/cuda_interface.cpp 100% <ø> (ø)
...e/electrostatics_magnetostatics/coulomb_inline.hpp 93% <ø> (-2%) ⬇️
src/core/electrostatics_magnetostatics/dipole.cpp 64% <ø> (-4%) ⬇️
...re/electrostatics_magnetostatics/dipole_inline.hpp 95% <ø> (ø)
src/core/event.cpp 94% <ø> (ø)
... and 30 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7bd69f1...3d66b10. Read the comment docs.

Also remove unused header files and move CUDA helper function back
into the corresponding .cu file.
@fweik fweik self-assigned this May 14, 2020
@fweik
Copy link
Contributor

fweik commented May 14, 2020

I'll have a look this afternoon. I'm not convinced yet that we'll keep all the functionality, we'll see...

@fweik
Copy link
Contributor

fweik commented May 14, 2020

@jngrad Do I understand it correctly that the globals are only needed to avoid recalculation of the observables?

@jngrad
Copy link
Member Author

jngrad commented May 14, 2020

offline discussion: By the looks of it, these globals are write-only from the point of view of the core. They are read only from the ScriptInterface and the StressTensor observable. There is no need to cache these values. We could just split up the different contributions into separate observables, and remove all the MPI callbacks triggered by system changes. This can be done in a separate PR.

The value stored in the local (aka MPI rank-specific) obsstat could
not be read, since it was not followed by a reduction on the master
node.
The intra- and inter-molecular parts of the non-bonded contributions
to the energy are not implemented in the core.
src/core/Observable_stat.hpp Outdated Show resolved Hide resolved
Copy link
Contributor

@fweik fweik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, clearly an improvement!

@jngrad jngrad added the automerge Merge with kodiak label May 14, 2020
@kodiakhq kodiakhq bot merged commit 5d8e7a5 into espressomd:python May 14, 2020
@jngrad jngrad deleted the obsstat-refactor branch January 18, 2022 12:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants