Doc: Compute Section (#230)
Demonstrate how to use zero-copy data access to compute
(read & write) on fields and particles. This uses the typical
level - block - compute loop pattern in AMReX.
ax3l committed Dec 6, 2023
1 parent 33b272c commit ad4ddbe
Showing 11 changed files with 202 additions and 12 deletions.
1 change: 1 addition & 0 deletions docs/requirements.txt
Original file line number Diff line number Diff line change
@@ -12,6 +12,7 @@ recommonmark
# Sphinx<7.2 because we are waiting for
# https://github.com/breathe-doc/breathe/issues/943
sphinx>=5.3,<7.2
sphinx-copybutton
sphinx-design
sphinx_rtd_theme>=1.1.1
# reference system
1 change: 1 addition & 0 deletions docs/source/conf.py
@@ -44,6 +44,7 @@
"sphinx.ext.autodoc",
"sphinx.ext.mathjax",
"sphinx.ext.viewcode",
"sphinx_copybutton",
"sphinx_design",
"sphinx_rtd_theme",
"breathe",
1 change: 1 addition & 0 deletions docs/source/index.rst
@@ -66,6 +66,7 @@ Usage
usage/how_to_run
usage/api
usage/zerocopy
usage/compute
usage/workflows
.. usage/tests
6 changes: 6 additions & 0 deletions docs/source/usage/api.rst
@@ -22,6 +22,8 @@ For brevity, below the 3D APIs are shown.
pyAMReX classes and functions follow the same structure as the `C++ AMReX APIs <https://amrex-codes.github.io/amrex/doxygen/>`__.


.. _usage-api-base:

Base
----

@@ -186,6 +188,8 @@ Utility
.. autofunction:: amrex.space3d.write_single_level_plotfile


.. _usage-api-amrcore:

AmrCore
-------

@@ -198,6 +202,8 @@ AmrCore
:undoc-members:


.. _usage-api-particles:

Particles
---------

102 changes: 102 additions & 0 deletions docs/source/usage/compute.rst
@@ -0,0 +1,102 @@
.. _usage-compute:

Compute
=======

Building on zero-copy read and write access to data structures, this section presents how to compute in pyAMReX.

Since the pyAMReX data containers are wrappers around the C++ AMReX objects, it is worth reading:

* `AMReX Basics <https://amrex-codes.github.io/amrex/docs_html/Basics_Chapter.html>`__ and
* `AMReX parallelization strategy for MPI+X (e.g., GPUs, CPUs) <https://amrex-codes.github.io/amrex/docs_html/GPU.html>`__.

As a very short, simplified overview, this boils down to:

* AMReX decomposes **index space** into **rectangular, block-structured, regular grids**,
* blocks are often intentionally slightly **over-decomposed**: there is at least one block per compute unit (CPU core or GPU),
* **particles** are chunked/tiled and usually decomposed like the field blocks,
* **refinement levels** are represented as (potentially sparse) levels of blocks.

Computations are thus performed (mostly) on whole blocks, which enables the use of advanced acceleration techniques on CPUs or GPUs.
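
The zero-copy idea behind this pattern can be illustrated without AMReX at all: a non-copying numpy view lets one whole-block operation write straight into the underlying buffer. A minimal sketch, assuming only numpy (the block shape and fill value are arbitrary):

```python
import numpy as np

# stand-in for one block's field data, Fortran-ordered like AMReX
block = np.zeros((8, 8, 8), order="F")

# a non-copying view, analogous to what pyAMReX's .to_numpy() returns
view = block[...]

# whole-block array operation: writes through the view, no copy made
view[()] = 42.0

# the underlying buffer was updated in place
assert block[0, 0, 0] == 42.0 and view.base is block
```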


.. _usage-compute-fields:

Fields
------

The most common data structure to interact with is a `MultiFab <https://amrex-codes.github.io/amrex/docs_html/Basics.html#fabarray-multifab-and-imultifab>`__, which is a collection of boxes with associated field data.
The field data can have more than one component (in the slowest varying index), but all components have the same `staggering/centering <https://amrex-codes.github.io/amrex/docs_html/Basics.html#box>`__.

This is how to iterate over, and potentially compute on, all blocks assigned to the local process in pyAMReX:

.. literalinclude:: ../../../tests/test_multifab.py
:language: python3
:dedent: 4
:start-after: # Manual: Compute Mfab START
:end-before: # Manual: Compute Mfab END

For a complete physics example that uses CPU/GPU-agnostic Python code for computation on fields, see:

* `Heat Equation example <https://github.com/AMReX-Codes/amrex-tutorials/blob/main/GuidedTutorials/HeatEquation/Source/main.py>`__

For many small CPU and GPU examples on how to compute on fields, see the following test cases:

* `MultiFab example <https://github.com/AMReX-Codes/amrex-tutorials/blob/main/GuidedTutorials/MultiFab/main.py>`__

* .. dropdown:: Examples in ``test_multifab.py``

.. literalinclude:: ../../../tests/test_multifab.py
:language: python3
:caption: This file is in ``tests/test_multifab.py``.


.. _usage-compute-particles:

Particles
---------

AMReX `Particles <https://amrex-codes.github.io/amrex/docs_html/Particle_Chapter.html>`__ are stored in the `ParticleContainer <https://amrex-codes.github.io/amrex/docs_html/Particle.html#the-particlecontainer>`__ class.

There are a few small differences in the `iteration over a ParticleContainer <https://amrex-codes.github.io/amrex/docs_html/Particle.html#iterating-over-particles>`__ compared to a ``MultiFab``:

* ``ParticleContainer`` is aware of mesh-refinement levels,
* AMReX supports a variety of particle data layouts (AoS and SoA, plus runtime SoA attributes), which require a few extra calls to access.
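
The AoS/SoA distinction can be sketched with plain numpy, independent of the pyAMReX API (the attribute names and counts below are made up for illustration):

```python
import numpy as np

n = 4  # particles in one chunk

# AoS: one record per particle, components interleaved in memory
aos = np.zeros(n, dtype=[("x", "f8"), ("y", "f8"), ("z", "f8"), ("cpuid", "u8")])
aos["x"] = 0.30  # write one component for every particle in the chunk

# SoA: one contiguous array per attribute
soa_real = [np.zeros(n), np.zeros(n)]  # e.g., two real attributes
for arr in soa_real:
    arr[()] = 42.0
```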

Here is the general structure for computing on particles:

.. literalinclude:: ../../../tests/test_particleContainer.py
:language: python3
:dedent: 4
:start-after: # Manual: Compute PC START
:end-before: # Manual: Compute PC END

For many small CPU and GPU examples on how to compute on particles, see the following test cases:

* .. dropdown:: Examples in ``test_particleContainer.py``

.. literalinclude:: ../../../tests/test_particleContainer.py
:language: python3
:caption: This file is in ``tests/test_particleContainer.py``.

* .. dropdown:: Examples in ``test_aos.py``

.. literalinclude:: ../../../tests/test_aos.py
:language: python3
:caption: This file is in ``tests/test_aos.py``.

* .. dropdown:: Examples in ``test_soa.py``

.. literalinclude:: ../../../tests/test_soa.py
:language: python3
:caption: This file is in ``tests/test_soa.py``.


Other C++ Calls
---------------

pyAMReX exposes many more C++-implemented and GPU-accelerated AMReX functions for :py:class:`~amrex.space3d.MultiFab` and :ref:`particles <usage-api-particles>` to Python, which can be used to set, reduce, rescale, redistribute, etc., the contained data.
Check out the :ref:`detailed API docs <usage-api>` and use further third-party libraries at will on the exposed data structures, replacing the hot loops described above.

You can also leave the Python world in pyAMReX and go back to C++ whenever needed.
For :ref:`some applications <usage_run>`, pyAMReX might act as *scriptable glue* that transports fields and particles from one (C++) function to another without recompilation, by exposing the functions and methods to Python.
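
The performance argument for such precompiled calls is the same one numpy makes: replace a Python-level element loop with one whole-container operation. A numpy stand-in for the pattern (the actual pyAMReX reductions are listed in the API docs referenced above):

```python
import numpy as np

field = np.arange(1_000_000, dtype="f8")

# Python-level hot loop (slow):
#   total = 0.0
#   for v in field:
#       total += v

# one precompiled whole-array operation (fast), analogous to the
# C++-implemented reductions pyAMReX exposes on its containers
total = field.sum()
assert total == 499_999_500_000.0
```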
2 changes: 1 addition & 1 deletion docs/source/usage/zerocopy.rst
@@ -4,7 +4,7 @@ Zero-Copy
=========

pyAMReX is designed to bridge the worlds of block-structured codes and data science.
As such, it includes zero-copy GPU data access for AI/ML, in situ analysis, application coupling by implementing :ref:`standardized data interfaces <developers-implementation`.
As such, it includes zero-copy GPU data access for AI/ML, in situ analysis, application coupling by implementing :ref:`standardized data interfaces <developers-implementation>`.


CPU: numpy
1 change: 1 addition & 0 deletions docs/spack.yaml
@@ -23,5 +23,6 @@ spack:
- py-breathe
- py-recommonmark
- py-pygments
- py-sphinx-copybutton
- py-sphinx-design
- py-sphinx-rtd-theme
6 changes: 3 additions & 3 deletions src/AmrCore/AmrMesh.cpp
@@ -68,9 +68,9 @@ void init_AmrMesh(py::module &m)
>(),
py::arg("rb"), py::arg("max_level_in"), py::arg("n_cell_in"), py::arg("coord"), py::arg("ref_ratios"), py::arg("is_per"))

.def("Verbose", &AmrMesh::Verbose)
.def("max_level", &AmrMesh::maxLevel)
.def("finest_level", &AmrMesh::finestLevel)
.def_property_readonly("verbose", &AmrMesh::Verbose)
.def_property_readonly("max_level", &AmrMesh::maxLevel)
.def_property_readonly("finest_level", &AmrMesh::finestLevel)
.def("ref_ratio", py::overload_cast< >(&AmrMesh::refRatio, py::const_))
.def("ref_ratio", py::overload_cast< int >(&AmrMesh::refRatio, py::const_))
;
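
The switch from ``.def`` to ``.def_property_readonly`` above changes the Python-side calling convention from methods to read-only attributes. In pure-Python terms (a sketch, not the pyAMReX classes themselves):

```python
# before: bound as a method, called with parentheses
class AmrMeshBefore:
    def max_level(self):
        return 0

# after: bound as a read-only property, accessed without parentheses
class AmrMeshAfter:
    @property
    def max_level(self):
        return 0

assert AmrMeshBefore().max_level() == 0
assert AmrMeshAfter().max_level == 0  # no call; assignment raises AttributeError
```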
6 changes: 3 additions & 3 deletions src/Base/MultiFab.cpp
@@ -113,7 +113,7 @@ void init_MultiFab(py::module &m)
.def_property_readonly("num_comp", &FabArrayBase::nComp)
.def_property_readonly("size", &FabArrayBase::size)

.def_property_readonly("nGrowVect", &FabArrayBase::nGrowVect)
.def_property_readonly("n_grow_vect", &FabArrayBase::nGrowVect)

/* data access in Box index space */
.def("__iter__",
@@ -410,8 +410,8 @@

.def("box_array", &MultiFab::boxArray)
.def("dm", &MultiFab::DistributionMap)
.def("n_comp", &MultiFab::nComp)
.def("n_grow_vect", &MultiFab::nGrowVect)
.def_property_readonly("n_comp", &MultiFab::nComp)
.def_property_readonly("n_grow_vect", &MultiFab::nGrowVect)

/* masks & ownership */
// TODO:
48 changes: 43 additions & 5 deletions tests/test_multifab.py
@@ -8,9 +8,47 @@
import amrex.space3d as amr


def test_mfab_numpy(mfab):
"""Used in docs/source/usage/compute.rst"""

# Manual: Compute Mfab START
# finest active MR level, get from a simulation's AmrMesh object, e.g.:
# finest_level = sim.finest_level
finest_level = 0 # no MR

# iterate over all mesh-refinement levels
for lev in range(finest_level + 1):
# get an existing MultiFab, e.g., from a simulation:
# mfab = sim.get_field(lev=lev) # code-specific getter function

# grow (aka guard/ghost/halo) regions
ngv = mfab.n_grow_vect

# get every local block of the field
for mfi in mfab:
# global index space box, including guards
bx = mfi.tilebox().grow(ngv)
print(bx) # note: global index space of this block

# numpy representation: non-copying view, including the
# guard/ghost region
field_np = mfab.array(mfi).to_numpy()

# notes on indexing in field_np:
# - numpy uses locally zero-based indexing
# - layout is F_CONTIGUOUS by default, just like AMReX

# notes:
# Only the next lines are the "HOT LOOP" of the computation.
# For efficiency, use numpy array operations for speed on CPUs.
# For GPUs use .to_cupy() above and compute with cupy or numba.
field_np[()] = 42.0
# Manual: Compute Mfab END


def test_mfab_loop(mfab):
ngv = mfab.nGrowVect
print(f"\n mfab={mfab}, mfab.nGrowVect={ngv}")
ngv = mfab.n_grow_vect
print(f"\n mfab={mfab}, mfab.n_grow_vect={ngv}")

for mfi in mfab:
bx = mfi.tilebox().grow(ngv)
@@ -160,7 +198,7 @@ def test_mfab_ops_cuda_numba(mfab_device):
# https://numba.pydata.org/numba-doc/dev/cuda/cuda_array_interface.html
from numba import cuda

ngv = mfab_device.nGrowVect
ngv = mfab_device.n_grow_vect

# assign 3: define kernel
@cuda.jit
@@ -197,8 +235,8 @@ def test_mfab_ops_cuda_cupy(mfab_device):
import cupyx.profiler

# AMReX -> cupy
ngv = mfab_device.nGrowVect
print(f"\n mfab_device={mfab_device}, mfab_device.nGrowVect={ngv}")
ngv = mfab_device.n_grow_vect
print(f"\n mfab_device={mfab_device}, mfab_device.n_grow_vect={ngv}")

# assign 3
with cupyx.profiler.time_range("assign 3 [()]", color_id=0):
40 changes: 40 additions & 0 deletions tests/test_particleContainer.py
@@ -276,6 +276,46 @@ def test_per_cell(empty_particle_container, std_geometry, std_particle):
assert ncells * std_particle.real_array_data[1] == sum_1


def test_pc_numpy(particle_container, Npart):
"""Used in docs/source/usage/compute.rst"""
pc = particle_container

# Manual: Compute PC START
# code-specific getter function, e.g.:
# pc = sim.get_particles()

# iterate over all mesh-refinement levels (no MR: lev=0)
for lvl in range(pc.finest_level + 1):
# get every local chunk of particles
for pti in pc.iterator(pc, level=lvl):
# default layout: AoS with positions and cpuid
# note: not part of the new PureSoA particle container layout
aos = pti.aos().to_numpy()

# additional compile-time and runtime attributes in SoA format
soa = pti.soa().to_numpy()

# notes:
# Only the next lines are the "HOT LOOP" of the computation.
# For efficiency, use numpy array operations for speed on CPUs.
# For GPUs use .to_cupy() above and compute with cupy or numba.

# print all particle ids in the chunk
print(aos[()]["cpuid"])

# write to all particles in the chunk
aos[()]["x"] = 0.30
aos[()]["y"] = 0.35
aos[()]["z"] = 0.40

for soa_real in soa.real:
soa_real[()] = 42.0

for soa_int in soa.int:
soa_int[()] = 12
# Manual: Compute PC END


@pytest.mark.skipif(
importlib.util.find_spec("pandas") is None, reason="pandas is not available"
)
