Skip to content

Commit

Permalink
Memory Profiling (#15866)
Browse files Browse the repository at this point in the history
Use [RMM's new memory profiler](rapidsai/rmm#1563) to profile all functions already decorated with `_cudf_nvtx_annotate`.

Example 
```python
import cudf
from cudf.utils.performance_tracking import print_memory_report

cudf.set_option("memory_profiling", True)

df1 = cudf.DataFrame({"a": [1, 2, 3]})
df2 = cudf.DataFrame({"a": [2, 2, 3]})
df3 = df1.merge(df2)

print_memory_report()
```

Output:
```
Memory Profiling
================

Ordered by: memory_peak

ncalls     memory_peak    memory_total  filename:lineno(function)
     1             272             688  /home/mkristensen/apps/miniforge3/envs/rmm-cudf-0527/lib/python3.11/site-packages/cudf/core/dataframe.py:4072(DataFrame.merge)
     2              32              64  /home/mkristensen/apps/miniforge3/envs/rmm-cudf-0527/lib/python3.11/site-packages/cudf/core/dataframe.py:1043(DataFrame._init_from_dict_like)
     2              32              64  /home/mkristensen/apps/miniforge3/envs/rmm-cudf-0527/lib/python3.11/site-packages/cudf/core/dataframe.py:690(DataFrame.__init__)
     2               0               0  /home/mkristensen/apps/miniforge3/envs/rmm-cudf-0527/lib/python3.11/site-packages/cudf/core/dataframe.py:1131(DataFrame._align_input_series_indices)
     7               0               0  /home/mkristensen/apps/miniforge3/envs/rmm-cudf-0527/lib/python3.11/site-packages/cudf/core/index.py:214(RangeIndex.__init__)
     6               0               0  /home/mkristensen/apps/miniforge3/envs/rmm-cudf-0527/lib/python3.11/site-packages/cudf/core/index.py:424(RangeIndex.__len__)
     4               0               0  /home/mkristensen/apps/miniforge3/envs/rmm-cudf-0527/lib/python3.11/site-packages/cudf/core/frame.py:271(Frame.__len__)
     2               0               0  /home/mkristensen/apps/miniforge3/envs/rmm-cudf-0527/lib/python3.11/site-packages/cudf/core/dataframe.py:3195(DataFrame._insert)
     2               0               0  /home/mkristensen/apps/miniforge3/envs/rmm-cudf-0527/lib/python3.11/site-packages/cudf/core/index.py:270(RangeIndex.name)
     2               0               0  /home/mkristensen/apps/miniforge3/envs/rmm-cudf-0527/lib/python3.11/site-packages/cudf/core/index.py:369(RangeIndex.copy)
     5               0               0  /home/mkristensen/apps/miniforge3/envs/rmm-cudf-0527/lib/python3.11/site-packages/cudf/core/frame.py:134(Frame._from_data)
     2               0               0  /home/mkristensen/apps/miniforge3/envs/rmm-cudf-0527/lib/python3.11/site-packages/cudf/core/frame.py:1039(Frame._copy_type_metadata)
     2               0               0  /home/mkristensen/apps/miniforge3/envs/rmm-cudf-0527/lib/python3.11/site-packages/cudf/core/indexed_frame.py:315(IndexedFrame._from_columns_like_self)
```

Authors:
  - Mads R. B. Kristensen (https://github.com/madsbk)

Approvers:
  - Mark Harris (https://github.com/harrism)
  - Lawrence Mitchell (https://github.com/wence-)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #15866
  • Loading branch information
madsbk authored Jun 28, 2024
1 parent e35da6b commit 6b04fd3
Show file tree
Hide file tree
Showing 28 changed files with 885 additions and 718 deletions.
1 change: 1 addition & 0 deletions docs/cudf/source/user_guide/api_docs/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -26,3 +26,4 @@ This page provides a list of all publicly accessible modules, methods and classe
options
extension_dtypes
pylibcudf/index.rst
performance_tracking
12 changes: 12 additions & 0 deletions docs/cudf/source/user_guide/api_docs/performance_tracking.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
.. _api.performance_tracking:

====================
Performance Tracking
====================

.. currentmodule:: cudf.utils.performance_tracking
.. autosummary::
:toctree: api/

get_memory_records
print_memory_report
1 change: 1 addition & 0 deletions docs/cudf/source/user_guide/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,5 +16,6 @@ options
performance-comparisons/index
PandasCompat
copy-on-write
memory-profiling
pandas-2.0-breaking-changes
```
44 changes: 44 additions & 0 deletions docs/cudf/source/user_guide/memory-profiling.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
(memory-profiling-user-doc)=

# Memory Profiling

Peak memory usage is a common concern in GPU programming because GPU memory is typically smaller than available CPU memory. To easily identify memory hotspots, cuDF provides a memory profiler. It comes with an overhead so avoid using it in performance-sensitive code.

## Enabling Memory Profiling

First, enable memory profiling in RMM by calling {py:func}`rmm.statistics.enable_statistics()`. This adds a statistics resource adaptor to the current RMM memory resource, which enables cuDF to access memory profiling information. See the [RMM documentation](https://docs.rapids.ai/api/rmm/stable/guide/#memory-statistics-and-profiling) for more details.

Second, enable memory profiling in cuDF by setting the `memory_profiling` option to `True`. Use {py:func}`cudf.set_option` or set the environment variable ``CUDF_MEMORY_PROFILING=1`` prior to the launch of the Python interpreter.

To get the result of the profiling, use {py:func}`cudf.utils.performance_tracking.print_memory_report` or access the raw profiling data by using: {py:func}`cudf.utils.performance_tracking.get_memory_records`.

### Example
In the following, we enable profiling, do some work, and then print the profiling results:

```python
>>> import cudf
>>> from cudf.utils.performance_tracking import print_memory_report
>>> from rmm.statistics import enable_statistics
>>> enable_statistics()
>>> cudf.set_option("memory_profiling", True)
>>> cudf.DataFrame({"a": [1, 2, 3]}) # Some work
a
0 1
1 2
2 3
>>> print_memory_report() # Pretty print the result of the profiling
Memory Profiling
================

Legends:
ncalls - number of times the function or code block was called
memory_peak - peak memory allocated in function or code block (in bytes)
memory_total - total memory allocated in function or code block (in bytes)

Ordered by: memory_peak

ncalls memory_peak memory_total filename:lineno(function)
1 32 32 cudf/core/dataframe.py:690(DataFrame.__init__)
2 0 0 cudf/core/index.py:214(RangeIndex.__init__)
6 0 0 cudf/core/index.py:424(RangeIndex.__len__)
```
4 changes: 2 additions & 2 deletions python/cudf/cudf/core/buffer/spill_manager.py
Original file line number Diff line number Diff line change
Expand Up @@ -18,14 +18,14 @@
import rmm.mr

from cudf.options import get_option
from cudf.utils.nvtx_annotation import _cudf_nvtx_annotate
from cudf.utils.performance_tracking import _performance_tracking
from cudf.utils.string import format_bytes

if TYPE_CHECKING:
from cudf.core.buffer.spillable_buffer import SpillableBufferOwner

_spill_cudf_nvtx_annotate = partial(
_cudf_nvtx_annotate, domain="cudf_python-spill"
_performance_tracking, domain="cudf_python-spill"
)


Expand Down
7 changes: 4 additions & 3 deletions python/cudf/cudf/core/buffer/spillable_buffer.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@
from typing import TYPE_CHECKING, Any, Literal

import numpy
import nvtx
from typing_extensions import Self

import rmm
Expand All @@ -21,7 +22,7 @@
host_memory_allocation,
)
from cudf.core.buffer.exposure_tracked_buffer import ExposureTrackedBuffer
from cudf.utils.nvtx_annotation import _get_color_for_nvtx, annotate
from cudf.utils.performance_tracking import _get_color_for_nvtx
from cudf.utils.string import format_bytes

if TYPE_CHECKING:
Expand Down Expand Up @@ -200,7 +201,7 @@ def spill(self, target: str = "cpu") -> None:
)

if (ptr_type, target) == ("gpu", "cpu"):
with annotate(
with nvtx.annotate(
message="SpillDtoH",
color=_get_color_for_nvtx("SpillDtoH"),
domain="cudf_python-spill",
Expand All @@ -218,7 +219,7 @@ def spill(self, target: str = "cpu") -> None:
# trigger a new call to this buffer's `spill()`.
# Therefore, it is important that spilling-on-demand doesn't
# try to unspill an already locked buffer!
with annotate(
with nvtx.annotate(
message="SpillHtoD",
color=_get_color_for_nvtx("SpillHtoD"),
domain="cudf_python-spill",
Expand Down
Loading

0 comments on commit 6b04fd3

Please sign in to comment.