Slicing ak.Array in cuda backend breaks #3133

Open
essoca opened this issue May 28, 2024 · 6 comments
Labels
bug (unverified): The problem described would be a bug, but needs to be triaged
gpu: Concerns the GPU implementation (backend = "cuda")

Comments

@essoca
essoca commented May 28, 2024

Version of Awkward Array

2.6.4

Description and code to reproduce

Hey guys,

While playing with simple slices of ak.Arrays, I noticed the following:

>>> import awkward as ak
>>> a = ak.Array([[[0, 1], [2, 3]], [[4, 5], [6, 7]]], backend='cpu')

>>> print(a[:, :, 0])
[[0, 2], [4, 6]]

>>> print(a[:, ::-1])
[[[2, 3], [0, 1]], [[6, 7], [4, 5]]]
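
For reference, here is a minimal plain-Python sketch of what those two slices compute, written with nested lists rather than Awkward Array. The names data, first_items, and reversed_second are illustrative only; the expected values are the ones printed by the cpu backend above.

```python
# Plain-Python reference for the two slices above (no Awkward Array needed).
data = [[[0, 1], [2, 3]], [[4, 5], [6, 7]]]

# a[:, :, 0]: keep the first two dimensions, take element 0 of each innermost list.
first_items = [[inner[0] for inner in outer] for outer in data]

# a[:, ::-1]: keep the first dimension, reverse the order of the second.
reversed_second = [outer[::-1] for outer in data]

print(first_items)      # [[0, 2], [4, 6]]
print(reversed_second)  # [[[2, 3], [0, 1]], [[6, 7], [4, 5]]]
```

Any backend should produce the same values for these slices, so this can serve as a quick sanity check.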

So, in the cpu backend, this works as expected. Now for the cuda backend:

>>> a = ak.Array([[[0, 1], [2, 3]], [[4, 5], [6, 7]]], backend='cuda')

>>> print(a)
[[[0, 1], [2, 3]], [[4, 5], [6, 7]]]

>>> print(a[:, :, 0])

Traceback (most recent call last):
  File ".../miniforge3/envs/qb/lib/python3.10/site-packages/awkward/highlevel.py", line 1065, in __getitem__
    prepare_layout(self._layout[where]),
  File ".../miniforge3/envs/qb/lib/python3.10/site-packages/awkward/contents/content.py", line 512, in __getitem__
    return self._getitem(where)
  File ".../miniforge3/envs/qb/lib/python3.10/site-packages/awkward/contents/content.py", line 557, in _getitem
    out = next._getitem_next(nextwhere[0], nextwhere[1:], None)
  File ".../miniforge3/envs/qb/lib/python3.10/site-packages/awkward/contents/regulararray.py", line 498, in _getitem_next
    self._backend[
  File ".../miniforge3/envs/qb/lib/python3.10/site-packages/awkward/_backends/cupy.py", line 38, in __getitem__
    _cuda_kernels = cuda.initialize_cuda_kernels(cupy)
  File ".../miniforge3/envs/qb/lib/python3.10/site-packages/awkward/_connect/cuda/__init__.py", line 187, in initialize_cuda_kernels
    import awkward._connect.cuda._kernel_signatures
ModuleNotFoundError: No module named 'awkward._connect.cuda._kernel_signatures'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File ".../miniforge3/envs/qb/lib/python3.10/site-packages/awkward/highlevel.py", line 1063, in __getitem__
    with ak._errors.SlicingErrorContext(self, where):
  File ".../miniforge3/envs/qb/lib/python3.10/site-packages/awkward/_errors.py", line 85, in __exit__
    self.handle_exception(exception_type, exception_value)
  File ".../miniforge3/envs/qb/lib/python3.10/site-packages/awkward/_errors.py", line 95, in handle_exception
    raise self.decorate_exception(cls, exception)
ModuleNotFoundError: No module named 'awkward._connect.cuda._kernel_signatures'

This error occurred while attempting to slice

    <Array [[[0, 1], [2, 3]], [[...], ...]] type='2 * var * var * int64'>

with

    (:, :, 0)

Am I missing something here?

@essoca added the bug (unverified) label on May 28, 2024
@agoose77
Collaborator

That's an interesting error. How did you install awkward? Did you pip install from the Git repo perchance?

@jpivarski
Member

This is expected to break because it would use a not-yet-existing kernel, awkward_ListArray_getitem_next_range_carrylength, but your error message says that awkward._connect.cuda._kernel_signatures is missing, which implies that it was installed incorrectly (as @agoose77 said).

If you've ever installed Awkward with the developer instructions, remove it

pip uninstall awkward awkward-cpp

get a new clone of Awkward's git repo, and reinstall, starting from the nox step. As @ManasviGoyal has been adding (and changing) kernels, you need all of the generated header files (including awkward._connect.cuda._kernel_signatures) to be completely up-to-date and consistent.
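
Before reinstalling, it may help to confirm that the generated module really is missing from the environment. The following is a minimal diagnostic sketch using only the standard library; the helper names installed_version and module_available are hypothetical (not part of Awkward Array), and the only Awkward-specific detail is the module path taken from the traceback above.

```python
import importlib.metadata
import importlib.util


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return importlib.metadata.version(dist_name)
    except importlib.metadata.PackageNotFoundError:
        return None


def module_available(module_name):
    """Return True if the module can be located, False otherwise."""
    try:
        # find_spec imports parent packages, so a missing parent raises
        # ModuleNotFoundError (a subclass of ImportError) instead of returning None.
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        return False


# Report what is installed in the current environment.
for dist in ("awkward", "awkward-cpp"):
    print(dist, installed_version(dist))

# Module path copied from the ModuleNotFoundError in the traceback.
print(module_available("awkward._connect.cuda._kernel_signatures"))
```

If the last line prints False while awkward itself is installed, that is consistent with an installation in which the generated files were never produced.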

After that (I just did it), you should see

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/jpivarski/irishep/awkward/src/awkward/highlevel.py", line 1065, in __getitem__
    prepare_layout(self._layout[where]),
                   ~~~~~~~~~~~~^^^^^^^
  File "/home/jpivarski/irishep/awkward/src/awkward/contents/content.py", line 512, in __getitem__
    return self._getitem(where)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/jpivarski/irishep/awkward/src/awkward/contents/content.py", line 557, in _getitem
    out = next._getitem_next(nextwhere[0], nextwhere[1:], None)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jpivarski/irishep/awkward/src/awkward/contents/regulararray.py", line 518, in _getitem_next
    nextcontent._getitem_next(nexthead, nexttail, advanced),
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jpivarski/irishep/awkward/src/awkward/contents/listarray.py", line 757, in _getitem_next
    self._backend[
  File "/home/jpivarski/irishep/awkward/src/awkward/_backends/cupy.py", line 43, in __getitem__
    raise AssertionError(f"CuPyKernel not found: {index!r}")
AssertionError: CuPyKernel not found: ('awkward_ListArray_getitem_next_range_carrylength', <class 'numpy.int64'>, <class 'numpy.int64'>, <class 'numpy.int64'>)

because the awkward_ListArray_getitem_next_range_carrylength kernel still needs to be ported to CUDA. It will be, soon.

@ManasviGoyal
Collaborator

awkward_ListArray_getitem_next_range_carrylength has already been implemented in #3130. It will be merged in a few days.

@essoca
Author

essoca commented May 29, 2024

> That's an interesting error. How did you install awkward? Did you pip install from the Git repo perchance?

Yes @agoose77, I pip-installed it a couple of weeks ago from commit a096f3d, since I wanted to test the fix in #3115.

@jpivarski: when is 2.6.5 released? Will the PR from @ManasviGoyal fixing #3130 be included in that release?

@jpivarski
Member

> awkward_ListArray_getitem_next_range_carrylength has already been implemented in #3130. It will be merged in a few days.

So in principle, doing

git checkout ManasviGoyal/improve-variable-length-loop-kernels

before

nox -s prepare
python -m pip install -v ./awkward-cpp
python -m pip install -e .

(in a freshly cloned directory, so that all of the headers made with nox -s prepare are new) should make slicing work in CUDA.

> @jpivarski: when is 2.6.5 released? Will the PR from @ManasviGoyal fixing #3130 be included in that release?

Awkward 2.6.5 was released yesterday, and it includes #3115.

When @ManasviGoyal is done with #3130, I'll review it and merge it if there are no issues, and if it's useful to put that in a release, I'll do it. #3130 will need a new awkward-cpp, which takes more time and PyPI quota than a regular release, so they tend to be spaced out more, but if you need it, I'll make a release.

@essoca
Author

essoca commented May 30, 2024

@jpivarski: many thanks for the instructions 👍 Being able to slice large data in CUDA (following numpy syntax) is a very important operation, in my opinion. It would be cool if you release it as soon as it is ready!

@jpivarski added the gpu label on May 31, 2024