Spark Murmur3 hash functionality (#7024)
Resolves #6863

Expands the existing murmur3 hashing functionality to match Spark's murmur3 hashing algorithm. Two things change: the tail processing for unaligned bytes (those that do not fill a 4-byte block), and booleans, which are now processed as 32-bit integers rather than single bytes.
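
The two differences can be illustrated with a minimal host-side sketch. This is not the libcudf device code from this change: the function names (`murmur3_standard`, `murmur3_spark`, `hash_bool_spark`) are hypothetical, and the Spark-style tail (each leftover byte sign-extended to 32 bits and mixed as if it were a full block) reflects an assumption about Spark's behavior, not code taken from this diff.

```cpp
#include <cassert>  // only needed for the checks mentioned in the usage note
#include <cstddef>
#include <cstdint>

using std::size_t;
using std::uint32_t;
using std::uint8_t;

inline uint32_t rotl32(uint32_t x, int r) { return (x << r) | (x >> (32 - r)); }

// Per-block key scramble shared by both variants.
inline uint32_t mix_k1(uint32_t k1) {
  k1 *= 0xcc9e2d51u;
  k1 = rotl32(k1, 15);
  k1 *= 0x1b873593u;
  return k1;
}

// Folds a scrambled block into the running hash.
inline uint32_t mix_h1(uint32_t h1, uint32_t k1) {
  h1 ^= k1;
  h1 = rotl32(h1, 13);
  return h1 * 5u + 0xe6546b64u;
}

// Final avalanche; the input length in bytes is mixed in first.
inline uint32_t fmix32(uint32_t h, uint32_t len) {
  h ^= len;
  h ^= h >> 16;
  h *= 0x85ebca6bu;
  h ^= h >> 13;
  h *= 0xc2b2ae35u;
  h ^= h >> 16;
  return h;
}

// Processes the full 4-byte little-endian blocks; identical in both variants.
inline uint32_t hash_blocks(const uint8_t* data, size_t nblocks, uint32_t h1) {
  for (size_t i = 0; i < nblocks; ++i) {
    const uint8_t* p = data + 4 * i;
    uint32_t k1 = uint32_t(p[0]) | (uint32_t(p[1]) << 8) |
                  (uint32_t(p[2]) << 16) | (uint32_t(p[3]) << 24);
    h1 = mix_h1(h1, mix_k1(k1));
  }
  return h1;
}

// Reference MurmurHash3_x86_32 tail: the 1-3 leftover bytes are packed into
// one partial word and XORed into h1 without the rotate/multiply step.
inline uint32_t murmur3_standard(const uint8_t* data, size_t len, uint32_t seed) {
  const size_t nblocks = len / 4;
  uint32_t h1 = hash_blocks(data, nblocks, seed);
  const uint8_t* tail = data + nblocks * 4;
  uint32_t k1 = 0;
  switch (len & 3) {
    case 3: k1 ^= uint32_t(tail[2]) << 16; [[fallthrough]];
    case 2: k1 ^= uint32_t(tail[1]) << 8; [[fallthrough]];
    case 1: k1 ^= tail[0]; h1 ^= mix_k1(k1); break;
    default: break;
  }
  return fmix32(h1, static_cast<uint32_t>(len));
}

// Spark-style tail (assumed): every leftover byte is sign-extended and then
// mixed on its own, exactly like a full block.
inline uint32_t murmur3_spark(const uint8_t* data, size_t len, uint32_t seed) {
  const size_t nblocks = len / 4;
  uint32_t h1 = hash_blocks(data, nblocks, seed);
  for (size_t i = nblocks * 4; i < len; ++i) {
    const uint32_t k1 = static_cast<uint32_t>(
        static_cast<std::int32_t>(static_cast<std::int8_t>(data[i])));
    h1 = mix_h1(h1, mix_k1(k1));
  }
  return fmix32(h1, static_cast<uint32_t>(len));
}

// A boolean is widened to a 32-bit integer (0 or 1) and hashed as one block
// of length 4, not as a single byte.
inline uint32_t hash_bool_spark(bool value, uint32_t seed) {
  return fmix32(mix_h1(seed, mix_k1(value ? 1u : 0u)), 4u);
}
```

For inputs whose length is a multiple of four, both variants return the same value, so only unaligned inputs and booleans are affected. In this sketch, `hash_bool_spark(true, seed)` equals `murmur3_spark` over the four bytes `{1, 0, 0, 0}`, and differs from hashing the single byte `{1}`.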

Authors:
  - Ryan Lee <ryanlee@nvidia.com>
  - rwlee <rwlee@users.noreply.github.com>

Approvers:
  - Jake Hemstad
  - null
  - Robert (Bobby) Evans
  - GALI PREM SAGAR

URL: #7024
rwlee authored Jan 4, 2021
1 parent fc92bb9 commit 8860baf
Showing 9 changed files with 419 additions and 88 deletions.
1 change: 1 addition & 0 deletions cpp/include/cudf/detail/hashing.hpp
@@ -46,6 +46,7 @@ std::unique_ptr<column> md5_hash(
rmm::cuda_stream_view stream = rmm::cuda_stream_default,
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

template <template <typename> class hash_function>
std::unique_ptr<column> serial_murmur_hash3_32(
table_view const& input,
uint32_t seed = 0,