Feature: Blake2xb and LtHash #40

Open: wants to merge 81 commits into base: develop
Changes from 71 commits
d7b6f45
blake2xb and scaffolding for lthash
Clueliss Apr 24, 2023
606c8c3
fixes for blake, implementation of lthash avx2 and x86-64, also tests
Clueliss Apr 25, 2023
7444c7c
implement sse2 engine, add benchmark
Clueliss Apr 26, 2023
a5f7748
allocator awareness for lthash
Clueliss May 2, 2023
160d799
update readme
Clueliss May 2, 2023
81c8424
handling sodium dep
bigerl May 2, 2023
50918c8
add initializer to make gcc12 maybe happy
Clueliss May 2, 2023
4719e7c
fix cmake with dockerfile
bigerl May 2, 2023
b1b7b38
add boilerplate_init.cmake
bigerl May 2, 2023
82813f7
fix namespaces, untemplate predefined lthashes
Clueliss May 2, 2023
7b45773
install conan for tests and override boost to make lib compilable on …
Clueliss May 3, 2023
5153540
fix typo
Clueliss May 3, 2023
549f6f6
gcc10 and clang10 too old to build library switching to gcc11 and cla…
Clueliss May 3, 2023
cc82372
fix bug in blake
Clueliss May 3, 2023
0925b17
add examples for blake and lthash
Clueliss May 3, 2023
c7521d8
try to fix packaging workflow
Clueliss May 3, 2023
c01d864
actually run new examples
Clueliss May 3, 2023
9db2699
actually run new tests
Clueliss May 3, 2023
715310c
move boost requirement to testing only and remove unneeded stuff from…
Clueliss May 3, 2023
ef52438
implement thin wrapper for blake2b
Clueliss May 4, 2023
011d294
add flag to enable/disable libsodium support
Clueliss May 4, 2023
01bed7d
update readme
Clueliss May 4, 2023
357dd5c
rename Blake2xb to Blake2Xb
Clueliss May 4, 2023
e09ac41
add documentation
Clueliss May 4, 2023
25fc717
fix typo in examples
Clueliss May 4, 2023
6c8b907
gcc-11 was unhappy with template specialization in struct scope
Clueliss May 4, 2023
4b5ecc8
actually run new test and example
Clueliss May 4, 2023
a918f98
fix bug in blake2b init
Clueliss May 4, 2023
1b57910
attempt to fix conan package build
Clueliss May 11, 2023
b5b8005
add hash overload for spans
Clueliss May 11, 2023
66b546c
force std::byte into fundamental overload
Clueliss May 11, 2023
b8cea19
misc improvements and fixes for allocators
Clueliss May 11, 2023
c8ea2e9
inline data into lthash, remove allocator
Clueliss May 12, 2023
d686b34
remove requirement for element number
Clueliss May 12, 2023
c0756e1
add lthash-16-512 to tests and improve test setup
Clueliss May 12, 2023
adb37d9
remove redundant configure call
Clueliss May 12, 2023
dc7de74
add partial constexpr support
Clueliss May 15, 2023
76a81a4
add flag to optimize storage space
Clueliss May 16, 2023
e982340
switch from required to static_assert, add friend, add tests
Clueliss May 16, 2023
0caae72
use unaligned avx instructions to save memory and fix deploy
Clueliss Aug 8, 2023
5daa602
maybe default options
Clueliss Aug 8, 2023
9ae7225
attempt to fix deploy
Clueliss Aug 8, 2023
3a26daa
remove dockerfile
Clueliss Aug 8, 2023
0f4fe67
add blake3
Clueliss Oct 4, 2023
e267471
attempt to fix build
Clueliss Oct 4, 2023
fbedd63
attempt to fix build
liss-h Dec 13, 2023
9a4f313
try fix build
liss-h Dec 13, 2023
51bd850
workflow
liss-h Dec 13, 2023
e547a97
other workflow
liss-h Dec 13, 2023
0bf7865
try again
liss-h Dec 13, 2023
fb7c45c
typo
liss-h Dec 13, 2023
8e488cc
try again
liss-h Dec 13, 2023
1a31f63
another try
liss-h Dec 13, 2023
a8b721e
use ctest
liss-h Dec 13, 2023
1190e38
exclude benchmarks
liss-h Dec 13, 2023
d0088a0
merge develop
liss-h Mar 25, 2024
0f9683b
fix testing
liss-h Mar 25, 2024
216ac87
Merge branch 'refs/heads/develop' into feature/lthash
mcb5637 Aug 5, 2024
95fa6a1
fixes
mcb5637 Aug 5, 2024
22fc7da
more fixes
mcb5637 Aug 5, 2024
4e0c633
fix test_package
mcb5637 Aug 5, 2024
c815333
Merge branch 'develop' into feature/lthash
mcb5637 Aug 28, 2024
b8fcf1e
wip highway math engine
mcb5637 Aug 28, 2024
917bc0a
hwy lthash
mcb5637 Aug 29, 2024
ba51c12
highway conanfile fix
mcb5637 Aug 29, 2024
8654439
fix build
mcb5637 Sep 2, 2024
0c6b853
Merge branch 'develop' into feature/lthash
mcb5637 Sep 2, 2024
51b5f62
skip benchmarks in ci
mcb5637 Sep 2, 2024
bc403e9
fix regex
mcb5637 Sep 2, 2024
a780a1b
with_blake
mcb5637 Sep 3, 2024
abd8b8e
fixes
mcb5637 Sep 3, 2024
3a7c817
update readme
mcb5637 Sep 10, 2024
6a1aa8a
fix build without sodium/highway
mcb5637 Sep 10, 2024
d0e0e7f
remove old blake3 files
mcb5637 Sep 19, 2024
afeff8f
blake3 download from cmake
mcb5637 Sep 19, 2024
b6eb95f
fix download
mcb5637 Sep 19, 2024
84a3f21
move files
mcb5637 Sep 23, 2024
dffcac2
copy licenses
mcb5637 Sep 23, 2024
abb701a
fix license paths
mcb5637 Sep 24, 2024
d472d8e
fix build, test_package, readme
mcb5637 Oct 2, 2024
9ae16c7
add cmake comment
mcb5637 Oct 2, 2024
4 changes: 2 additions & 2 deletions .github/workflows/code_testing.yml
@@ -62,7 +62,7 @@ jobs:
uses: dice-group/cpp-conan-release-reusable-workflow/.github/actions/add_conan_provider@main

- name: Configure CMake
run: cmake -DCMAKE_BUILD_TYPE=Debug -DBUILD_TESTING=On -DBUILD_EXAMPLES=On -DCMAKE_PROJECT_TOP_LEVEL_INCLUDES=conan_provider.cmake -G Ninja -B build .
run: cmake -DCMAKE_BUILD_TYPE=Debug -DBUILD_TESTING=On -DWITH_BLAKE=On -DBUILD_EXAMPLES=On -DCMAKE_PROJECT_TOP_LEVEL_INCLUDES=conan_provider.cmake -G Ninja -B build .
env:
CC: ${{ steps.install_cc.outputs.cc }}
CXX: ${{ steps.install_cc.outputs.cxx }}
@@ -74,7 +74,7 @@ jobs:

- name: Run tests
working-directory: build
run: ctest --verbose -j2
run: ctest --verbose -j2 -E ".*Benchmark.*"

- name: Run examples
working-directory: build
4 changes: 2 additions & 2 deletions .github/workflows/detect-pobr-diff.yml
@@ -16,8 +16,8 @@ jobs:
conan-version: 2.3.1
base-branch: ${{ github.base_ref }}
search-path: >
include/dice/hash/internal
abi-version-header: include/dice/hash/version.hpp
src/dice/hash/internal
abi-version-header: src/dice/hash/version.hpp
abi-version-const: dice::hash::pobr_version
secrets:
CONAN_USER: ""
2 changes: 1 addition & 1 deletion .github/workflows/publish-release.yml
@@ -17,7 +17,7 @@ jobs:
with:
public_artifactory: true
os: ubuntu-22.04
compiler: clang-14
compiler: clang-17
cmake-version: 3.24.0
conan-version: 2.3.0
secrets:
3 changes: 2 additions & 1 deletion .gitignore
@@ -163,4 +163,5 @@ modules.xml

test_package/build
conan_provider.cmake
include/dice/hash/version.hpp
/src/dice/hash/version.hpp
/src/dice/hash/blake/internal/blake3/libblake3.pc
51 changes: 46 additions & 5 deletions CMakeLists.txt
@@ -7,6 +7,9 @@ project(
HOMEPAGE_URL "https://dice-group.github.io/dice-hash/")
set(POBR_VERSION 1) # Persisted Object Binary Representation Version

include(cmake/boilerplate_init.cmake)
boilerplate_init()

# set gcc-10 and clang-10 as minimum versions see
# https://stackoverflow.com/questions/14933172/how-can-i-add-a-minimum-compiler-version-requisite#14934542
set(MIN_COMPILER_VERSION_GCC "10.0.0")
@@ -25,23 +28,61 @@ else()
MESSAGE(WARNING "Could not verify that your compiler (${CMAKE_CXX_COMPILER}) supports all needed features.")
endif()

configure_file(${CMAKE_CURRENT_SOURCE_DIR}/cmake/version.hpp.in ${CMAKE_CURRENT_SOURCE_DIR}/include/dice/hash/version.hpp)
configure_file(${CMAKE_CURRENT_SOURCE_DIR}/cmake/version.hpp.in ${CMAKE_CURRENT_SOURCE_DIR}/src/dice/hash/version.hpp)

OPTION(WITH_BLAKE "Enable usage of the external library sodium for Blake2b, Blake2Xb and LtHash support" OFF)

if (PROJECT_IS_TOP_LEVEL)
if (BUILD_TESTING)
set(CONAN_INSTALL_ARGS "${CONAN_INSTALL_ARGS};-o=&:with_test_deps=True")
endif ()
if (WITH_BLAKE)
set(CONAN_INSTALL_ARGS "${CONAN_INSTALL_ARGS};-o=&:with_blake=True")
endif ()
endif ()

add_library(${PROJECT_NAME} INTERFACE)
if (WITH_BLAKE)
find_package(sodium QUIET) # try canonical name first
set(sodium_dep "sodium")
if (NOT sodium_FOUND)
find_package(libsodium QUIET) # next try the name used by conan
set(sodium_dep "libsodium::libsodium")
if (NOT libsodium_FOUND)
include(cmake/FindSodium.cmake) # finally try the FindSodium.cmake provided with this repo
set(sodium_dep "sodium")
if (NOT Sodium_FOUND)
message(FATAL_ERROR "Sodium was not found.")
endif ()
endif ()
endif ()

find_package(highway REQUIRED)
endif ()

add_subdirectory(src/dice/hash/blake/internal/blake3)

add_library(${PROJECT_NAME}
src/dice/hash/lthash/MathEngine_Hwy.cpp
)
add_library(${PROJECT_NAME}::${PROJECT_NAME} ALIAS ${PROJECT_NAME})

target_link_libraries(${PROJECT_NAME}
INTERFACE blake3
)

target_include_directories(
${PROJECT_NAME}
INTERFACE $<BUILD_INTERFACE:${CMAKE_CURRENT_SOURCE_DIR}/include>)
PUBLIC $<BUILD_INTERFACE:${CMAKE_CURRENT_SOURCE_DIR}/src>)

if (WITH_BLAKE)
target_link_libraries(${PROJECT_NAME}
INTERFACE ${sodium_dep}
PRIVATE highway::highway
)
endif ()

include(cmake/install_interface_library.cmake)
install_interface_library(${PROJECT_NAME} "include")
include(cmake/install_library.cmake)
install_cpp_library(${PROJECT_NAME} "src")

if(PROJECT_IS_TOP_LEVEL AND BUILD_TESTING)
include(CTest)
58 changes: 51 additions & 7 deletions README.md
@@ -7,17 +7,19 @@ dice-hash provides a framework to generate stable hashes. It provides state-of-t
- [wyhash](https://github.com/wangyi-fudan/wyhash)
- "martinus", the internal hash function from [robin-hood-hashing](https://github.com/martinus/robin-hood-hashing)

These three additional general-purpose hash functions are also (optionally) provided:
- [Blake2b](https://www.blake2.net)
- [Blake2Xb](https://www.blake2.net/blake2x.pdf)
- [LtHash](https://engineering.fb.com/2019/03/01/security/homomorphic-hashing)

**📦 STL out of the box:** dice-hash supports many common STL types already:
arithmetic types like `bool`, `int`, `double`, ... etc.; collections like `std::unordered_map/set`, `std::map/set`, `std::vector`, `std::tuple`, `std::pair`, `std::optional`, `std::variant`, `std::array` and; all combinations of them.

**🔩 extensible:** dice-hash supports you with helper functions to define hashes for your own classes. Checkout [usage](#usage).



**🔩 extensible:** dice-hash supports you with helper functions to define hashes for your own classes. Check out [usage](#usage).

## Requirements

A C++20 compatible compiler. Code was only tested on x86_64.
- A C++20 compatible compiler. Code was only tested on x86_64.
- If you want to use [Blake2b](https://www.blake2.net), [Blake2Xb](https://www.blake2.net/blake2x.pdf) or [LtHash](https://engineering.fb.com/2019/03/01/security/homomorphic-hashing): [libsodium](https://doc.libsodium.org/), either via conan or a local system installation. For more details, see "Usage for general data hashing" below.

## Include it into your projects

@@ -55,13 +57,14 @@ make -j tests_dice_hash
Note: This example uses conan as dependency provider, other providers are possible.
See https://cmake.org/cmake/help/latest/guide/using-dependencies/index.html#dependency-providers

## usage
## Usage for C++ container hashing
You need to include a single header:
```c++
#include <dice/hash.hpp>
```

The hash is already defined for a lot of common types. In that case you can use the `DiceHash` just like `std::hash`.
This means these hashes return `size_t`; if you need larger hashes, skip to the section below.
```c++
dice::hash::DiceHash<int> hash;
hash(42);
@@ -104,3 +107,44 @@ One simple example can be found [here](examples/customContainer.cpp).

If you want to use `DiceHash` in a different structure (like `std::unordered_map`), you will need to set `DiceHash` as the correct template parameter.
[This](examples/usageForUnorderedSet.cpp) is one example.

## Usage for general data hashing
**The hash functions mentioned in this section are enabled/disabled using the feature flag `WITH_BLAKE=ON/OFF`.**
**Enabling this flag (it is `OFF` by default) results in [libsodium](https://doc.libsodium.org/) being required as a dependency.**
**If using conan, [libsodium](https://doc.libsodium.org/) will be fetched using conan, otherwise dice-hash will look for a local system installation.**

The hashes mentioned here are not meant to be used in C++ containers as they do _not_ return `size_t`.
They are instead meant as general hashing functions for arbitrary data.

### [Blake2b](https://www.blake2.net/) - ["fast secure hashing"](https://www.blake2.net/) (with output sizes from 16 bytes up to 64 bytes)
["BLAKE2 is a cryptographic hash function faster than MD5, SHA-1, SHA-2, and SHA-3, yet is at least as secure as the latest standard SHA-3."](https://www.blake2.net/)

To use it you need to include
```c++
#include <dice/hash/blake2/Blake2b.hpp>
```
For a usage example see: [examples/blake2b.cpp](examples/blake2b.cpp).

### [Blake2Xb](https://www.blake2.net/blake2x.pdf) - arbitrary length hashing based on [Blake2b](https://www.blake2.net/)
Blake2Xb is a hash function that produces hashes of arbitrary length.

To use it you need to include
```c++
#include <dice/hash/blake2/Blake2xb.hpp>
```
For a usage example see: [examples/blake2xb.cpp](examples/blake2xb.cpp).

### [LtHash](https://engineering.fb.com/2019/03/01/security/homomorphic-hashing/) - homomorphic/multiset hashing
LtHash is a multiset/homomorphic hash function, meaning, instead of working on streams of data, it digests
individual "objects". This means you can add and remove "objects" to/from an `LtHash` (object by object)
as if it were a multiset and then read the hash that would result from hashing that multiset.

Small non-code example that shows the basic principle:
> LtHash({apple}) + LtHash({banana}) - LtHash({peach}) + LtHash({banana}) = LtHash({apple<sup>1</sup>, banana<sup>2</sup>, peach<sup>-1</sup>})
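
The principle above can be sketched with a toy multiset hash. This is a minimal illustration only, under loud assumptions: `ToyMultisetHash`, `expand`, `add`, `remove` and `demo` are hypothetical names, not dice-hash's `LtHash` API, and the FNV-1a/splitmix64 expansion is a stand-in for the extendable-output hash a real LtHash would use, so it is not cryptographically secure. It only shows why lane-wise modular addition and subtraction make the result independent of the order of operations.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Toy multiset hash for illustration only.
// Assumption: NOT dice-hash's LtHash API and NOT cryptographically secure.
struct ToyMultisetHash {
    static constexpr std::size_t lanes = 8;
    std::array<std::uint16_t, lanes> state{}; // all zero: the empty multiset

    // Expand one object into `lanes` pseudo-random 16-bit values
    // (FNV-1a seed, then splitmix64 steps; a stand-in for a real XOF).
    static std::array<std::uint16_t, lanes> expand(std::string const &obj) {
        std::uint64_t x = 0xcbf29ce484222325ULL; // FNV-1a offset basis
        for (unsigned char c : obj) {
            x ^= c;
            x *= 0x100000001b3ULL; // FNV-1a prime
        }
        std::array<std::uint16_t, lanes> out{};
        for (auto &lane : out) {
            x += 0x9E3779B97F4A7C15ULL;
            std::uint64_t z = x;
            z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
            z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
            lane = static_cast<std::uint16_t>(z ^ (z >> 31));
        }
        return out;
    }

    // Adding an object: lane-wise addition modulo 2^16.
    ToyMultisetHash &add(std::string const &obj) {
        auto const e = expand(obj);
        for (std::size_t i = 0; i < lanes; ++i) {
            state[i] = static_cast<std::uint16_t>(state[i] + e[i]);
        }
        return *this;
    }

    // Removing an object: lane-wise subtraction modulo 2^16.
    ToyMultisetHash &remove(std::string const &obj) {
        auto const e = expand(obj);
        for (std::size_t i = 0; i < lanes; ++i) {
            state[i] = static_cast<std::uint16_t>(state[i] - e[i]);
        }
        return *this;
    }

    friend bool operator==(ToyMultisetHash const &a, ToyMultisetHash const &b) {
        return a.state == b.state;
    }
};

// Mirrors the apple/banana/peach example: operation order does not matter,
// and removing an object undoes adding it.
inline void demo() {
    ToyMultisetHash a;
    a.add("apple").add("banana").remove("peach").add("banana");
    ToyMultisetHash b;
    b.remove("peach").add("banana").add("banana").add("apple");
    assert(a == b);

    ToyMultisetHash c;
    c.add("apple").remove("apple");
    assert(c == ToyMultisetHash{});
}
```

Because every lane is combined with commutative modular arithmetic, both sequences in `demo()` end in the same state. The real `LtHash` in this PR may differ in lane width, lane count and the underlying hash.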

To use it you need to include
```c++
#include <dice/hash/lthash/LtHash.hpp>
// automatically includes <dice/hash/blake/Blake3.hpp>
```
For a usage example see [examples/ltHash.cpp](examples/ltHash.cpp).