Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[std::span] {load,store}_{le,be} #3707

Merged
merged 7 commits into from
Dec 22, 2023
Merged

Conversation

reneme
Copy link
Collaborator

@reneme reneme commented Sep 19, 2023

Pull Request Dependencies

Description

This has two objectives:

  1. range-based overloads for big/little endian helpers
    The lack of this has been annoying me for some time now. Using ranges facilitates maximum usage convenience while enabling opportunistic compile-time buffer size checks when the passed range has static length information (read: std::array or std::span<T, 42>). Otherwise runtime checks will be performed (BOTAN_ARG_CHECK).
  2. Reduce the overload complexity
    ... by letting the compiler generate most of the generic code as constexpr. I didn't see any slow-downs (in the runtime of the test suite) after applying this.

Here's a godbolt (of a somewhat stripped-down version) to play around with that, if needed: https://godbolt.org/z/sxjhnbePo

Currently, this includes legacy (pointer-based) overloads of the existing helpers. Those just assert that the passed in pointers are big enough and wrap them into properly sized std::spans to forward the calls. In an ideal world we'd need to go through the code base and replace all those calls with statically checkable invocations.

@coveralls
Copy link

coveralls commented Sep 19, 2023

Coverage Status

coverage: 92.053% (-0.006%) from 92.059%
when pulling 5e52e4b on Rohde-Schwarz:span/loadstor
into 3005ae6 on randombit:master.

@reneme reneme force-pushed the span/loadstor branch 2 times, most recently from 340e1c9 to c71ad68 Compare September 20, 2023 07:34
src/lib/utils/ghash/ghash.cpp Outdated Show resolved Hide resolved
@reneme reneme marked this pull request as ready for review September 20, 2023 09:20
Copy link
Owner

@randombit randombit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general this looks like a nice improvement, main concern would be if there are any performance implications. I would think not for the case where endian is known at compile time so we can make use of memcpy, and the targets where it's not known (or anyway assumed) are all relatively obscure, but I'd like to not introduce a regression if we don't have to.

src/lib/utils/mem_ops.h Outdated Show resolved Hide resolved
@reneme
Copy link
Collaborator Author

reneme commented Sep 25, 2023

Rebased after #3711 fixed the CI build.

@reneme reneme force-pushed the span/loadstor branch 2 times, most recently from a64e90b to bd1a6df Compare September 26, 2023 08:48
@reneme reneme added this to the Botan 3.3.0 milestone Sep 27, 2023
@reneme
Copy link
Collaborator Author

reneme commented Sep 27, 2023

Lets postpone this for a moment. I'm hoping that the iOS/Android update will bring more tools to make this easier.

@randombit
Copy link
Owner

I'm going to either update the emscripten build (#3720) or just disable the emscripten CI build, at which point we should be able to rely on having most of C++20.

[*] Outside of std::source_location, and probably 20 other things that we just haven't hit yet...

Comment on lines -180 to +187
uint8_t final_block[GCM_BS];
store_be<uint64_t>(final_block, 8 * ad_len, 8 * text_len);
ghash_update(hash, {final_block, GCM_BS});
std::array<uint8_t, GCM_BS> final_block;

const uint64_t ad_bits = 8 * ad_len;
const uint64_t text_bits = 8 * text_len;
store_be(final_block, ad_bits, text_bits);
ghash_update(hash, final_block);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This patch was needed to explicitly convert size_t to uint64_t on 32-bit platforms.

@reneme
Copy link
Collaborator Author

reneme commented Oct 18, 2023

Rebased after #3760 conflicted with the underlying #3715.

@reneme
Copy link
Collaborator Author

reneme commented Oct 31, 2023

This will need a rebase. But let's review/merge #3715 first and get this done after.

@FAlbertDev FAlbertDev mentioned this pull request Nov 16, 2023
@FAlbertDev
Copy link
Collaborator

I'll review this PR in the following days!

Copy link
Collaborator

@FAlbertDev FAlbertDev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I'm looking forward to using these new interfaces! This will beautify many places, where load/store is used 😃

src/lib/utils/concepts.h Show resolved Hide resolved
src/lib/utils/loadstor.h Show resolved Hide resolved
src/lib/utils/loadstor.h Show resolved Hide resolved
src/lib/utils/loadstor.h Outdated Show resolved Hide resolved
@reneme
Copy link
Collaborator Author

reneme commented Nov 21, 2023

@FAlbertDev @randombit I rebased to master, and addressed the pending review comments (except the docstring remarks).

Request for Comment: Type inferring overloads for load_* and store_*

Explicitly note that I added a commit (b4c057e) that introduces convenience overloads for load_* and store_*, which infer their output type at compile time, if possible. Please hear me out:

I am envisioning those as freestanding helpers for all our byte-stream based APIs (most notably: Buffered_Computation, XOF, BufferSlicer, BufferStuffer, ...). We discussed introducing endian-conversion overloads on several occasions: e.g. for LMS, but also when introducing the XOF API). Instead of scattering endian-conversion overloads in all of such APIs, we could just compose them with those "smart" freestanding functions.

The resulting downstream code, would then look something along those lines:

void parse_some_tls_structure(std::span<const uint8_t> serialized) {
   BufferSlicer bs(serialized);
   const uint16_t tag = load_le(bs.take<2>());  // we could even use `auto` here...
   const uint16_t length = load_le(bs.take<2>());
   // ...
}

std::unique_ptr<XOF> prepare_xof_for_some_pqc_algo(std::span<const uint8_t> seed, uint32_t nonce) {
   auto xof = XOF::create_or_throw("SHAKE-256");
   xof->update(seed);
   xof->update(store_be(nonce)); // instead of adding an overload like `XOF::update_be()`
   return xof;
}

Comparison: without the type inference overloads

Without this load_*, we would need to explicitly declare the integer type (which is redundant, if the byte array length is known at compile time). Admittedly, that's a bit of template kink, though.

// assuming the BufferSlicer example
const uint16_t tag = load_le<uint16_t>(bs.take<2>());
//                           ~~~~~~~~

Similarly, with store_*, we avoid an explicit definition of a bytearray as sink, like so:

// assuming the XOF example
std::array<uint8_t, sizeof(nonce)> sink;
store_be(nonce, sink);
xof->update(sink);

I am arguing that these overloads provide maximum flexibility and convenience (while being statically checked for consistency by the compiler, as much as possible). And they allow easy composability with all modern (span or range-based) interfaces.

Copy link
Collaborator

@FAlbertDev FAlbertDev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I like the static return type deduction! We could discuss if we still want to add load/store members to the BufferSlicer and BufferStuffer. For now, however, I think this PR is ready to merge.

src/lib/utils/loadstor.h Outdated Show resolved Hide resolved
Co-authored-by: Fabian Albert <126883116+FAlbertDev@users.noreply.github.com>
@reneme
Copy link
Collaborator Author

reneme commented Nov 22, 2023

Thanks for the thorough review! @randombit no particular rush with this, but I'd love to have it in (after discussing #3707 (comment)) to get Frodo and LMS adapted to this. Kyber and Dilithium will follow.

@randombit
Copy link
Owner

Change itself looks fine, and a nice cleanup overall.

I do see a minor but persistent performance regression with Blowfish (about .2 cycles/byte slower) but for whatever reason that's the only algorithm I can find that seems slower, so not a problem.

src/lib/utils/concepts.h Show resolved Hide resolved
src/lib/utils/concepts.h Show resolved Hide resolved
@reneme
Copy link
Collaborator Author

reneme commented Dec 22, 2023

🎉

@reneme reneme merged commit 36c2e9e into randombit:master Dec 22, 2023
38 checks passed
@reneme reneme deleted the span/loadstor branch December 22, 2023 14:20
@reneme reneme mentioned this pull request Jan 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants