Refactor: AlignmentBuffer<> helper for block-oriented Hashes #3693

reneme · 2023-09-12T09:36:02Z

Introduction and Motivation

As promised I am looking into a way to simplify the alignment code in block-oriented algorithms. To my mind, there is quite a bit of code duplication in many implementations. Also, its fairly critical code that handles raw memory, offsets and internal state that could easily go wrong.

In fact, I even found an issue in the existing code, while working on this. Below is the ::add_data() implementation for MDx_HashFunction in Botan 3.1.1. Latest master is slightly different but contains the same issue.

Let's assume this is called twice on a pristine instance of SHA-512 (block size 128 bytes):

add_data({...}, 1)
add_data({...}, 126)

The first call fills m_buffer with one byte and sets m_position = 1. Both is happening at buffer_insert at the very end of the function (because m_position was initially 0).
In the second call we append an additional 126 bytes to m_buffer (in the first buffer_insert(), as m_position is initially 1). The block still isn't full, so no compress_n() is invoked.
The issue: We again set m_position = 127 only at the very end after the second buffer_insert (that copies the 126 bytes once more into the same buffer location). The offset calculations check out, so there's no functional problem. But frankly, to me this doesn't look like intentional control flow.

Side note: one needs to know that buffer_insert() "fills m_buffer with as many bytes from input as possible, but at most m_buffer.size() - m_position".

void MDx_HashFunction::add_data(const uint8_t input[], size_t length) {
   const size_t block_len = static_cast<size_t>(1) << m_block_bits;

   m_count += length;

   if(m_position) {
      buffer_insert(m_buffer, m_position, input, length);

      if(m_position + length >= block_len) {
         compress_n(m_buffer.data(), 1);
         input += (block_len - m_position);
         length -= (block_len - m_position);
         m_position = 0;
      }
   }

   // Just in case the compiler can't figure out block_len is a power of 2
   const size_t full_blocks = length >> m_block_bits;
   const size_t remaining = length & (block_len - 1);

   if(full_blocks > 0) {
      compress_n(input, full_blocks);
   }

   buffer_insert(m_buffer, m_position, input + full_blocks * block_len, remaining);
   m_position += remaining;
}

Description of Proposed Improvement

This is an attempt on encapsulating the intermittent buffering into an opinionated helper class with a (hopefully) clear and minimal interface. @randombit I'd be grateful for feedback early-on. I'll describe the details of the helper class' API in Doxygen comments.

With the helper, MDx_HashFunction::add_data() would roughly look like that and not pose the issue described above:

void MDx_HashFunction::add_data(std::span<const uint8_t> input) {
   BufferSlicer in(input);

   while(!in.empty()) {
      if(const auto one_block = m_buffer.handle_unaligned_data(in)) {
         compress_n(one_block->data(), 1);
      }

      if(m_buffer.in_alignment()) {
         const auto [aligned_data, full_blocks] = m_buffer.aligned_data_to_process(in);
         if(full_blocks > 0) {
            compress_n(aligned_data.data(), full_blocks);
         }
      }
   }

   m_count += input.size();
}

A quick trail benchmark run (average of 4 runs ./botan speed SHA-1) :

Branch	MiB/s
this1	1830
master	1829

coveralls · 2023-09-12T10:25:24Z

coverage: 91.756% (+0.03%) from 91.726% when pulling 66438aa on Rohde-Schwarz:refactor/alignment_buffer into a8d4d18 on randombit:master.

randombit · 2023-09-12T18:17:48Z

Haven't reviewed this carefully yet but the broad idea looks ok to me

reneme · 2023-09-12T18:29:44Z

Haven't reviewed this carefully yet but the broad idea looks ok to me

That'll do for now, thanks. I'll push this forward tomorrow.

reneme · 2023-09-13T09:51:05Z

This converged nicely for all hash implementations. The HashFunction::add_data() implementations are now basically all equivalent and unifying that into a single base implementation (using an abstract compress() method) should be easy to do.

I'll add more unit tests and documentation for AlignmentBuffer<> after lunch and will remove the "Draft" status afterwards.

Future Work

Applying this to the hash implementations is probably enough complexity for this pull request. Other places (e.g. block cipher modes, MACs) might benefit from this refactoring as well, though.

The Keccak permutation state may or may not benefit from AlignmentBuffer<> as well. Though, with the Keccak state not being a simple byte buffer, there's more work to be done.

@randombit Are there more places?

randombit

This is nice, thanks. I left a few nit comments plus some ideas for possible improvements, ptal.

src/lib/hash/blake2/blake2b.cpp

src/lib/utils/alignment_buffer.h

randombit · 2023-09-13T14:14:11Z

src/lib/utils/alignment_buffer.h

+ * @tparam T  the container type to use for the alignment buffer
+ */
+template <concepts::contiguous_container T>
+class AlignmentBuffer {


WDYT about making defer_final_block a template parameter? Since I don't think we do (or would ever) have a use case where you would toggle between them at runtime.

Similarly, but perhaps more contentiously, if we made the buffer size a template parameter then

a) we might gain some minimal performance improvements due to the the block size being known to be fixed, eg when dividing by the buffer size.
b) (more importantly) we could skip using a container entirely and store the buffer in a simple array, avoiding the heap allocation.

This would be less flexible though, which might matter for some future case where we are doing runtime dispatch of the specific algorithm based on hardware support. Eg we want to align to 16 bytes if running regular C++ but 128 bytes if we're doing something with SIMD.

Also it would require exposing the internal block size in the headers of the various algorithms, but these headers are now internal only anyway so this doesn't seem so bad.

I'm certainly in favor of making things more "constexpr" here. Both the block size and the defer_final_block make sense as template parameters. Also I've been tinkering with a template-customizable append() function (to allow using this in Keccak where -- among other things -- we'd need to XOR instead of "overwrite"). The latter is certainly future work, though.

Regarding std::array<>: I had that in mind, but didn't go for it because we can't use the mlocked allocator for stack allocations. Is that something we can (and want to) live with?

I don't think the mlock aspect is that critical tbh. We spray intermediate values everywhere on the stack in the course of computing anything, and it's pretty unavoidable that someone with access to a core dump or swap file is going to be able to recover information.

If we go the array route should in any case scrub the memory on clear or destructor; that already puts us ahead of the game compared to most implementations I've seen.

Fair enough. I'll look into that tomorrow.

I applied both suggestions, making defer_final_block and the block_size into template parameters. The latter proved problematic for MDx_HashFunction as this info is currently a runtime information (behind a virtual method). Once #3550 is finished, that shouldn't be a problem anymore, though.

src/lib/utils/alignment_buffer.h

reneme · 2023-09-13T16:11:46Z

Thanks for the swift review! Re:BOTAN_STATE_CHECK(): I changed the asserts (and also used BOTAN_ARG_CHECK where applicable). But, is that really correct? This class is internal and tightly coupled with the buffered computation use-case. If any of those ever throw, it is because we made a programming mistake in the consumer implementations. No particularly hard feelings about this, though, just double-checking.

randombit · 2023-09-13T16:27:37Z

Hmm you have a point there. In a nominal sense Invalid_State is the correct throw ("A request was made on an object while the object was in a state where the operation cannot be performed.") but you're right that since this is an internal-only API it throwing implies a bug elsewhere within the library, rather than some usage error by the application.

reneme · 2023-09-13T17:17:18Z

since this is an internal-only API it throwing implies a bug elsewhere within the library, rather than some usage error by the application.

I read that as "let's revert 5a3ff50", right?

This commit contains a few additional tests for the state checks. I'm strongly assuming that it is not sensible to try and check asserts in the unittests, though. Particularly after introducing ./configure.py --terminate-on-assert. 😅

randombit · 2023-09-13T18:18:07Z

Yeah let's revert that, sorry for the confusion.

src/lib/utils/alignment_buffer.h

src/lib/hash/blake2/blake2b.h

randombit

Looks really good - left a couple of nitpicky comments

src/lib/hash/skein/skein_512.h

src/lib/utils/alignment_buffer.h

reneme · 2023-09-17T19:00:11Z

For the record: I pushed another change after your approval (documenting the enum class AlignmentBufferFinalBlock). Please check and merge at your convenience.

randombit

Thanks

reneme force-pushed the refactor/alignment_buffer branch from 4135030 to cd9f6b6 Compare September 12, 2023 09:47

reneme force-pushed the refactor/alignment_buffer branch 2 times, most recently from 53a1765 to 71d6866 Compare September 12, 2023 11:17

reneme force-pushed the refactor/alignment_buffer branch 2 times, most recently from 1b7434e to 028853c Compare September 13, 2023 09:37

reneme force-pushed the refactor/alignment_buffer branch 2 times, most recently from b5c4340 to f8dfdc3 Compare September 13, 2023 12:48

reneme marked this pull request as ready for review September 13, 2023 12:48

reneme force-pushed the refactor/alignment_buffer branch 3 times, most recently from 59d13ac to 6edaf07 Compare September 13, 2023 13:29

reneme mentioned this pull request Sep 13, 2023

Remove MDx_HashFunction #3550

Closed

reneme changed the title ~~Refactor: AlignmentBuffer<> helper for block-oriented Algorithms~~ Refactor: AlignmentBuffer<> helper for block-oriented Hashes Sep 13, 2023

reneme requested a review from randombit September 13, 2023 13:40

randombit reviewed Sep 13, 2023

View reviewed changes

reneme force-pushed the refactor/alignment_buffer branch from 6edaf07 to 5a3ff50 Compare September 13, 2023 16:07

reneme force-pushed the refactor/alignment_buffer branch 3 times, most recently from 8e70dfc to f28bffe Compare September 14, 2023 15:52

randombit reviewed Sep 14, 2023

View reviewed changes

src/lib/utils/alignment_buffer.h Show resolved Hide resolved

src/lib/hash/blake2/blake2b.h Outdated Show resolved Hide resolved

randombit reviewed Sep 17, 2023

View reviewed changes

src/lib/hash/skein/skein_512.h Outdated Show resolved Hide resolved

src/lib/utils/alignment_buffer.h Outdated Show resolved Hide resolved

reneme force-pushed the refactor/alignment_buffer branch from 4d7f083 to 66bd11a Compare September 17, 2023 17:46

randombit approved these changes Sep 17, 2023

View reviewed changes

reneme added 6 commits September 17, 2023 20:50

Add an AlignmentBuffer<> helper

ea815b3

Use AlignmentBuffer<> for GOST_3411

1ea5c69

Use AlignmentBuffer<> for Skein

08fbb5c

Use AlignmentBuffer<> for Streebog

62bd887

Use AlignmentBuffer<> for Blake2b

b185737

add TODO in MDx_HashFunction

66438aa

reneme force-pushed the refactor/alignment_buffer branch from 7738948 to 66438aa Compare September 17, 2023 18:50

reneme mentioned this pull request Sep 18, 2023

Refactor: Remove MDx_HashFunction #3705

Merged

randombit approved these changes Sep 19, 2023

View reviewed changes

randombit merged commit 1f26e75 into randombit:master Sep 19, 2023
38 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor: AlignmentBuffer<> helper for block-oriented Hashes #3693

Refactor: AlignmentBuffer<> helper for block-oriented Hashes #3693

reneme commented Sep 12, 2023 •

edited

Loading

coveralls commented Sep 12, 2023 •

edited

Loading

randombit commented Sep 12, 2023

reneme commented Sep 12, 2023

reneme commented Sep 13, 2023 •

edited

Loading

randombit left a comment

randombit Sep 13, 2023

randombit Sep 13, 2023

reneme Sep 13, 2023

randombit Sep 13, 2023

reneme Sep 13, 2023

randombit Sep 13, 2023

reneme Sep 14, 2023

reneme commented Sep 13, 2023

randombit commented Sep 13, 2023

reneme commented Sep 13, 2023

randombit commented Sep 13, 2023

randombit left a comment

reneme commented Sep 17, 2023

randombit left a comment

Refactor: AlignmentBuffer<> helper for block-oriented Hashes #3693

Refactor: AlignmentBuffer<> helper for block-oriented Hashes #3693

Conversation

reneme commented Sep 12, 2023 • edited Loading

Introduction and Motivation

Description of Proposed Improvement

coveralls commented Sep 12, 2023 • edited Loading

randombit commented Sep 12, 2023

reneme commented Sep 12, 2023

reneme commented Sep 13, 2023 • edited Loading

Future Work

randombit left a comment

Choose a reason for hiding this comment

randombit Sep 13, 2023

Choose a reason for hiding this comment

randombit Sep 13, 2023

Choose a reason for hiding this comment

reneme Sep 13, 2023

Choose a reason for hiding this comment

randombit Sep 13, 2023

Choose a reason for hiding this comment

reneme Sep 13, 2023

Choose a reason for hiding this comment

randombit Sep 13, 2023

Choose a reason for hiding this comment

reneme Sep 14, 2023

Choose a reason for hiding this comment

reneme commented Sep 13, 2023

randombit commented Sep 13, 2023

reneme commented Sep 13, 2023

randombit commented Sep 13, 2023

randombit left a comment

Choose a reason for hiding this comment

reneme commented Sep 17, 2023

randombit left a comment

Choose a reason for hiding this comment

reneme commented Sep 12, 2023 •

edited

Loading

coveralls commented Sep 12, 2023 •

edited

Loading

reneme commented Sep 13, 2023 •

edited

Loading