Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

build: minor fix for 32-bit archs #11726

Merged
merged 9 commits into from
Jul 9, 2020
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
34 changes: 16 additions & 18 deletions source/common/stats/symbol_table_impl.cc
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,7 @@ static constexpr uint32_t Low7Bits = 0x7f;
static constexpr Symbol FirstValidSymbol = 1;
static constexpr uint8_t LiteralStringIndicator = 0;

uint64_t StatName::dataSize() const {
size_t StatName::dataSize() const {
if (size_and_data_ == nullptr) {
return 0;
}
Expand All @@ -46,9 +46,9 @@ void StatName::debugPrint() {
if (size_and_data_ == nullptr) {
std::cerr << "Null StatName" << std::endl;
} else {
const uint64_t nbytes = dataSize();
const size_t nbytes = dataSize();
std::cerr << "dataSize=" << nbytes << ":";
for (uint64_t i = 0; i < nbytes; ++i) {
for (size_t i = 0; i < nbytes; ++i) {
std::cerr << " " << static_cast<uint64_t>(data()[i]);
}
const SymbolVec encoding = SymbolTableImpl::Encoding::decodeSymbols(data(), dataSize());
Expand All @@ -67,8 +67,8 @@ SymbolTableImpl::Encoding::~Encoding() {
ASSERT(mem_block_.capacity() == 0);
}

uint64_t SymbolTableImpl::Encoding::encodingSizeBytes(uint64_t number) {
uint64_t num_bytes = 0;
size_t SymbolTableImpl::Encoding::encodingSizeBytes(uint64_t number) {
size_t num_bytes = 0;
do {
++num_bytes;
number >>= 7;
Expand Down Expand Up @@ -106,7 +106,7 @@ void SymbolTableImpl::Encoding::addSymbols(const std::vector<Symbol>& symbols) {
}
}

std::pair<uint64_t, uint64_t> SymbolTableImpl::Encoding::decodeNumber(const uint8_t* encoding) {
std::pair<uint64_t, size_t> SymbolTableImpl::Encoding::decodeNumber(const uint8_t* encoding) {
uint64_t number = 0;
uint64_t uc = SpilloverMask;
const uint8_t* start = encoding;
Expand All @@ -117,8 +117,7 @@ std::pair<uint64_t, uint64_t> SymbolTableImpl::Encoding::decodeNumber(const uint
return std::make_pair(number, encoding - start);
}

SymbolVec SymbolTableImpl::Encoding::decodeSymbols(const SymbolTable::Storage array,
uint64_t size) {
SymbolVec SymbolTableImpl::Encoding::decodeSymbols(const SymbolTable::Storage array, size_t size) {
SymbolVec symbol_vec;
symbol_vec.reserve(size);
decodeTokens(
Expand All @@ -128,8 +127,7 @@ SymbolVec SymbolTableImpl::Encoding::decodeSymbols(const SymbolTable::Storage ar
}

void SymbolTableImpl::Encoding::decodeTokens(
const SymbolTable::Storage array, uint64_t size,
const std::function<void(Symbol)>& symbolTokenFn,
const SymbolTable::Storage array, size_t size, const std::function<void(Symbol)>& symbolTokenFn,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not your fault, but clang-tidy is dinging you for a pre-existing naming mistake here.

s/symbolTokenFn/symbol_token_fn/ here and in the header. Sorry about that.

That will help you pass CI.

const std::function<void(absl::string_view)>& stringViewTokenFn) {
while (size > 0) {
if (*array == LiteralStringIndicator) {
Expand All @@ -138,7 +136,7 @@ void SymbolTableImpl::Encoding::decodeTokens(
ASSERT(size > 1);
++array;
--size;
std::pair<uint64_t, uint64_t> length_consumed = decodeNumber(array);
std::pair<uint64_t, size_t> length_consumed = decodeNumber(array);
uint64_t length = length_consumed.first;
array += length_consumed.second;
size -= length_consumed.second;
Expand All @@ -147,7 +145,7 @@ void SymbolTableImpl::Encoding::decodeTokens(
size -= length;
array += length;
} else {
std::pair<uint64_t, uint64_t> symbol_consumed = decodeNumber(array);
std::pair<uint64_t, size_t> symbol_consumed = decodeNumber(array);
symbolTokenFn(symbol_consumed.first);
size -= symbol_consumed.second;
array += symbol_consumed.second;
Expand All @@ -156,7 +154,7 @@ void SymbolTableImpl::Encoding::decodeTokens(
}

std::vector<absl::string_view> SymbolTableImpl::decodeStrings(const SymbolTable::Storage array,
uint64_t size) const {
size_t size) const {
std::vector<absl::string_view> strings;
Thread::LockGuard lock(lock_);
Encoding::decodeTokens(
Expand Down Expand Up @@ -451,8 +449,8 @@ StatNameStorage::StatNameStorage(absl::string_view name, SymbolTable& table)
: StatNameStorageBase(table.encode(name)) {}

StatNameStorage::StatNameStorage(StatName src, SymbolTable& table) {
const uint64_t size = src.size();
MemBlockBuilder<uint8_t> storage(size);
const size_t size = src.size();
MemBlockBuilder<uint8_t> storage(size); // Note: MemBlockBuilder takes uint64_t.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory MemBlockBuilder could be updated to use size_t, too, and I suspect there may be other places we could switch over as well. My feeling is that that's a bit outside the scope of this change, however.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed; outside the scope for now.

src.copyToMemBlock(storage);
setBytes(storage.release());
table.incRefCount(statName());
Expand All @@ -472,11 +470,11 @@ SymbolTable::StoragePtr SymbolTableImpl::makeDynamicStorage(absl::string_view na

// payload_bytes is the total number of bytes needed to represent the
// characters in name, plus their encoded size, plus the literal indicator.
const uint64_t payload_bytes = SymbolTableImpl::Encoding::totalSizeBytes(name.size()) + 1;
const size_t payload_bytes = SymbolTableImpl::Encoding::totalSizeBytes(name.size()) + 1;

// total_bytes includes the payload_bytes, plus the LiteralStringIndicator, and
// the length of those.
const uint64_t total_bytes = SymbolTableImpl::Encoding::totalSizeBytes(payload_bytes);
const size_t total_bytes = SymbolTableImpl::Encoding::totalSizeBytes(payload_bytes);
MemBlockBuilder<uint8_t> mem_block(total_bytes);

SymbolTableImpl::Encoding::appendEncoding(payload_bytes, mem_block);
Expand Down Expand Up @@ -550,7 +548,7 @@ void StatNameStorageSet::free(SymbolTable& symbol_table) {
}

SymbolTable::StoragePtr SymbolTableImpl::join(const StatNameVec& stat_names) const {
uint64_t num_bytes = 0;
size_t num_bytes = 0;
for (StatName stat_name : stat_names) {
if (!stat_name.empty()) {
num_bytes += stat_name.dataSize();
Expand Down
29 changes: 15 additions & 14 deletions source/common/stats/symbol_table_impl.h
Original file line number Diff line number Diff line change
Expand Up @@ -93,7 +93,7 @@ class SymbolTableImpl : public SymbolTable {
/**
* Decodes a uint8_t array into a SymbolVec.
*/
static SymbolVec decodeSymbols(const SymbolTable::Storage array, uint64_t size);
static SymbolVec decodeSymbols(const SymbolTable::Storage array, size_t size);

/**
* Decodes a uint8_t array into a sequence of symbols and literal strings.
Expand All @@ -106,15 +106,15 @@ class SymbolTableImpl : public SymbolTable {
* @param symbolTokenFn a function to be called whenever a symbol is encountered in the array.
* @param stringVIewTokeNFn a function to be called whenever a string literal is encountered.
*/
static void decodeTokens(const SymbolTable::Storage array, uint64_t size,
static void decodeTokens(const SymbolTable::Storage array, size_t size,
const std::function<void(Symbol)>& symbolTokenFn,
const std::function<void(absl::string_view)>& stringViewTokenFn);

/**
* Returns the number of bytes required to represent StatName as a uint8_t
* array, including the encoded size.
*/
uint64_t bytesRequired() const {
size_t bytesRequired() const {
return data_bytes_required_ + encodingSizeBytes(data_bytes_required_);
}

Expand All @@ -130,13 +130,13 @@ class SymbolTableImpl : public SymbolTable {
* @param number A number to encode in a variable length byte-array.
* @return The number of bytes it would take to encode the number.
*/
static uint64_t encodingSizeBytes(uint64_t number);
static size_t encodingSizeBytes(uint64_t number);

/**
* @param num_data_bytes The number of bytes in a data-block.
* @return The total number of bytes required for the data-block and its encoded size.
*/
static uint64_t totalSizeBytes(uint64_t num_data_bytes) {
static size_t totalSizeBytes(size_t num_data_bytes) {
return encodingSizeBytes(num_data_bytes) + num_data_bytes;
}

Expand Down Expand Up @@ -167,10 +167,10 @@ class SymbolTableImpl : public SymbolTable {
* @param The encoded byte array, written previously by appendEncoding.
* @return A pair containing the decoded number, and the number of bytes consumed from encoding.
*/
static std::pair<uint64_t, uint64_t> decodeNumber(const uint8_t* encoding);
static std::pair<uint64_t, size_t> decodeNumber(const uint8_t* encoding);

private:
uint64_t data_bytes_required_{0};
size_t data_bytes_required_{0};
MemBlockBuilder<uint8_t> mem_block_;
};

Expand Down Expand Up @@ -229,7 +229,7 @@ class SymbolTableImpl : public SymbolTable {
* @param size the size of the array in bytes.
* @return std::string the retrieved stat name.
*/
std::vector<absl::string_view> decodeStrings(const Storage array, uint64_t size) const;
std::vector<absl::string_view> decodeStrings(const Storage array, size_t size) const;

/**
* Convenience function for encode(), symbolizing one string segment at a time.
Expand Down Expand Up @@ -403,16 +403,16 @@ class StatName {
bool operator!=(const StatName& rhs) const { return !(*this == rhs); }

/**
* @return uint64_t the number of bytes in the symbol array, excluding the
* overhead for the size itself.
* @return size_t the number of bytes in the symbol array, excluding the
* overhead for the size itself.
*/
uint64_t dataSize() const;
size_t dataSize() const;

/**
* @return uint64_t the number of bytes in the symbol array, including the
* @return size_t the number of bytes in the symbol array, including the
* overhead for the size itself.
*/
uint64_t size() const { return SymbolTableImpl::Encoding::totalSizeBytes(dataSize()); }
size_t size() const { return SymbolTableImpl::Encoding::totalSizeBytes(dataSize()); }

/**
* Copies the entire StatName representation into a MemBlockBuilder, including
Expand Down Expand Up @@ -466,7 +466,8 @@ class StatName {
* hasher and comparator.
*/
absl::string_view dataAsStringView() const {
return {reinterpret_cast<const char*>(data()), dataSize()};
return {reinterpret_cast<const char*>(data()),
static_cast<absl::string_view::size_type>(dataSize())};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the underlying problem the abuse of uint64_t in line 409 above for dataSize()?

There's a significant penalty for using a 64 bit int in 32 bit contexts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind changing the underlying type of dataSize, but I don't really have any context on the original choice.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I'd prefer size_t for this; IIUC Envoy style has veered toward using uint64_t. @mattklein123 may have more context.

I'm fine with either:

  • this change, which makes sense in the context of returning a string_view
  • changing dataSize() et al to return size_t, which makes sense generally IMO

I slightly prefer the latter but I'm OK with this. @mattklein123 what do you prefer?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure we can use size_t. Typically we have used explicit types for things that can wrap, but for something like this size_t seems fine.

/wait

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uint64_t seems to be used extensively here to represent a number of bytes. dataSize() is also called in a number of locations. Just to clarify, is the proposal to switch size_t for all of these cases?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes size_t seems appropriate for number of bytes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Despite updating the type used for representing byte counts throughout the file, I left this cast in; I don't know what sort of contract absl::string_view::size_type has with respect to size_t.

If folks feel it's redundant though now, I can remove it.

}

const uint8_t* size_and_data_{nullptr};
Expand Down