Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AddBinary(data) result differs from the result of AddString(string(data)) #324

Closed
skipor opened this issue Feb 19, 2017 · 2 comments
Closed
Labels

Comments

@skipor
Copy link
Contributor

skipor commented Feb 19, 2017

In Go []byte and string is freely converted between. But binary data is encoded as base64 string now. It is very surprising, and seem to be uncommon when zap.Binary(key, data) and zap.String(key, string(zap)) produce different result.

I think that it is caller responsibility to encode binary data, and it should can choose encoding itself.
Also, this strange behaviour requires extra allocations for callers that don't want to convert their []byte to base64.

@skipor
Copy link
Contributor Author

skipor commented Feb 19, 2017

@akshayjshah

AddBinary won't work here - it's intended for opaque binary blobs (for example, you fail to unmarshal some proto3 blob and want to log it for later debugging). In JSON, binary is base64-encoded, so it won't be a readable stacktrace.
In general, I really don't want to let users push arbitrary bytes into the encoders. It's too difficult to make it encoding-agnostic.

But users can push arbitrary bytes anyway, by wrapping them to string. So that is strange constraint. Users can do same, but encoder should make extra allocations.
zap.Binary for opaque binary blobs, okay. But what about add zap.StringBytes(string, []byte) Field, ObjectEncoder.AddStringBytes(key, []byte) and PrimitiveArrayEncoder.AppendStringBytes that accepts binary data, but which should be considered as string.
PrimitiveArrayEncoder.AppendStringBytes allows to get rid of string allocation for adding base64 encoded data. Or any other adding based on bufferpool.Buffer.

@akshayjshah
Copy link
Contributor

There are a few issues at play.

First, base64-encoding. This is specific to the JSON encoder, since that's the usual way to represent binary data in JSON. When/if we get around to adding a msgpack encoder, it won't need to base64-encode its input, since msgpack is a binary encoding scheme.

Second, whether we want an AddByteString and AppendByteString API. I'm cautiously open to this, but there are some places where we'll need to take care:

  • We'll have to assume that the bytes are UTF-8 encoded. Not ideal, but it's what we do for strings anyways.
  • For the JSON encoder, we'll need to escape the bytes properly. That requires replicating something like the logic in bytes.Buffer.ReadRune and handling each code point correctly. This will need fairly extensive testing, including some tests using testing/quick - use the existing string-escaping tests as a reference.

I'm certainly willing to look at a PR for this, but (unfortunately) it'll need to come very quickly - adding to the encoder APIs is a breaking change that I won't accept after version 1.0, which I'm hoping to cut next week.

skipor added a commit to skipor/zap that referenced this issue Mar 12, 2017
This is a breaking change that adds a `AddByteString` to `zapcore.ObjectEncoder`, `AppendByteString` to `zapcore.PrimitiveArrayEncoder`.
This API optimizes logging UTF-8 encoded []byte data.

Fixes uber-go#324.
skipor added a commit to skipor/zap that referenced this issue Mar 12, 2017
This is a breaking change that adds a `AddByteString` to `zapcore.ObjectEncoder`, `AppendByteString` to `zapcore.PrimitiveArrayEncoder`.
This API optimizes logging UTF-8 encoded []byte data.

Fixes uber-go#324.
akshayjshah pushed a commit to skipor/zap that referenced this issue Mar 14, 2017
This is a breaking change that adds a `AddByteString` to `zapcore.ObjectEncoder`, `AppendByteString` to `zapcore.PrimitiveArrayEncoder`.
This API optimizes logging UTF-8 encoded []byte data.

Fixes uber-go#324.
akshayjshah pushed a commit to skipor/zap that referenced this issue Mar 14, 2017
This is a breaking change that adds a `AddByteString` to `zapcore.ObjectEncoder`, `AppendByteString` to `zapcore.PrimitiveArrayEncoder`.
This API optimizes logging UTF-8 encoded []byte data.

Fixes uber-go#324.
akshayjshah pushed a commit to skipor/zap that referenced this issue Mar 14, 2017
This is a breaking change that adds a `AddByteString` to `zapcore.ObjectEncoder`, `AppendByteString` to `zapcore.PrimitiveArrayEncoder`.
This API optimizes logging UTF-8 encoded []byte data.

Fixes uber-go#324.
akshayjshah pushed a commit to skipor/zap that referenced this issue Mar 14, 2017
This is a breaking change that adds a `AddByteString` to `zapcore.ObjectEncoder`, `AppendByteString` to `zapcore.PrimitiveArrayEncoder`.
This API optimizes logging UTF-8 encoded []byte data.

Fixes uber-go#324.
akshayjshah pushed a commit that referenced this issue Mar 14, 2017
This is a breaking change that adds bytestring APIs (to log UTF-8 encoded bytes) to ObjectEncoder and ArrayEncoder. To actually save allocations along this path, I also benchmarked adding a []byte to the field union; this reduces the allocation count for
bytestrings, but increases the cost of adding fields by ~50%.

Fixes #324.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

No branches or pull requests

2 participants