Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't panic when a given field value is nil #675

Merged
merged 1 commit into from
Mar 27, 2019
Merged

Conversation

joa
Copy link
Contributor

@joa joa commented Jan 30, 2019

Ideally a logging framework doesn't cause a panic for code like this:

foo, err := getFoo()

if err != nil {
   log.Error("Couldn't acquire foo", zap.Error(err))
}

// foo could be nil here
log = log.With(zap.Stringer("foo", foo))

bar, err := getBar()

if err != nil {
  // panics
  log.Error("Couldn't acquire bar", zap.Error(err))
}

This is probably also fixing #495 and #528

@CLAassistant
Copy link

CLAassistant commented Jan 30, 2019

CLA assistant check
All committers have signed the CLA.

@codecov
Copy link

codecov bot commented Jan 30, 2019

Codecov Report

Merging #675 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #675      +/-   ##
==========================================
+ Coverage   97.38%   97.39%   +<.01%     
==========================================
  Files          40       40              
  Lines        2102     2108       +6     
==========================================
+ Hits         2047     2053       +6     
  Misses         47       47              
  Partials        8        8
Impacted Files Coverage Δ
zapcore/field.go 100% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d2a364d...f8d5286. Read the comment docs.

@prashantv
Copy link
Collaborator

Thanks for the contribution @joa.

Your solution works when dealing with functions that return interfaces, and explicitly return nil. However, it doesn't work when the return value has a type, but the value is nil, as mentioned in #495 (comment)

With your PR merged in, I wrote a simple test:

func getURL() (*url.URL, error) {
  return nil, nil
}

url, err := getURL()
if err != nil {
  logger.Error("got error", zap.Error(err))
}

logger.Info("got url", zap.Stringer("url", url))

This still causes a panic:

panic: runtime error: invalid memory address or nil pointer dereference [recovered]
        panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x5d3550]

(truncated)

panic(0x74d580, 0x9bc7d0)
        /home/prashant/bin/go1.10.4/src/runtime/panic.go:502 +0x229
net/url.(*URL).String(0x0, 0x790a00, 0x0)
        /home/prashant/bin/go1.10.4/src/net/url/url.go:742 +0x40
go.uber.org/zap/zapcore.Field.AddTo(0x7b73a5, 0x3, 0x18, 0x0, 0x0, 0x0, 0x790a00, 0x0, 0x81ed20, 0xc42009d170)
        /home/prashant/go/src/go.uber.org/zap/zapcore/field.go:172 +0xc10

See related questions:
https://golang.org/doc/faq#nil_error
https://stackoverflow.com/questions/38070015/check-if-interface-value-is-nil-in-go-without-using-reflect/38070426

@joa joa force-pushed the master branch 3 times, most recently from 619ba17 to a78f13d Compare March 5, 2019 17:25
@joa
Copy link
Contributor Author

joa commented Mar 5, 2019

Good catch. I expanded the test case and added the additional nil check.

@prashantv
Copy link
Collaborator

Unfortunately reflection comes with a performance penalty, and the only alternative is unsafe, which we strongly discourage.

It's also not correct to assume that nil always means that the type cannot be logged. E.g., it's completely reasonable to have a type which has a String() method that handles the nil case.

@joa
Copy link
Contributor Author

joa commented Mar 6, 2019

@prashantv The commit does not use reflection to test for the nil pointer in side the non-nil (type, ptr) tuple. Do you have a particular reason why you'd advise against unsafe in this instance? There are unit tests in place that cover both the positive and negative cases. Of course the compatibility promise isn't given when using unsafe but that's why the tests exist.

I understand the argument around a valid String() implementation for the nil case. In fact it is best practice to provide meaningful nil values and have these cases covered. However as with url.URL we see this isn't even consistent in within the go lib itself.

Of course both cases could be supported with recover and accepting a potential panic. I guess however I'd rather have a different layer catch these (e.g. a middleware in a server). I guess there is nothing left in this MR to improve directly. Feel free to close if you think the risks outweigh the benefits.

@joa
Copy link
Contributor Author

joa commented Mar 6, 2019

I see from your comment in #495 (comment) that fmt is acutally doing this. Maybe that'd be an acceptable solution.

@joa
Copy link
Contributor Author

joa commented Mar 6, 2019

With defer/recover

BenchmarkAddingFields/Zap-8                                  	 1000000	      2212 ns/op	    2214 B/op	      15 allocs/op
BenchmarkAddingFields/Zap.Check-8                            	 1000000	      2221 ns/op	    2214 B/op	      15 allocs/op
BenchmarkAddingFields/Zap.CheckSampled-8                     	 5000000	       312 ns/op	     277 B/op	       1 allocs/op
BenchmarkAddingFields/Zap.Sugar-8                            	  500000	      2762 ns/op	    3010 B/op	      21 allocs/op
BenchmarkAddingFields/apex/log-8                             	   50000	     33792 ns/op	    6694 B/op	     115 allocs/op
BenchmarkAddingFields/go-kit/kit/log-8                       	  200000	      8049 ns/op	    5641 B/op	     105 allocs/op
BenchmarkAddingFields/inconshreveable/log15-8                	   50000	     30263 ns/op	    6625 B/op	     122 allocs/op
BenchmarkAddingFields/sirupsen/logrus-8                      	  200000	      9931 ns/op	    9082 B/op	     122 allocs/op
BenchmarkAddingFields/go.pedge.io/lion-8                     	  200000	      8117 ns/op	    8761 B/op	     105 allocs/op
BenchmarkAddingFields/rs/zerolog-8                           	  200000	      6691 ns/op	    4725 B/op	      76 allocs/op
BenchmarkAddingFields/rs/zerolog.Check-8                     	  200000	      6646 ns/op	    4728 B/op	      76 allocs/op

Without defer/recover:

BenchmarkAddingFields/Zap-8                                  	 1000000	      1614 ns/op	     770 B/op	       5 allocs/op
BenchmarkAddingFields/Zap.Check-8                            	 1000000	      1614 ns/op	     770 B/op	       5 allocs/op
BenchmarkAddingFields/Zap.CheckSampled-8                     	10000000	       225 ns/op	      91 B/op	       0 allocs/op
BenchmarkAddingFields/Zap.Sugar-8                            	 1000000	      1986 ns/op	    1565 B/op	      11 allocs/op
BenchmarkAddingFields/apex/log-8                             	   50000	     33770 ns/op	    6694 B/op	     115 allocs/op
BenchmarkAddingFields/go-kit/kit/log-8                       	  200000	      8100 ns/op	    5640 B/op	     105 allocs/op
BenchmarkAddingFields/inconshreveable/log15-8                	   50000	     30802 ns/op	    6623 B/op	     122 allocs/op
BenchmarkAddingFields/sirupsen/logrus-8                      	  200000	     10017 ns/op	    9082 B/op	     122 allocs/op
BenchmarkAddingFields/go.pedge.io/lion-8                     	  200000	      8220 ns/op	    8760 B/op	     105 allocs/op
BenchmarkAddingFields/rs/zerolog-8                           	  200000	      6620 ns/op	    4728 B/op	      76 allocs/op
BenchmarkAddingFields/rs/zerolog.Check-8                     	  200000	      6662 ns/op	    4725 B/op	      76 allocs/op

@joa
Copy link
Contributor Author

joa commented Mar 6, 2019

Implemented recover within defer and expanded test cases as discussed (e.g. valid nil impl for Stringer).

@prashantv
Copy link
Collaborator

Thanks for adding the benchmarks @joa and some responses to your questions:

@prashantv The commit does not use reflection to test for the nil pointer in side the non-nil (type, ptr) tuple. Do you have a particular reason why you'd advise against unsafe in this instance? There are unit tests in place that cover both the positive and negative cases. Of course the compatibility promise isn't given when using unsafe but that's why the tests exist.

From the unsafe package documentation:

Packages that import unsafe may be non-portable and are not protected by the Go 1 compatibility guidelines.

Even if the test covers it, a client may upgrade to a new Go version, and zap may stop working in that version, and will require a fix + library upgrades. I'd much rather avoid the concerns that crop up with unsafe.

Of course both cases could be supported with recover and accepting a potential panic. I guess however I'd rather have a different layer catch these (e.g. a middleware in a server). I guess there is nothing left in this MR to improve directly. Feel free to close if you think the risks outweigh the benefits.

Yep, the recover approach works well, but also has a measurable performance impact. Given that this could be better handled elsewhere, I'm not sure the benefits outweigh the cost.

However, I do want to consider all the options, since this has come up multiple times. We could:

  • Do what you had originally: only check if value == nil, which works if you return a nil type/value interface, but not otherwise.
  • (In zap or externally) Add a SafeStringer function which will wrap a stringer, and add panic handling. Field constructors can be done outside of zap, and it would probably look like:
type safeStringer struct{
  s fmt.Stringer
}

func (ss safeStringer) String() {
  defer func() {
    // recover and return "nil"
  }()
  return ss.s.String()
}

func SafeStringer(s fmt.Stringer) fmt.Stringer {
  return safeStringer{s}
}

@joa
Copy link
Contributor Author

joa commented Mar 7, 2019

Agree.

There is another option. If you're just interested in the Stringer case the performance impact could be reduce to just that. And I think it would be the best solution since we don't expect panic in any of the other cases (e.g. IntField) and should not pay the recover tax there. Introducing a new SafeXXX kind of field is also weird. Because why would I want to use the unsafe variant?

ArrayMarshaler and ObjectMarshaler are defined by Zap. It would be okay IMHO to make the requirement that a meaningful nil implementation must exist.

Note that these benchmark numbers are to be taken with a grain of salt. BenchmarkAddingFields doesn't use zap.Stringer. This just shows you that the performance impact is zero when not using zap.Stringer.

BenchmarkAddingFields/Zap-8                                      1000000              1699 ns/op             770 B/op          5 allocs/op
BenchmarkAddingFields/Zap.Check-8                                1000000              1705 ns/op             770 B/op          5 allocs/op
BenchmarkAddingFields/Zap.CheckSampled-8                         5000000               236 ns/op              91 B/op          0 allocs/op
BenchmarkAddingFields/Zap.Sugar-8                                1000000              2058 ns/op            1565 B/op         11 allocs/op
BenchmarkAddingFields/apex/log-8                                   50000             34314 ns/op            6694 B/op        115 allocs/op
BenchmarkAddingFields/go-kit/kit/log-8                            200000              8087 ns/op            5640 B/op        105 allocs/op
BenchmarkAddingFields/inconshreveable/log15-8                      50000             30024 ns/op            6625 B/op        122 allocs/op
BenchmarkAddingFields/sirupsen/logrus-8                           200000              9941 ns/op            9082 B/op        122 allocs/op
BenchmarkAddingFields/go.pedge.io/lion-8                          200000              8358 ns/op            8758 B/op        105 allocs/op
BenchmarkAddingFields/rs/zerolog-8                                200000              6914 ns/op            4725 B/op         76 allocs/op
BenchmarkAddingFields/rs/zerolog.Check-8                          200000              7137 ns/op            4724 B/op         76 allocs/op

@prashantv
Copy link
Collaborator

Only doing defer for the Stringer case is a reasonable option, but what is the performance impact to logging Stringer fields when using defer?

If it's negligible, then it definitely makes sense to do by default. Otherwise, the "safe" field constructors could provide an alternative between safety and performance.

@joa
Copy link
Contributor Author

joa commented Mar 25, 2019

@prashantv: Sorry for the late reply but here are the benchmarks

Benchmark using defer for fmt.Stringer

BenchmarkAddingFields/Zap-8                                      1000000              1991 ns/op            1004 B/op          8 allocs/op
BenchmarkAddingFields/Zap.Check-8                                1000000              1887 ns/op            1004 B/op          8 allocs/op
BenchmarkAddingFields/Zap.CheckSampled-8                         5000000               267 ns/op             122 B/op          0 allocs/op
BenchmarkAddingFields/Zap.Sugar-8                                1000000              2475 ns/op            1865 B/op         14 allocs/op
BenchmarkAddingFields/apex/log-8                                   50000             34326 ns/op            6691 B/op        115 allocs/op
BenchmarkAddingFields/go-kit/kit/log-8                            200000              8357 ns/op            5855 B/op        110 allocs/op
BenchmarkAddingFields/inconshreveable/log15-8                      50000             32977 ns/op            8114 B/op        128 allocs/op
BenchmarkAddingFields/sirupsen/logrus-8                           200000             10069 ns/op            9088 B/op        122 allocs/op
BenchmarkAddingFields/go.pedge.io/lion-8                          200000              9393 ns/op            8688 B/op        105 allocs/op
BenchmarkAddingFields/rs/zerolog-8                                200000              7106 ns/op            4729 B/op         76 allocs/op
BenchmarkAddingFields/rs/zerolog.Check-8                          200000              7171 ns/op            4725 B/op         76 allocs/op

Baseline

BenchmarkAddingFields/Zap-8                                      1000000              1927 ns/op             875 B/op          7 allocs/op
BenchmarkAddingFields/Zap.Check-8                                1000000              1813 ns/op             875 B/op          7 allocs/op
BenchmarkAddingFields/Zap.CheckSampled-8                         5000000               258 ns/op             105 B/op          0 allocs/op
BenchmarkAddingFields/Zap.Sugar-8                                1000000              2316 ns/op            1736 B/op         13 allocs/op
BenchmarkAddingFields/apex/log-8                                   30000             35845 ns/op            6695 B/op        115 allocs/op
BenchmarkAddingFields/go-kit/kit/log-8                            200000              8246 ns/op            5856 B/op        110 allocs/op
BenchmarkAddingFields/inconshreveable/log15-8                      50000             33425 ns/op            8110 B/op        128 allocs/op
BenchmarkAddingFields/sirupsen/logrus-8                           200000              9958 ns/op            9088 B/op        122 allocs/op
BenchmarkAddingFields/go.pedge.io/lion-8                          200000              8169 ns/op            8696 B/op        105 allocs/op
BenchmarkAddingFields/rs/zerolog-8                                200000              7984 ns/op            4724 B/op         76 allocs/op
BenchmarkAddingFields/rs/zerolog.Check-8                          200000              7372 ns/op            4724 B/op         76 allocs/op

I applied the following changes to the benchmark functions:

diff --git a/benchmarks/zap_test.go b/benchmarks/zap_test.go
index 92f7120..aa503bc 100644
--- a/benchmarks/zap_test.go
+++ b/benchmarks/zap_test.go
@@ -23,6 +23,7 @@ package benchmarks
 import (
        "errors"
        "fmt"
+       "net/url"
        "time"
 
        "go.uber.org/multierr"
@@ -37,6 +38,7 @@ var (
        _messages   = fakeMessages(1000)
        _tenInts    = []int{1, 2, 3, 4, 5, 6, 7, 8, 9, 0}
        _tenStrings = []string{"a", "b", "c", "d", "e", "f", "g", "h", "i", "j"}
+       _oneStringer, _ = url.Parse("http://localhost:1234")
        _tenTimes   = []time.Time{
                time.Unix(0, 0),
                time.Unix(1, 0),
@@ -130,6 +132,7 @@ func fakeFields() []zap.Field {
                zap.Ints("ints", _tenInts),
                zap.String("string", _tenStrings[0]),
                zap.Strings("strings", _tenStrings),
+               zap.Stringer("stringer", _oneStringer),
                zap.Time("time", _tenTimes[0]),
                zap.Times("times", _tenTimes),
                zap.Object("user1", _oneUser),
@@ -145,6 +148,7 @@ func fakeSugarFields() []interface{} {
                "ints", _tenInts,
                "string", _tenStrings[0],
                "strings", _tenStrings,
+               "stringer", _oneStringer,
                "time", _tenTimes[0],
                "times", _tenTimes,
                "user1", _oneUser,
@@ -162,6 +166,7 @@ func fakeFmtArgs() []interface{} {
                _tenInts,
                _tenStrings[0],
                _tenStrings,
+               _oneStringer,
                _tenTimes[0],
                _tenTimes,
                _oneUser,

In conclusion: the performance impact when using defer with fmt.Stringer is neglectable and the number of additional allocations (one) scales linear with the amount of zap.Stringer fields.

@prashantv
Copy link
Collaborator

Thanks @joa. To better understand the performance impact, I wrote a benchmark which:

  • Uses a production logger config, writing to /dev/null
  • Created a custom stringer type which doesn't need to encode anything

E.g.,

type customString string

func (cs customString) String() string {
        return string(cs)
}

func BenchmarkStringer(b *testing.B) {
        cfg := zap.NewProductionConfig()
        cfg.OutputPaths = []string{"/dev/null"}
        l, err := cfg.Build()
        require.NoError(b, err, "failed to build logger")

        cs := customString("encode")

        b.Run("1 field", func(b *testing.B) {
                for i := 0; i < b.N; i++ {
                        l.Info("test",
                                zap.Stringer("cs1", cs),
                        )
                }
        })
        b.Run("4 fields", func(b *testing.B) {
                for i := 0; i < b.N; i++ {
                        l.Info("test",
                                zap.Stringer("cs1", cs),
                                zap.Stringer("cs2", cs),
                                zap.Stringer("cs3", cs),
                                zap.Stringer("cs4", cs),
                        )
                }
        })
}

To get a closer look at the impact to just the stringer performance, I wrote a benchmark which only logs stringer fields,

        b.Run("1 field", func(b *testing.B) {
                for i := 0; i < b.N; i++ {
                        l.Info("test",
                                zap.Stringer("cs1", cs),
                        )
                }
        })
        b.Run("4 fields", func(b *testing.B) {
                for i := 0; i < b.N; i++ {
                        l.Info("test",
                                zap.Stringer("cs1", cs),
                                zap.Stringer("cs2", cs),
                                zap.Stringer("cs3", cs),
                                zap.Stringer("cs4", cs),
                        )
                }
        })

I then compared the difference with and without this change on Go 1.12,

Without this change (no defer):

BenchmarkStringer/1_field-4             20000000               203 ns/op              82 B/op          2 allocs/op
BenchmarkStringer/4_fields-4            20000000               373 ns/op             322 B/op          5 allocs/op

With this change (with defer):

BenchmarkStringer/1_field-4             20000000               207 ns/op              83 B/op          2 allocs/op
BenchmarkStringer/4_fields-4            10000000               377 ns/op             327 B/op          5 allocs/op

There's a very slight increase in the number of bytes allocated, but the actual number of allocations remains the same, and the performance is the same (the difference looks like noise).

Copy link
Collaborator

@prashantv prashantv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@abhinav Can you take a look as well?

zapcore/field.go Outdated Show resolved Hide resolved
zapcore/field.go Outdated Show resolved Hide resolved
zapcore/field_test.go Show resolved Hide resolved
@joa joa force-pushed the master branch 3 times, most recently from 18a1593 to 38205dc Compare March 26, 2019 08:45
Copy link
Collaborator

@abhinav abhinav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution, @joa! This looks good to me.

zapcore/field.go Outdated Show resolved Hide resolved
zapcore/field_test.go Show resolved Hide resolved
Copy link
Collaborator

@prashantv prashantv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, couple of minor comments

zapcore/field.go Outdated Show resolved Hide resolved
zapcore/field.go Outdated Show resolved Hide resolved
zapcore/field.go Outdated Show resolved Hide resolved
zapcore/field_test.go Show resolved Hide resolved
zapcore/field.go Outdated Show resolved Hide resolved
@joa joa force-pushed the master branch 2 times, most recently from f204cf0 to 74cbeb7 Compare March 27, 2019 09:10
@joa
Copy link
Contributor Author

joa commented Mar 27, 2019

Thank you for the review @prashantv and @abhinav. This MR became much more concise thanks to your feedback.

This commit ensures any panic raised while encoding zap.Stringer
is catched and reported as an error of the form kError
for a field k. The reported panic is prefixed with "PANIC=" to
distinguish these errors from other encoding errors.

Closes uber-go#495
Closes uber-go#528
Copy link
Collaborator

@abhinav abhinav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. Relying on fmt for the panic-in-panic-handling case simplifies things significantly.

Copy link
Collaborator

@prashantv prashantv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great @joa, thanks for being patient with the back-and-forth reviews and addressing all our comments.

This has been a long-standing request, so I'm glad we're finally merging a fix!

@prashantv prashantv merged commit badef73 into uber-go:master Mar 27, 2019
@joa
Copy link
Contributor Author

joa commented May 2, 2019

Might want to revisit in Go 1.13: https://go-review.googlesource.com/c/go/+/171758/

cgxxv pushed a commit to cgxxv/zap that referenced this pull request Mar 25, 2022
This commit ensures any panic raised while encoding zap.Stringer
is caught and reported as an error as field kError
for a field k. The reported panic is prefixed with "PANIC=" to
distinguish these errors from other encoding errors.

Closes uber-go#495
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants