Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix RawHeaders for POST #29

Merged
merged 3 commits into from
Mar 3, 2018

Conversation

illotum
Copy link

@illotum illotum commented Feb 8, 2018

I'm sorry for the hassle, but apparently fhttp has fast path header parser for HEAD and GET and rawHeaders isn't properly set for other methods. This PR is a fix to #27.

  • L1623 now assigns rawHeaders to a proper sub-slice for methods with possible body.
  • parseHeaders() wasn't returning proper "length parsed", this is fixed by adding internal accumulator in headerScanner to properly return parsed length.
  • added tests to cover POST and HTTP1.0 cases.

Admittedly this affects global hot path, but any difference is lost in noise:

# 10 runs of each
name                       old time/op  new time/op  delta
ServerGet10KReqPerConn-8    627ns ± 1%   635ns ± 2%  +1.33%  (p=0.026 n=9+10)
ServerPost10KReqPerConn-8   702ns ± 9%   699ns ± 6%    ~     (p=0.825 n=10+9)

Intuitively, one int counter doesn't change much.

@@ -129,16 +127,54 @@ func TestRequestRawHeaders(t *testing.T) {
if string(h.Host()) != "foobar" {
t.Fatalf("unexpected host: %q. Expecting %q", h.Host(), "foobar")
}
v1 := h.Peek("EmptyValue")
if len(v1) > 0 {
t.Fatalf("expecting empty value. Got %q", v1)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the need to remove EmptyValue testing?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EmptyValue check is not the purpose if this Test. It has been copied alongside other checks (by me, in #27) from a general headers tests that remains unchanged.

In other words EmptyValue is tested elsewhere, the removal is my attempt to DRY my previous PR.

@illotum
Copy link
Author

illotum commented Feb 27, 2018

I'm observing odd behaviour with non-trivial binary bodies in my fork, where rawHeaders is taken right from the body.

Please consider this PR work in progress until I'm able to identify and fix the root cause of this issue.

header.go Outdated
@@ -1620,6 +1620,7 @@ func (h *RequestHeader) parse(buf []byte) (int, error) {
if err != nil {
return 0, err
}
h.rawHeaders = buf[m : m+n]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is going to work. parse is called from tryRead where it's buffer argument comes from mustPeekBuffered which as you can see gets its return value from bufio.Reader.Peek which will change its return value after a next read (for example for the body) making rawHeaders contain invalid data.

I haven't tested this myself but I see your tests don't cover this case. Can you add a test here a whole POST request including a body of at least 4kb is parsed instead of just the RequestHeader.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I have just followed the same trail literally minutes ago, will add a test shortly.

@illotum
Copy link
Author

illotum commented Feb 27, 2018

While this fixed the issue, the performance implications are unfortunately not trivial:

name                       old time/op  new time/op  delta
ServerGet10KReqPerConn-8    286ns ± 7%   282ns ±13%     ~     (p=0.424 n=10+10)
ServerPost10KReqPerConn-8   331ns ±14%   390ns ±10%  +17.94%  (p=0.000 n=10+10)

@erikdubbelboer For my particular use case this loss is well worth it, but I understand if you would like to postpone the merge.

Ideas how we can work around Peek's sliding window for this particular code path are definitely welcome.

@erikdubbelboer
Copy link
Owner

How are you benchmarking this? I wouldn't expect a big difference seeing as rawHeaders is reused so it only needs to copy the raw header bytes.

@illotum
Copy link
Author

illotum commented Feb 27, 2018

Fair point! I'm comparing

go test -bench='BenchmarkServer(Post|Get)10KReq' -count=10

between my branch and your master. Laptop unfortunately.


Here's a rerun, this time with lower difference.

name                       old time/op  new time/op  delta
ServerPost10KReqPerConn-8   319ns ± 3%   349ns ±13%  +9.60%  (p=0.000 n=10+10)

@erikdubbelboer
Copy link
Owner

Could it be that you're mixing up old an new? On my benchmarks the new code always seems to be faster:

benchmark                              old ns/op     new ns/op     delta
BenchmarkServerGet10KReqPerConn-8      278           263           -5.40%
BenchmarkServerPost10KReqPerConn-8     350           329           -6.00%

benchmark                              old allocs     new allocs     delta
BenchmarkServerGet10KReqPerConn-8      0              0              +0.00%
BenchmarkServerPost10KReqPerConn-8     0              0              +0.00%

benchmark                              old bytes     new bytes     delta
BenchmarkServerGet10KReqPerConn-8      0             0             +0.00%
BenchmarkServerPost10KReqPerConn-8     0             0             +0.00%

I'm using Go version go1.10 darwin/amd64.

@illotum
Copy link
Author

illotum commented Mar 1, 2018

Funny. No, I'm not mixing it, but this is also not the first time laptop benchmarks fail me. It's Go v1.10 on darwin as well.

I'll post stats from linux/amd64 @ ec2 later today.

@illotum
Copy link
Author

illotum commented Mar 2, 2018

~> benchstat --geomean fhttp postfix  # 10 runs, linux/amd4 @ c4.8xlarge
name                        old time/op  new time/op  delta
ServerGet10KReqPerConn-36    610ns ± 8%   624ns ± 9%    ~     (p=0.289 n=10+10)
ServerPost10KReqPerConn-36   654ns ±12%   637ns ± 9%    ~     (p=0.403 n=10+10)
[Geo mean]                   632ns        630ns       -0.29%

I'm getting convinced any real difference will be noticeable only on the most stable setups. I feel comfortable merging it at this point.

@erikdubbelboer
Copy link
Owner

On my Ubuntu server with 1.10 the new code is still faster.

I also think it’s ready but just to be sure I just put it live on one of our servers to test. If nothing strange happens I’ll merge it tomorrow.

@erikdubbelboer erikdubbelboer merged commit 34f277f into erikdubbelboer:master Mar 3, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants