-
Notifications
You must be signed in to change notification settings - Fork 17.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
time: Timer reset broken under heavy use since go1.16 timer optimizations added [1.16 backport] #47332
Comments
Approved. This is a serious issue with no workaround. |
Change https://golang.org/cl/336689 mentions this issue: |
Closed by merging ed8cbbc to release-branch.go1.16. |
… adjustTimers is 0 This avoids a race when a new timerModifiedEarlier timer is created by a different goroutine. For #47329 Fixes #47332 Change-Id: I6f6c87b4a9b5491b201c725c10bc98e23e0ed9d1 Reviewed-on: https://go-review.googlesource.com/c/go/+/336432 Trust: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> (cherry picked from commit 798ec73) Reviewed-on: https://go-review.googlesource.com/c/go/+/336689
@ianlancetaylor Should the follow up change CL 337309 also be cherry-picked to 1.16? (I'll reopen this so we don't forget to make that decision before release time.) |
I'm uncertain whether CL 337309 should be cherry picked to the 1.16 branch. The bug fix (CLs 336432, 336689) fixed a real bug that could cause timers to fail to fire. CL 337309 is a performance fix. I'm mildly concerned that CL 337309 will introduce some other unforeseen performance problem. On the 1.16 release branch we needed the bug fix, but I don't know how hard we want to change performance issues on that branch. |
Closed per Ian's comment. |
FYI I'm seeing increased CPU usage on various x64 and arm64 linux nodes after rolling out ed8cbbc. I will rollback to 1.16.6 tomorrow to confirm and capture some cpu profiles. EDIT: I confirmed reverting that single commit brought the cpu usage of my otherwise idle golang program from 100% back to 0%. Here's a 60s cpu profile with that commit included:
and back to normal after I reverted it:
|
The CPU usage issue was also observed on macOS and Windows nodes, and was also fixed there after rolling back this commit. |
What can you tell us about the way in which your program uses timers and deadlines? Does the change in https://golang.org/cl/337309 fix the problem? Thanks. |
I will try a build with that CL now. I'm testing on an arm64 rpi4 which consistently shows 100% cpu usage with the bad commit. My program has a number of goroutines which run infinite loops as so:
|
CPU usage is normal with https://golang.org/cl/337309 applied. |
Thanks for testing. I will backport 337309 to the 1.16 branch. |
Change https://golang.org/cl/338649 mentions this issue: |
Backport complete. |
In CL 336432 we changed adjusttimers so that it no longer cleared timerModifiedEarliest if there were no timersModifiedEarlier timers. This caused some Google internal tests to time out, presumably due to the increased contention on timersLock. We can avoid that by simply not skipping the loop in adjusttimers, which lets us safely clear timerModifiedEarliest. And if we don't skip the loop, then there isn't much reason to keep the count of timerModifiedEarlier timers at all. So remove it. The effect will be that for programs that create some timerModifiedEarlier timers and then remove them all, the program will do an occasional additional loop over all the timers. And, programs that have some timerModifiedEarlier timers will always loop over all the timers, without the quicker exit when they have all been seen. But the loops should not occur all that often, due to timerModifiedEarliest. For #47329 For #47332 Change-Id: I7b244c1244d97b169a3c7fbc8f8a8b115731ddee Reviewed-on: https://go-review.googlesource.com/c/go/+/337309 Trust: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Michael Pratt <mpratt@google.com> (cherry picked from commit bfbb288) Reviewed-on: https://go-review.googlesource.com/c/go/+/338649
I've been trying to diagnose an issue relating to short sharp CPU spikes which sound similar to @tmm1's comment above. I'm running a http server that configures a read (5 secs), write (10 secs) and idle (15 secs) timeout. I found that when using Go 1.16.6 the CPU usage is stable, however when using 1.16.7 I see several 100% CPU spikes lasting <2 mins over a 24 hour period. I have been unable to correlate the spikes with any other metric. It appears to be random. HTTP traffic ranges from several request per minute to 10,000+. If I remove the |
@lukestoward This issue is closed and as far as we can tell, it is fixed. The fix was in the Go 1.16.7 release. Although the fix may well have introduced the problem you describe, since the patch has been released we should treat this as a new problem. I encourage you to open a new issue, ideally with a case that we can use to reproduce the problem. See also #47762. |
@ianlancetaylor requested issue #47329 to be considered for backport to the next 1.16 minor release.
The text was updated successfully, but these errors were encountered: