runtime: frequent crashes in unlock2 on NetBSD and OpenBSD since 2021-10-07 #49453

Closed
bcmills opened this issue Nov 8, 2021 · 16 comments
Labels
FrozenDueToAge, NeedsInvestigation, okay-after-beta1, release-blocker

@bcmills
Contributor

bcmills commented Nov 8, 2021

greplogs --dashboard -md -l -e '(?ms)fatal error.*unlock2'

2021-11-08T18:06:16-5e64755/netbsd-386-9_0
2021-11-08T18:06:16-39ade5b-5e64755/netbsd-amd64-9_0
2021-11-08T17:07:45-6635138-7bda349/netbsd-386-9_0
2021-11-08T16:15:01-30b2efe-759eaa2/netbsd-386-9_0
2021-11-08T14:49:56-18b340f-ab31dbc/netbsd-386-9_0
2021-11-07T04:56:11-85493d5/netbsd-386-9_0
2021-11-06T19:41:15-036812b-61d789d/netbsd-386-9_0
2021-11-06T16:43:43-39ade5b-3544082/netbsd-386-9_0
2021-11-06T13:10:06-1f47c86-4f083c7/netbsd-386-9_0
2021-11-06T10:24:44-0c60b7c-f19e400/netbsd-amd64-9_0
2021-11-06T00:29:44-a66bbe2-b74f2ef/netbsd-386-9_0
2021-11-06T00:29:44-a07c284-b74f2ef/netbsd-amd64-9_0
2021-11-06T00:29:44-0c60b7c-b74f2ef/netbsd-amd64-9_0
2021-11-05T23:32:57-b8b8e7f-09e8de7/netbsd-386-9_0
2021-11-05T22:57:08-09e8de7/openbsd-386-68
2021-11-05T22:54:47-ba79c1e/netbsd-386-9_0
2021-11-05T22:54:47-ba79c1e/netbsd-amd64-9_0
2021-11-05T22:30:17-03971e3-b07c41d/netbsd-386-9_0
2021-11-05T22:26:07-4ab7496-d3a80c7/netbsd-386-9_0
2021-11-05T22:26:07-39ade5b-d3a80c7/netbsd-amd64-9_0
2021-11-05T22:00:37-f00b43f/openbsd-386-68
2021-11-05T21:48:25-a66bbe2-6b223e8/netbsd-amd64-9_0
2021-11-05T21:48:25-4ab7496-6b223e8/netbsd-386-9_0
2021-11-05T21:34:10-4ab7496-bb53fd7/openbsd-386-68
2021-11-05T21:29:18-75952ab/openbsd-386-68
2021-11-05T21:28:50-fb8b176/openbsd-386-68
2021-11-05T21:27:34-7aed6dd/netbsd-386-9_0
2021-11-05T21:26:54-3e9e024/openbsd-386-68
2021-11-05T21:13:38-39ade5b-091948a/netbsd-amd64-9_0
2021-11-05T20:59:32-71559a6/openbsd-386-68
2021-11-05T20:06:27-a7b6526-7be227c/netbsd-386-9_0
2021-11-05T20:06:27-a07c284-7be227c/netbsd-386-9_0
2021-11-05T19:48:29-c353f1b/netbsd-386-9_0
2021-11-05T19:14:22-a07c284-fa16efb/netbsd-amd64-9_0
2021-11-05T19:01:13-93bab8a/netbsd-386-9_0
2021-11-05T18:20:07-a66bbe2-53bab19/netbsd-386-9_0
2021-11-05T18:20:07-a07c284-53bab19/netbsd-386-9_0
2021-11-05T17:52:30-df18377/netbsd-amd64-9_0
2021-11-05T17:47:28-6f32d20/netbsd-386-9_0
2021-11-05T17:47:28-6f32d20/netbsd-amd64-9_0
2021-11-05T17:23:06-37951d8/openbsd-386-68
2021-11-05T17:17:30-a66bbe2-62c6ff4/netbsd-amd64-9_0
2021-11-05T17:17:30-62c6ff4/netbsd-amd64-9_0
2021-11-05T17:17:30-39ade5b-62c6ff4/netbsd-386-9_0
2021-11-05T16:54:01-a07c284-3796df1/netbsd-386-9_0
2021-11-05T07:00:05-ce13745-6fefb7f/netbsd-386-9_0
2021-11-05T05:30:39-b68c02e/netbsd-amd64-9_0
2021-11-05T04:20:33-089bfa5-0a5ca24/netbsd-386-9_0
2021-11-05T00:52:06-3839b60/netbsd-386-9_0
2021-11-04T23:35:26-ce13745-256a8fc/netbsd-amd64-9_0
2021-11-04T23:35:26-b76863e-256a8fc/netbsd-386-9_0
2021-11-04T21:53:05-ce13745-76c48e9/netbsd-386-9_0
2021-11-04T21:53:05-37ea4aa-76c48e9/netbsd-amd64-9_0
2021-11-04T21:41:49-39ade5b-156abe5/netbsd-386-9_0
2021-11-04T20:42:35-1f9dce7/netbsd-386-9_0
2021-11-04T20:31:02-39ade5b-978e39e/netbsd-386-9_0
2021-11-04T20:24:01-99699d1/netbsd-386-9_0
2021-11-04T19:34:33-5af93a2/netbsd-amd64-9_0
2021-11-04T18:22:03-39ade5b-b2149ac/netbsd-386-9_0
2021-11-04T17:07:48-5772877/netbsd-386-9_0
2021-11-04T17:07:48-5772877/netbsd-amd64-9_0
2021-11-04T16:36:19-84e69e7-f934b83/netbsd-386-9_0
2021-11-04T14:17:18-7861aae-901bf29/netbsd-386-9_0
2021-11-04T13:55:28-39ade5b-23991f5/netbsd-386-9_0
2021-11-04T13:55:28-30b2efe-23991f5/netbsd-386-9_0
2021-11-04T13:55:24-4a4e1f2-f58c78a/netbsd-amd64-9_0
2021-11-04T07:05:31-84e69e7-2622235/netbsd-386-9_0
2021-11-04T07:05:31-84e69e7-2622235/netbsd-amd64-9_0
2021-11-04T07:05:31-39ade5b-2622235/netbsd-386-9_0
2021-11-04T02:57:53-4a4e1f2-2cf85b1/netbsd-386-9_0
2021-11-04T02:57:53-2cf85b1/netbsd-amd64-9_0
2021-11-04T02:57:48-5fd0c49/netbsd-386-9_0
2021-11-04T02:57:48-5fd0c49/netbsd-amd64-9_0
2021-11-04T00:46:41-84e69e7-e72d715/netbsd-amd64-9_0
2021-11-04T00:36:13-1292e21/openbsd-amd64-68
2021-11-04T00:29:42-a419f2f/netbsd-amd64-9_0
2021-11-04T00:29:42-39ade5b-a419f2f/netbsd-386-9_0
2021-11-04T00:19:43-39ade5b-9cf6711/netbsd-386-9_0
2021-11-03T20:30:17-68536fa-7f2463c/netbsd-386-9_0
2021-11-03T18:37:22-2c98350-cfd016d/netbsd-amd64-9_0
2021-11-03T16:57:44-3a5865c-b212ba6/netbsd-386-9_0
2021-11-03T15:47:47-c143661/netbsd-386-9_0
2021-11-03T05:00:00-39ade5b-519c0a2/netbsd-386-9_0
2021-11-02T23:44:58-a0f373c/netbsd-386-9_0
2021-11-02T22:20:40-39ade5b-2157498/netbsd-386-9_0
2021-11-02T21:21:51-9aacde2-631b567/netbsd-amd64-9_0
2021-11-02T21:21:51-9aacde2-42e6b5b/netbsd-386-9_0
2021-11-02T21:18:39-60fd3ed/netbsd-386-9_0
2021-11-02T20:59:34-6561d8c-79024cf/netbsd-386-9_0
2021-11-02T20:59:34-39ade5b-79024cf/netbsd-386-9_0
2021-11-02T20:33:56-b76863e-b29182b/netbsd-amd64-9_0
2021-11-02T19:37:42-bb4add0-1011e26/netbsd-386-9_0
2021-11-02T18:35:29-a07c284-433ba58/netbsd-amd64-9_0
2021-11-02T18:24:18-bb4add0-f7a95d2/netbsd-386-9_0
2021-11-02T18:19:06-1ba8fdb-631b567/netbsd-amd64-9_0
2021-11-02T18:01:20-39ade5b-629ffeb/netbsd-386-9_0
2021-11-02T18:01:20-058ed05-629ffeb/netbsd-amd64-9_0
2021-11-02T17:31:50-4e7dd9f/openbsd-386-68
2021-11-02T17:31:50-39ade5b-4e7dd9f/netbsd-386-9_0
2021-11-02T17:01:01-af8aafd/netbsd-386-9_0
2021-11-02T16:57:03-c406380/netbsd-amd64-9_0
2021-11-02T16:19:52-c96bc14-58fb05a/netbsd-386-9_0
2021-11-02T16:12:28-f801da7/netbsd-amd64-9_0
2021-11-02T16:12:23-599de4b/netbsd-amd64-9_0
2021-11-02T15:54:27-058ed05-631b567/netbsd-amd64-9_0
2021-11-02T06:25:39-30b2efe-088bb4b/netbsd-amd64-9_0
2021-11-02T03:55:19-39ade5b-6f1e9a9/netbsd-386-9_0
2021-11-02T03:09:01-a45457d/netbsd-amd64-9_0
2021-11-02T00:12:17-4ff95d6-81fea0b/netbsd-386-9_0
2021-11-01T21:27:26-39ade5b-631b567/netbsd-386-9_0
2021-11-01T21:27:26-2c98350-631b567/netbsd-amd64-9_0
2021-11-01T21:27:26-1f47c86-631b567/netbsd-386-9_0
2021-11-01T16:31:02-39ade5b-2bcf1c0/netbsd-amd64-9_0
2021-11-01T16:31:02-2bcf1c0/netbsd-386-9_0
2021-11-01T15:55:25-e2e910e/netbsd-386-9_0
2021-11-01T13:12:37-611d5d6-4056934/netbsd-amd64-9_0
2021-11-01T13:08:16-732db40/openbsd-386-68
2021-10-31T18:39:05-39ade5b-89c5270/netbsd-386-9_0
2021-10-31T18:39:05-12ab535-89c5270/netbsd-amd64-9_0
2021-10-31T18:13:09-39ade5b-fd09e88/netbsd-amd64-9_0
2021-10-31T08:29:02-39ade5b-8e3d5f0/netbsd-amd64-9_0
2021-10-30T18:30:34-ba495a6-6113dac/netbsd-386-9_0
2021-10-30T16:46:47-d1dceaf/netbsd-386-9_0
2021-10-30T16:45:25-e39b854/netbsd-386-9_0
2021-10-30T16:08:13-b3129d9-5d6d9f5/netbsd-386-9_0
2021-10-30T00:47:26-a6c6f4b-d19c5bd/netbsd-amd64-9_0
2021-10-30T00:47:26-a6c6f4b-5d6d9f5/netbsd-386-9_0
2021-10-30T00:47:26-a6c6f4b-4a84298/netbsd-386-9_0
2021-10-29T22:29:31-c96bc14-c812b97/netbsd-386-9_0
2021-10-29T22:29:31-39ade5b-c812b97/netbsd-386-9_0
2021-10-29T22:27:54-71e6ab8/netbsd-amd64-9_0
2021-10-29T22:27:26-3571ab5/netbsd-386-9_0
2021-10-29T21:19:39-a2be0cd-d19c5bd/netbsd-amd64-9_0
2021-10-29T21:19:39-a2be0cd-994049a/netbsd-amd64-9_0
2021-10-29T17:44:15-d8fc7f7/netbsd-amd64-9_0
2021-10-29T02:12:05-d6a9af8-d3d8852/netbsd-amd64-9_0
2021-10-28T21:17:17-089bfa5-f6f024f/netbsd-386-9_0
2021-10-28T20:43:39-b954024/netbsd-386-9_0
2021-10-28T18:48:25-e7eb6f6-9004433/netbsd-386-9_0
2021-10-28T18:01:38-39ade5b-9004433/netbsd-386-9_0
2021-10-28T15:35:25-39ade5b-278b9b3/netbsd-amd64-9_0
2021-10-28T15:31:34-8de2a7f-8c9c148/netbsd-amd64-9_0
2021-10-28T15:08:31-26ed8fd-5c98bcb/netbsd-amd64-9_0
2021-10-28T03:35:34-b8f928b/netbsd-amd64-9_0
2021-10-28T02:35:22-39ade5b-056dfe6/netbsd-amd64-9_0
2021-10-28T01:15:26-b2fe2eb/netbsd-386-9_0
2021-10-27T21:37:54-7b0b504-749f6e9/netbsd-amd64-9_0
2021-10-27T20:29:07-51be206/netbsd-386-9_0
2021-10-27T20:25:06-bbc0595/netbsd-amd64-9_0
2021-10-27T16:59:43-c0ac39c/netbsd-amd64-9_0
2021-10-27T16:39:27-39ade5b-4f73fd0/netbsd-386-9_0
2021-10-27T08:50:27-39ade5b-bdefb77/netbsd-386-9_0
2021-10-26T22:00:36-26dbf47-86f6bf1/netbsd-amd64-9_0
2021-10-26T20:41:32-03fcf44-3a0cd11/netbsd-amd64-9_0
2021-10-26T18:40:06-9626607-11b64b4/netbsd-386-9_0
2021-10-26T18:33:39-244f92e-11b64b4/netbsd-amd64-9_0
2021-10-26T15:20:53-39ade5b-1b2362b/netbsd-386-9_0
2021-10-26T15:20:53-036812b-1b2362b/netbsd-amd64-9_0
2021-10-26T14:24:17-39ade5b-283d8a3/netbsd-386-9_0
2021-10-26T14:05:47-39ade5b-a2b8c18/netbsd-386-9_0
2021-10-26T11:58:05-1e2820a/netbsd-386-9_0
2021-10-26T05:05:24-23fdd7f/openbsd-386-68
2021-10-26T02:02:46-903c757-11b64b4/netbsd-amd64-9_0
2021-10-26T01:22:47-adfb85b/netbsd-386-9_0
2021-10-26T01:22:47-adfb85b/openbsd-386-68
2021-10-19T18:56:08-07e5527/netbsd-386-9_0
2021-10-19T15:48:56-98f6e03-067d796/openbsd-386-64
2021-10-19T07:45:46-ee92daa/netbsd-amd64-9_0
2021-10-19T07:45:46-98f6e03-ee92daa/netbsd-386-9_0
2021-10-18T21:57:36-98f6e03-3befaf0/netbsd-amd64-9_0
2021-10-14T04:18:44-09e6c7a-9e4dc6f/openbsd-386-64
2021-10-12T21:14:34-03971e3-c580180/netbsd-amd64-9_0
2021-10-07T19:49:45-39ade5b-c580180/netbsd-386-9_0
2021-09-30T20:30:12-1c35f2a-c035d82/openbsd-386-64
2021-09-22T22:01:15-c8db761-1537f14/openbsd-386-64
2021-09-10T02:44:36-2091bd3/plan9-386-0intro
2021-09-01T19:23:15-8373dc3-592ee43/plan9-arm
2021-05-06T18:08:01-402f177/plan9-386-0intro
2021-05-06T16:28:56-5f9fe47/plan9-386-0intro
2021-04-01T15:50:43-45ca9ef/linux-arm-scaleway
2021-04-01T01:26:29-5f646f0/linux-arm-scaleway
2021-04-01T00:51:26-ec721d9/linux-arm-scaleway
2021-04-01T00:51:24-1f29e69/linux-arm-scaleway
2021-04-01T00:51:23-3304b22/linux-arm-scaleway
2021-03-31T20:21:57-5d6581d/linux-arm-scaleway
2020-12-22T20:00:00-c9fb4eb/windows-386-2008
2020-12-15T16:31:07-a508840/plan9-arm
2020-12-15T16:30:24-5046cb8/plan9-arm
2020-12-02T17:00:06-3d913a9/plan9-arm
2020-11-20T16:42:46-5e58ae4/plan9-arm
2020-11-19T19:30:38-e73697b/plan9-arm
2020-11-19T02:17:10-0bb6115/plan9-arm
2020-11-11T20:51:00-141fa33/plan9-arm
2020-10-29T19:06:32-5cc43c5/plan9-arm
2020-10-28T00:25:05-40d1ec5/plan9-arm
2020-07-07T16:15:30-3b6b86d/plan9-386-0intro
2020-05-05T18:32:35-8627b4c/plan9-arm
2020-05-05T18:05:10-a8e83d5/plan9-arm
2020-05-05T15:41:37-b4ecafc/plan9-arm
2020-05-05T05:13:26-9b18968/plan9-arm
2020-05-05T00:36:44-c9d5f60/plan9-arm
2020-05-04T17:40:00-a1ffbe9/plan9-arm

Note the inflection point on the BSD builders around 2021-09-22. This may or may not share an underlying cause with #49209.

@bcmills
Contributor Author

bcmills commented Nov 8, 2021

This appears to be a regression in Go 1.18.

Since NetBSD and OpenBSD are not first-class ports, this doesn't necessarily block the 1.18 release — however, if the regression remains at the time of the release it at least needs a clear writeup in the release notes. (That part, at least, is a release-blocker.)

@bcmills added the NeedsInvestigation and release-blocker labels Nov 8, 2021
@bcmills added this to the Go1.18 milestone Nov 8, 2021
@bcmills
Contributor Author

bcmills commented Nov 12, 2021

Curiously, at least the 2021-11-12T03:54:29-ea63613-3729a67/netbsd-amd64-9_0 record is on release-branch.go1.16. I'm not sure how to reconcile that with the timing of the first BSD failures in September.

Maybe the regression was introduced by a backported change? I see a lot of failures involving updateTimerPMask, and a few spurious errors in my previous greplogs from tests that just dump all of the strings in the runtime.

Trying again with a more precise pattern:

greplogs --dashboard -md -l -e '(?m)fatal error:.*\n.*\n(?:\n(runtime stack|goroutine).*:\n(?:.+\n\t.+\n)*)+runtime\.unlock2\(.*\n\t.+\n(?:.+\n\t.+\n)*runtime\.updateTimerPMask\('

2021-11-12T03:54:29-ea63613-3729a67/netbsd-amd64-9_0
2021-11-11T21:27:05-c116b72-c3c4a2b/netbsd-386-9_0
2021-11-11T20:34:56-a07c284-10d3b13/netbsd-386-9_0
2021-11-11T20:34:56-0d747a3-10d3b13/netbsd-386-9_0
2021-11-11T20:25:49-39ade5b-c622d1d/netbsd-386-9_0
2021-11-11T19:46:03-c116b72-48f1cde/netbsd-386-9_0
2021-11-11T19:34:23-39ade5b-8ce1a95/netbsd-amd64-9_0
2021-11-11T18:32:21-c33205f-84277bf/netbsd-386-9_0
2021-11-11T18:32:21-c116b72-84277bf/netbsd-386-9_0
2021-11-11T15:34:02-6944b10-8c73f80/netbsd-386-9_0
2021-11-11T13:58:28-d76b1ac/netbsd-386-9_0
2021-11-11T11:47:33-c49627e/netbsd-amd64-9_0
2021-11-11T11:47:08-e9ef931/netbsd-amd64-9_0
2021-11-11T07:03:15-036812b-1ec5108/netbsd-amd64-9_0
2021-11-11T04:54:05-4b27d40/openbsd-386-68
2021-11-10T20:45:04-fc3ed20-a881409/netbsd-386-9_0
2021-11-10T20:04:06-d3172f2-b2d826c/netbsd-386-9_0
2021-11-10T18:06:32-097aaa9/netbsd-amd64-9_0
2021-11-10T18:06:32-097aaa9/openbsd-386-68
2021-11-10T17:15:52-0aa194f/netbsd-386-9_0
2021-11-09T22:58:24-578ada4/netbsd-amd64-9_0
2021-11-09T22:05:25-39ade5b-a65a095/netbsd-amd64-9_0
2021-11-09T21:26:25-6635138-b93220c/netbsd-amd64-9_0
2021-11-09T21:07:06-39ade5b-f981a9f/netbsd-amd64-9_0
2021-11-09T20:10:36-77c473f/openbsd-386-68
2021-11-09T17:18:08-15a54d6/netbsd-amd64-9_0
2021-11-09T17:18:08-15a54d6/openbsd-386-68
2021-11-09T16:49:01-e900012-3729a67/netbsd-amd64-9_0
2021-11-09T01:45:54-fafc446-5344dca/netbsd-386-9_0
2021-11-08T21:52:47-a5321bf-955f9f5/netbsd-386-9_0
2021-11-08T21:45:43-cc49178/netbsd-amd64-9_0
2021-11-08T20:32:21-b766c28-830b393/netbsd-386-9_0
2021-11-08T18:06:16-39ade5b-5e64755/netbsd-amd64-9_0
2021-11-08T17:07:45-6635138-7bda349/netbsd-386-9_0
2021-11-08T16:44:18-7bda349/openbsd-386-68
2021-11-08T16:15:01-30b2efe-759eaa2/netbsd-386-9_0
2021-11-08T14:49:56-18b340f-ab31dbc/netbsd-386-9_0
2021-11-07T04:56:11-85493d5/netbsd-386-9_0
2021-11-06T19:41:15-036812b-61d789d/netbsd-386-9_0
2021-11-06T16:43:43-39ade5b-3544082/netbsd-386-9_0
2021-11-06T10:24:44-0c60b7c-f19e400/netbsd-amd64-9_0
2021-11-06T00:29:44-a66bbe2-b74f2ef/netbsd-386-9_0
2021-11-05T23:32:57-b8b8e7f-09e8de7/netbsd-386-9_0
2021-11-05T22:57:08-09e8de7/openbsd-386-68
2021-11-05T22:54:47-ba79c1e/netbsd-386-9_0
2021-11-05T22:54:47-ba79c1e/netbsd-amd64-9_0
2021-11-05T22:30:17-03971e3-b07c41d/netbsd-386-9_0
2021-11-05T22:26:07-4ab7496-d3a80c7/netbsd-386-9_0
2021-11-05T22:26:07-39ade5b-d3a80c7/netbsd-amd64-9_0
2021-11-05T22:00:37-f00b43f/openbsd-386-68
2021-11-05T21:48:25-4ab7496-6b223e8/netbsd-386-9_0
2021-11-05T21:27:34-7aed6dd/netbsd-386-9_0
2021-11-05T21:26:54-3e9e024/openbsd-386-68
2021-11-05T21:13:38-39ade5b-091948a/netbsd-amd64-9_0
2021-11-05T20:06:27-a7b6526-7be227c/netbsd-386-9_0
2021-11-05T20:06:27-a07c284-7be227c/netbsd-386-9_0
2021-11-05T17:52:30-df18377/netbsd-amd64-9_0
2021-11-05T17:47:28-6f32d20/netbsd-386-9_0
2021-11-05T17:47:28-6f32d20/netbsd-amd64-9_0
2021-11-05T17:23:06-37951d8/openbsd-386-68
2021-11-05T17:17:30-39ade5b-62c6ff4/netbsd-386-9_0
2021-11-05T16:54:01-a07c284-3796df1/netbsd-386-9_0
2021-11-05T04:20:33-089bfa5-0a5ca24/netbsd-386-9_0
2021-11-05T00:52:06-3839b60/netbsd-386-9_0
2021-11-04T23:35:26-ce13745-256a8fc/netbsd-amd64-9_0
2021-11-04T21:41:49-39ade5b-156abe5/netbsd-386-9_0
2021-11-04T20:42:35-1f9dce7/netbsd-386-9_0
2021-11-04T20:31:02-39ade5b-978e39e/netbsd-386-9_0
2021-11-04T20:24:01-99699d1/netbsd-386-9_0
2021-11-04T18:22:03-39ade5b-b2149ac/netbsd-386-9_0
2021-11-04T17:07:48-5772877/netbsd-amd64-9_0
2021-11-04T16:36:19-84e69e7-f934b83/netbsd-386-9_0
2021-11-04T14:17:18-7861aae-901bf29/netbsd-386-9_0
2021-11-04T13:55:28-39ade5b-23991f5/netbsd-386-9_0
2021-11-04T13:55:28-39ade5b-23991f5/netbsd-amd64-9_0
2021-11-04T13:55:24-4a4e1f2-f58c78a/netbsd-amd64-9_0
2021-11-04T07:05:31-84e69e7-2622235/netbsd-amd64-9_0
2021-11-04T07:05:31-39ade5b-2622235/netbsd-386-9_0
2021-11-04T02:57:53-2cf85b1/netbsd-amd64-9_0
2021-11-04T02:57:48-5fd0c49/netbsd-amd64-9_0
2021-11-04T00:46:41-84e69e7-e72d715/netbsd-amd64-9_0
2021-11-04T00:36:13-1292e21/openbsd-amd64-68
2021-11-04T00:29:42-a419f2f/netbsd-amd64-9_0
2021-11-04T00:29:42-39ade5b-a419f2f/netbsd-386-9_0
2021-11-04T00:19:43-39ade5b-9cf6711/netbsd-386-9_0
2021-11-03T18:37:22-2c98350-cfd016d/netbsd-amd64-9_0
2021-11-03T16:57:44-3a5865c-b212ba6/netbsd-386-9_0
2021-11-03T05:00:00-39ade5b-519c0a2/netbsd-386-9_0
2021-11-02T23:44:58-a0f373c/netbsd-386-9_0
2021-11-02T22:20:40-39ade5b-2157498/netbsd-386-9_0
2021-11-02T21:21:51-9aacde2-631b567/netbsd-amd64-9_0
2021-11-02T21:21:51-9aacde2-42e6b5b/netbsd-386-9_0
2021-11-02T21:18:39-60fd3ed/netbsd-386-9_0
2021-11-02T20:59:34-6561d8c-79024cf/netbsd-386-9_0
2021-11-02T20:59:34-39ade5b-79024cf/netbsd-386-9_0
2021-11-02T20:33:56-b76863e-b29182b/netbsd-amd64-9_0
2021-11-02T19:37:42-bb4add0-1011e26/netbsd-386-9_0
2021-11-02T18:35:29-a07c284-433ba58/netbsd-amd64-9_0
2021-11-02T18:24:18-bb4add0-f7a95d2/netbsd-386-9_0
2021-11-02T18:19:06-1ba8fdb-631b567/netbsd-amd64-9_0
2021-11-02T18:01:20-39ade5b-629ffeb/netbsd-386-9_0
2021-11-02T18:01:20-058ed05-629ffeb/netbsd-amd64-9_0
2021-11-02T17:31:50-4e7dd9f/openbsd-386-68
2021-11-02T17:31:50-39ade5b-4e7dd9f/netbsd-386-9_0
2021-11-02T17:01:01-af8aafd/netbsd-386-9_0
2021-11-02T16:57:03-c406380/netbsd-amd64-9_0
2021-11-02T16:19:52-c96bc14-58fb05a/netbsd-386-9_0
2021-11-02T16:12:28-f801da7/netbsd-amd64-9_0
2021-11-02T16:12:23-599de4b/netbsd-amd64-9_0
2021-11-02T03:55:19-39ade5b-6f1e9a9/netbsd-386-9_0
2021-11-02T03:09:01-a45457d/netbsd-amd64-9_0
2021-11-02T00:12:17-4ff95d6-81fea0b/netbsd-386-9_0
2021-11-01T21:27:26-39ade5b-631b567/netbsd-386-9_0
2021-11-01T21:27:26-2c98350-631b567/netbsd-amd64-9_0
2021-11-01T21:27:26-1f47c86-631b567/netbsd-386-9_0
2021-11-01T16:31:02-39ade5b-2bcf1c0/netbsd-amd64-9_0
2021-11-01T16:31:02-2bcf1c0/netbsd-386-9_0
2021-11-01T15:55:25-e2e910e/netbsd-386-9_0
2021-11-01T13:12:37-611d5d6-4056934/netbsd-amd64-9_0
2021-11-01T13:08:16-732db40/openbsd-386-68
2021-10-31T18:39:05-39ade5b-89c5270/netbsd-386-9_0
2021-10-31T18:39:05-12ab535-89c5270/netbsd-amd64-9_0
2021-10-31T18:13:09-39ade5b-fd09e88/netbsd-amd64-9_0
2021-10-31T08:29:02-39ade5b-8e3d5f0/netbsd-amd64-9_0
2021-10-30T18:30:34-ba495a6-6113dac/netbsd-386-9_0
2021-10-30T16:46:47-d1dceaf/netbsd-386-9_0
2021-10-30T16:45:25-e39b854/netbsd-386-9_0
2021-10-30T16:08:13-b3129d9-5d6d9f5/netbsd-386-9_0
2021-10-30T00:47:26-a6c6f4b-4a84298/netbsd-386-9_0
2021-10-29T22:29:31-c96bc14-c812b97/netbsd-386-9_0
2021-10-29T22:29:31-39ade5b-c812b97/netbsd-386-9_0
2021-10-29T22:27:54-71e6ab8/netbsd-amd64-9_0
2021-10-29T22:27:26-3571ab5/netbsd-386-9_0
2021-10-29T21:19:39-a2be0cd-d19c5bd/netbsd-amd64-9_0
2021-10-28T21:17:17-089bfa5-f6f024f/netbsd-386-9_0
2021-10-28T18:01:38-39ade5b-9004433/netbsd-386-9_0
2021-10-28T02:35:22-39ade5b-056dfe6/netbsd-amd64-9_0
2021-10-28T01:15:26-b2fe2eb/netbsd-386-9_0
2021-10-27T20:29:07-51be206/netbsd-386-9_0
2021-10-27T16:39:27-39ade5b-4f73fd0/netbsd-386-9_0
2021-10-27T08:50:27-39ade5b-bdefb77/netbsd-386-9_0
2021-10-26T18:40:06-9626607-11b64b4/netbsd-386-9_0
2021-10-26T18:33:39-244f92e-11b64b4/netbsd-amd64-9_0
2021-10-26T15:20:53-39ade5b-1b2362b/netbsd-386-9_0
2021-10-26T15:20:53-036812b-1b2362b/netbsd-amd64-9_0
2021-10-26T14:24:17-39ade5b-283d8a3/netbsd-386-9_0
2021-10-26T14:05:47-39ade5b-a2b8c18/netbsd-386-9_0
2021-10-26T11:58:05-1e2820a/netbsd-386-9_0
2021-10-26T05:05:24-23fdd7f/openbsd-386-68
2021-10-26T02:02:46-903c757-11b64b4/netbsd-amd64-9_0
2021-10-26T01:22:47-adfb85b/openbsd-386-68
2021-10-19T18:56:08-07e5527/netbsd-386-9_0
2021-10-19T07:45:46-98f6e03-ee92daa/netbsd-386-9_0
2021-10-14T04:18:44-09e6c7a-9e4dc6f/openbsd-386-64
2021-10-07T19:49:45-39ade5b-c580180/netbsd-386-9_0
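
The difference between the two greplogs patterns can be sketched with Go's regexp package. This is only an illustration: both sample logs below are made up (they are not builder output), and the paths and addresses in them are placeholders. The first pattern lets `.*` span newlines, so any log that says "fatal error" and later mentions "unlock2" matches. The second requires an unlock2 frame followed by an updateTimerPMask frame within one stack trace.

```go
package main

import (
	"fmt"
	"regexp"
)

// loose is the first pattern from the thread. With the s flag, ".*" crosses
// newlines, so "unlock2" may appear anywhere after "fatal error".
var loose = regexp.MustCompile(`(?ms)fatal error.*unlock2`)

// precise is the second pattern from the thread, copied verbatim.
var precise = regexp.MustCompile(`(?m)fatal error:.*\n.*\n(?:\n(runtime stack|goroutine).*:\n(?:.+\n\t.+\n)*)+runtime\.unlock2\(.*\n\t.+\n(?:.+\n\t.+\n)*runtime\.updateTimerPMask\(`)

// crashLog is a hypothetical trace shaped like the real crashes.
const crashLog = "fatal error: unexpected signal during runtime execution\n" +
	"[signal SIGSEGV: segmentation violation code=0x1 addr=0x280 pc=0x1]\n" +
	"\n" +
	"runtime stack:\n" +
	"runtime.throw(0x1, 0x2)\n" +
	"\t/go/src/runtime/panic.go:1 +0x1\n" +
	"runtime.sigpanic()\n" +
	"\t/go/src/runtime/signal_unix.go:1 +0x1\n" +
	"runtime.unlock2(0x1)\n" +
	"\t/go/src/runtime/lock_sema.go:1 +0x1\n" +
	"runtime.updateTimerPMask(0x1)\n" +
	"\t/go/src/runtime/proc.go:1 +0x1\n"

// stringDump is a hypothetical log from an unrelated failure that happens
// to print runtime symbol names afterwards.
const stringDump = "fatal error: all goroutines are asleep - deadlock!\n" +
	"\n" +
	"--- FAIL: TestStrings\n" +
	"    runtime.lock2\n" +
	"    runtime.unlock2\n"

func main() {
	fmt.Println("loose/crashLog:    ", loose.MatchString(crashLog))
	fmt.Println("loose/stringDump:  ", loose.MatchString(stringDump))
	fmt.Println("precise/crashLog:  ", precise.MatchString(crashLog))
	fmt.Println("precise/stringDump:", precise.MatchString(stringDump))
}
```

In this sketch the loose pattern matches both logs, while the precise pattern matches only the one containing an actual unlock2 stack frame.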

@bcmills changed the title from "runtime: frequent crashes in unlock2 on NetBSD and OpenBSD since 2021-09-22" to "runtime: frequent crashes in unlock2 on NetBSD and OpenBSD since 2021-10-07" Nov 12, 2021
@bcmills changed the title from "runtime: frequent crashes in unlock2 on NetBSD and OpenBSD since 2021-10-07" to "runtime: frequent crashes in unlock2 via updateTimerPMask on NetBSD and OpenBSD since 2021-10-07" Nov 12, 2021
@bcmills
Contributor Author

bcmills commented Nov 12, 2021

@prattmic, there seems to be a strong connection between the NetBSD crashes and updateTimerPMask. (I would suspect CL 264477, but the timing for that being the cause is off by about a year! 😅)

Any theories?

@jeremyfaller
Contributor

CC @prattmic

@prattmic
Member

The crash is at https://cs.opensource.google/go/go/+/master:src/runtime/lock_sema.go;l=115;drc=f229e7031a6efb2f23241b5da000c3b3203081d6. The crash is always on addr 0x170 (386) or 0x280 (amd64). I believe this is the mp.nextwaitm access, with mp as nil.

I'm not sure yet how things get this way.

The updateTimerPMask code is related (though not directly) to some workarounds on NetBSD that were relaxed on 2021-08-16 in https://golang.org/cl/324472, so perhaps there is some relation there.
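
As a rough illustration of why a nil mp would fault at a small fixed address such as 0x280: a field load through a nil struct pointer touches address 0 plus the field's offset. The struct below is a hypothetical stand-in, not the runtime's real m type; its padding is chosen only so that the field lands at the amd64 address quoted above.

```go
package main

import (
	"fmt"
	"unsafe"
)

// m is a hypothetical stand-in for the runtime's M struct. The padding size
// is an assumption picked so that nextwaitm sits at offset 0x280.
type m struct {
	pad       [0x280]byte
	nextwaitm uintptr
}

// faultAddr returns the address that a load of mp.nextwaitm would touch
// for a given value of mp.
func faultAddr(mp uintptr) uintptr {
	return mp + unsafe.Offsetof(m{}.nextwaitm)
}

func main() {
	// With mp == nil, the load lands at the field offset itself.
	fmt.Printf("nil mp faults at %#x\n", faultAddr(0))
}
```

A different field offset on 386 would explain a different constant fault address there in the same way.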

@prattmic prattmic self-assigned this Nov 12, 2021
@prattmic
Member

I'm having trouble reproducing this just because there are so many other netbsd memory corruption crashes I'm hitting... (#49209)

But as an interesting data point, I've noticed thus far that every all.bash run I've done has either had all tests pass, or had numerous different crashes. I've yet to have just a single test crash. That could indicate something about the environment is making crashes more likely sometimes.

@prattmic
Member

Two OpenBSD crashes in unlock2 that do not involve updateTimerPMask:

https://build.golang.org/log/c6898314cb2fdaa7a39437c2186b5e8a804d47f3
https://build.golang.org/log/afdc1f994a11b5041fa0e3dff06cc4ce8c2070ef

@prattmic
Member

prattmic commented Nov 17, 2021

I believe some of the logs above show this already, but it hasn't yet been pointed out that this crash sometimes also occurs in the Go 1.4 bootstrap toolchain. Example below.

The line numbers here don't quite line up to the mp.nextwaitm line (src/runtime/lock_sema.go:107 in a comment in the copy of 1.4 I'm looking at), but it seems likely to be the same.

IMO, this indicates that we either have a bug that has been around since at least 1.4, or there is some more general system-provoked memory corruption going on.

fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x1f8 pc=0x80553c7]

goroutine 20 [running]:
runtime.throw(0x8428440, 0x2a)
        /home/bradfitz/go-netbsd-386-bootstrap/src/runtime/panic.go:616 +0x6b fp=0x18971444 sp=0x18971438 pc=0x806e24b
runtime.sigpanic()
        /home/bradfitz/go-netbsd-386-bootstrap/src/runtime/signal_unix.go:366 +0x230 fp=0x18971468 sp=0x18971444 pc=0x8081480
runtime.unlock(0xbba62f84)
        /home/bradfitz/go-netbsd-386-bootstrap/src/runtime/lock_sema.go:107 +0x97 fp=0x18971484 sp=0x18971468 pc=0x80553c7
internal/poll.runtime_pollOpen(0xb, 0x8442d54, 0x0)
        /home/bradfitz/go-netbsd-386-bootstrap/src/runtime/netpoll.go:120 +0xa2 fp=0x1897149c sp=0x18971484 pc=0x806ae92
internal/poll.(*pollDesc).init(0x18d081d4, 0x18d081c0, 0x18a64001, 0x18d081c0)
        /home/bradfitz/go-netbsd-386-bootstrap/src/internal/poll/fd_poll_runtime.go:37 +0x3a fp=0x189714bc sp=0x1897149c pc=0x80c2cea
internal/poll.(*FD).Init(0x18d081c0, 0x84161bd, 0x4, 0x1, 0x8, 0x18)
        /home/bradfitz/go-netbsd-386-bootstrap/src/internal/poll/fd_unix.go:58 +0x47 fp=0x189714d0 sp=0x189714bc pc=0x80c3787
os.newFile(0xb, 0x841804b, 0x9, 0x1, 0xb)
        /home/bradfitz/go-netbsd-386-bootstrap/src/os/file_unix.go:117 +0xd0 fp=0x189714f8 sp=0x189714d0 pc=0x80cb080
os.OpenFile(0x841804b, 0x9, 0x0, 0x0, 0x8, 0x18, 0x555)
        /home/bradfitz/go-netbsd-386-bootstrap/src/os/file_unix.go:198 +0xc4 fp=0x1897152c sp=0x189714f8 pc=0x80cb274
os.Open(0x841804b, 0x9, 0x57, 0xbba657ac, 0x0)
        /home/bradfitz/go-netbsd-386-bootstrap/src/os/file.go:241 +0x33 fp=0x1897154c sp=0x1897152c pc=0x80c95f3
os/exec.(*Cmd).stdin(0x18a7c000, 0x18bec160, 0x86296a0, 0xbbabe6c8)
        /home/bradfitz/go-netbsd-386-bootstrap/src/os/exec/exec.go:210 +0x3bb fp=0x189715b8 sp=0x1897154c pc=0x81153ab
os/exec.(*Cmd).Start(0x18a7c000, 0x18d260c0, 0x18bec420)
        /home/bradfitz/go-netbsd-386-bootstrap/src/os/exec/exec.go:363 +0xbf fp=0x18971664 sp=0x189715b8 pc=0x8115c5f
os/exec.(*Cmd).Run(0x18a7c000, 0x0, 0x0)
        /home/bradfitz/go-netbsd-386-bootstrap/src/os/exec/exec.go:297 +0x1b fp=0x18971674 sp=0x18971664 pc=0x8115b5b
cmd/go/internal/work.(*Builder).runOut(0x18bb0120, 0x18a3e960, 0x1b, 0x18a65c51, 0x4, 0x0, 0x0, 0x0, 0x18900d20, 0x11, ...)
        /home/bradfitz/go-netbsd-386-bootstrap/src/cmd/go/internal/work/exec.go:1422 +0x5cb fp=0x1897172c sp=0x18971674 pc=0x81f49ab
cmd/go/internal/work.(*Builder).run(0x18bb0120, 0x18ceab00, 0x18a3e960, 0x1b, 0x18a65c51, 0x4, 0x0, 0x0, 0x0, 0x18900d20, ...)
        /home/bradfitz/go-netbsd-386-bootstrap/src/cmd/go/internal/work/exec.go:1363 +0x81 fp=0x18971794 sp=0x1897172c pc=0x81f4091
cmd/go/internal/work.gcToolchain.asm(0x18bb0120, 0x18ceab00, 0x1897e680, 0x1a, 0x1a, 0x18971930, 0x2, 0x2, 0x18971b60, 0x4)
        /home/bradfitz/go-netbsd-386-bootstrap/src/cmd/go/internal/work/gc.go:236 +0x947 fp=0x18971900 sp=0x18971794 pc=0x8200927
cmd/go/internal/work.(*gcToolchain).asm(0x863b438, 0x18bb0120, 0x18ceab00, 0x1897e680, 0x1a, 0x1a, 0x4, 0x8415c20, 0x3, 0x18971ba0, ...)
        <autogenerated>:1 +0x4a fp=0x1897192c sp=0x18971900 pc=0x821194a
cmd/go/internal/work.(*Builder).build(0x18bb0120, 0x18ceab00, 0x0, 0x0)
        /home/bradfitz/go-netbsd-386-bootstrap/src/cmd/go/internal/work/exec.go:598 +0x2710 fp=0x18971f30 sp=0x1897192c pc=0x81ed090
cmd/go/internal/work.(*Builder).Do.func1(0x18ceab00)
        /home/bradfitz/go-netbsd-386-bootstrap/src/cmd/go/internal/work/exec.go:101 +0x56 fp=0x18971f70 sp=0x18971f30 pc=0x820fae6
cmd/go/internal/work.(*Builder).Do.func2(0x18ac47c0, 0x18bb0120, 0x18c3c500)
        /home/bradfitz/go-netbsd-386-bootstrap/src/cmd/go/internal/work/exec.go:159 +0x80 fp=0x18971fe0 sp=0x18971f70 pc=0x820fd40
runtime.goexit()
        /home/bradfitz/go-netbsd-386-bootstrap/src/runtime/asm_386.s:1665 +0x1 fp=0x18971fe4 sp=0x18971fe0 pc=0x8094531
created by cmd/go/internal/work.(*Builder).Do
        /home/bradfitz/go-netbsd-386-bootstrap/src/cmd/go/internal/work/exec.go:146 +0x2dc

Edit: The disassembly of the bootstrap compiler supports that this is the load of mp.nextwaitm.

@prattmic
Member

Offline it was noted that https://golang.org/cl/354757 was submitted around the time this failure ramped up (though not completely before). It is possible that the new CPU type makes some kind of race in our code more likely or, less likely, that it doesn't work properly with the OS.


Looking more carefully at the lock code, this crash could simply be from calling unlock2 on an already unlocked mutex. i.e., we assume that either l.key == locked (no waiters), or the masked value is an M pointer. If l.key is 0 (unlocked, no waiters), then we will do the nil dereference we see in the crashes.

Normally, that would (well, could) throw "lock count", but that check comes after where this crash occurs.
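
The reasoning above can be sketched as a small model of the key decoding. This is a simplified illustration under assumptions, not the runtime code: `locked` is assumed to be the low bit of the key, and `unlockPath` is a hypothetical helper that mimics only the two-way branch described (either the key equals locked, or the masked value is treated as a waiter pointer).

```go
package main

import "fmt"

// locked is assumed to be the low bit of the mutex key in this model.
const locked uintptr = 1

// unlockPath models the two-way branch: if key == locked there are no
// waiters; otherwise the masked key is treated as a pointer to a waiting M.
// It returns that pointer value and whether it would be dereferenced.
func unlockPath(key uintptr) (waiter uintptr, deref bool) {
	if key == locked {
		return 0, false
	}
	return key &^ locked, true
}

func main() {
	for _, key := range []uintptr{locked, 0x1000 | locked, 0} {
		w, deref := unlockPath(key)
		fmt.Printf("key=%#x waiter=%#x deref=%v\n", key, w, deref)
	}
}
```

For key 0 (already unlocked) the model takes the waiter branch with a zero pointer, which corresponds to the nil dereference seen in the crashes.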

@bcmills changed the title from "runtime: frequent crashes in unlock2 via updateTimerPMask on NetBSD and OpenBSD since 2021-10-07" to "runtime: frequent crashes in unlock2 on NetBSD and OpenBSD since 2021-10-07" Nov 18, 2021
@bcmills
Contributor Author

bcmills commented Nov 18, 2021

I think the correlation with updateTimerPMask was just a red herring: there are lots of unlock2 crashes on the new openbsd-386-70 builder that do not appear to involve updateTimerPMask at all. (Maybe updateTimerPMask is just a frequent cause of “acquiring contended locks in the runtime” or something like that?)

@prattmic
Member

So far I've been having significant trouble reproducing these crashes once I added debuglog messages to lock2/unlock2, which is rather annoying but perhaps a good indication that there is a race happening here.

Interestingly, I am still reproducing the scanstack and freeIndex bugs from #49209 (cc @mknyszek).

@toothrot added the okay-after-beta1 label Nov 19, 2021
@prattmic
Member

The current state of my reproduction / investigation is in https://golang.org/cl/365316.

@gopherbot
Contributor

Change https://golang.org/cl/365316 mentions this issue: DO NOT SUBMIT: investigate unlock2 crashes

@mknyszek
Contributor

mknyszek commented Nov 19, 2021

Alongside the unlock2 failures (i.e. in the same run as an unlock2 failure), both Michael and I have seen crashes in the stack allocator, like

2021/11/17 23:15:24 loading imports: signal: abort trap (core dumped)
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x8091d0f]

runtime stack:
runtime.throw({0x85e7bdf, 0x2a})
    /tmp/workdir/go/src/runtime/panic.go:992 +0x64 fp=0xaab5dd34 sp=0xaab5dd20 pc=0x807afa4
runtime.sigpanic()
    /tmp/workdir/go/src/runtime/signal_unix.go:781 +0x22b fp=0xaab5dd4c sp=0xaab5dd34 pc=0x808feeb
runtime.stackalloc(0x2000)
    /tmp/workdir/go/src/runtime/stack.go:393 +0x23f fp=0xaab5dd8c sp=0xaab5dd4c pc=0x8091d0f
runtime.copystack(0x8c01d10, 0x2000)
    /tmp/workdir/go/src/runtime/stack.go:871 +0x98 fp=0xaab5deb4 sp=0xaab5dd8c pc=0x8092818
runtime.newstack()
    /tmp/workdir/go/src/runtime/stack.go:1110 +0x444 fp=0xaab5df8c sp=0xaab5deb4 pc=0x8092f14
runtime.morestack()
    /tmp/workdir/go/src/runtime/asm_386.s:449 +0x73 fp=0xaab5df90 sp=0xaab5df8c pc=0x80a5a93

I have a core for this, and digging into it, it looks like it crashes with an address of 0 (nil pointer dereference) but when I look at the value in memory that the register is loaded from ((*mcache).stackalloc[2].list), it's definitely not zero.

Like the freeIndex failures, this again looks like a local variable is getting corrupted in some way. In this specific instance, it's zeroed (which may line up with other unlock2 failures). @prattmic suggested that if this is a kernel bug, it could be that returning from a signal handler isn't correctly restoring registers.

What's interesting here too is that these are crashes in the runtime. If, say, an asynchronous preemption signal lands while a thread is in the runtime, it should just return as a no-op. Maybe the reason why we're not seeing failures outside the runtime is because we do our own register spill and restore by injecting the asynchronous preemption onto the stack. However, I think this might be ruled out, because the post-signal-handler spill should then also observe, save, and restore any bad value obtained by returning from the signal handler.

@gopherbot
Contributor

Change https://golang.org/cl/367847 mentions this issue: DO NOT SUBMIT: add cputicks after atomics for synchronization

@aclements
Member

We're pretty confident that this has the same root cause as #49209, so closing in favor of the broader issue.

@prattmic prattmic self-assigned this Jun 24, 2022
@golang golang locked and limited conversation to collaborators Jun 24, 2023