[COMPLETE SYSTEM HANG] "watchdog: BUG: soft lockup - CPU#6 stuck for 23s! [brave:75851]" #17687

Closed
bill88t opened this issue Aug 25, 2021 · 2 comments

Comments

@bill88t

bill88t commented Aug 25, 2021

Description

After attempting to close Brave, I got a complete system freeze.
CPU#6 was stuck on brave and brought the machine down with it.
It may be related to the NVIDIA graphics drivers, but I really don't know.
Note: I have set Brave to NOT run in the background after closing, so all of its threads should have started terminating.
No Brave crash log was generated.

Ubuntu 20.04.3 LTS x86_64 5.4.0-81-generic
Intel i5-10400F (12) @ 4.300GHz
NVIDIA GeForce GT 1030

Steps to Reproduce

At the moment, I don't know how to reproduce it. It was a one-off.
Still, such a CPU lockup should not have happened. EVER.

Brave version (brave://version info)

Brave 1.28.106 Chromium: 92.0.4515.159 (Official Build) (64-bit)
Revision 0185b8a19c88c5dfd3e6c0da6686d799e9bc3b52-refs/branch-heads/4515@{#2052}
OS Linux
JavaScript V8 9.2.230.29
User Agent Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36
Command Line /opt/brave.com/brave/brave --enable-crashpad --enable-dom-distiller --disable-domain-reliability --no-pings --origin-trial-public-key=bYUKPJoPnCxeNvu72j4EmPuK7tr1PAC7SHh8ld9Mw3E=,fMS4mpO6buLQ/QMd+zJmxzty/VQ6B1EUZqoCU04zoRU= --sync-url=https://sync-v2.brave.com/v2 --lso-url=https://no-thanks.invalid --variations-server-url=https://variations.brave.com/seed --enable-features=WebUIDarkMode,PasswordImport,ReducedReferrerGranularity,PrefetchPrivacyChanges,AutoupgradeMixedContent,SafetyTip,LegacyTLSEnforced,DnsOverHttps --disable-features=FirstPartySets,HandwritingRecognitionWebPlatformApi,FlocIdComputedEventLogging,HandwritingRecognitionWebPlatformApiFinch,EnableProfilePickerOnStartup,TextFragmentAnchor,AutofillEnableAccountWalletStorage,FledgeInterestGroupAPI,TrustTokens,DirectSockets,WebOTP,NotificationTriggers,InterestCohortFeaturePolicy,FledgeInterestGroups,SignedExchangePrefetchCacheForNavigations,PrivacySandboxSettings,FederatedLearningOfCohorts,AutofillServerCommunication,LiveCaption,InterestCohortAPIOriginTrial,LangClientHintHeader,SubresourceWebBundles,NetworkTimeServiceQuerying,IdleDetection,SignedExchangeSubresourcePrefetch,EnablePasswordsAccountStorage --flag-switches-begin --flag-switches-end
Executable Path /opt/brave.com/brave/brave
Profile Path /home/bill88t/.config/BraveSoftware/Brave-Browser/Default
Variations AdRewardsStudy:NextPaymentDay
EphemeralStorageStudy:Enabled
NativeCosmeticFilteringStudy:Enabled
PermissionLifetimeReleaseStudy:Enabled
SpeedreaderReleaseStudy:Disabled

Miscellaneous Information:

useful output from "journalctl -b -1":

watchdog: BUG: soft lockup - CPU#6 stuck for 23s! [brave:75851]
Modules linked in: xt_recent ccm vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) rfcomm nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo br_netfilt>
snd_seq_midi_event btusb snd_rawmidi btrtl libarc4 btbcm btintel kvm_intel snd_seq bluetooth uvcvideo videobuf2_vmalloc iwlwifi kvm videobuf2_>
nvme_core ahci realtek libahci wmi video
CPU: 6 PID: 75851 Comm: brave Tainted: P OEL 5.4.0-81-generic #91-Ubuntu
Hardware name: Micro-Star International Co., Ltd. MS-7C83/B460M PRO-VDH WIFI (MS-7C83), BIOS 1.00 05/13/2020
RIP: 0010:_nv035844rm+0xa0/0xe0 [nvidia]
Code: b8 48 01 00 00 e8 80 5a ff ff 49 8b 4c 24 20 48 89 c2 48 89 ef 48 8d b1 48 01 00 00 4c 89 e9 e8 a6 5b ff ff 66 0f 1f 44 00 00 <48> 89 ef >
RSP: 0018:ffffb31cc27dfb70 EFLAGS: 00000203 ORIG_RAX: ffffffffffffff13
RAX: 0000000000000001 RBX: ffff8986c33cf830 RCX: ffff8987746f5978
RDX: ffffdf995e65a948 RSI: ffffdf995e65c9af RDI: ffff89871472ad20
RBP: ffff89871472ad20 R08: 0000000000000020 R09: ffff89871472ad28
R10: 0000000000000000 R11: 0000000000000001 R12: ffff898770df5af8
R13: ffffdf995a91d327 R14: ffff89871472ad98 R15: ffff8986c33cf830
FS: 00007f342f9265c0(0000) GS:ffff89877e980000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f70f6fba9d0 CR3: 000000021ca0a002 CR4: 00000000007606e0
PKRU: 55555554
Call Trace:
? _nv014655rm+0x2ee/0x770 [nvidia]
? _nv037695rm+0xb3/0x150 [nvidia]
? _nv037694rm+0x297/0x4e0 [nvidia]
? _nv037689rm+0x60/0x70 [nvidia]
? _nv037690rm+0x7b/0xb0 [nvidia]
? _nv036056rm+0x40/0xe0 [nvidia]
? _nv000699rm+0x68/0x80 [nvidia]
? rm_cleanup_file_private+0xea/0x160 [nvidia]
? nvidia_close+0x149/0x2d0 [nvidia]
? nvidia_frontend_close+0x2f/0x50 [nvidia]
? __fput+0xcc/0x260
? ____fput+0xe/0x10
? task_work_run+0x8f/0xb0
? do_exit+0x36e/0xaf0
? __secure_computing+0x42/0xe0
? syscall_trace_enter+0x134/0x2b0
? do_group_exit+0x47/0xb0
? __x64_sys_exit_group+0x18/0x20
? do_syscall_64+0x57/0x190
? entry_SYSCALL_64_after_hwframe+0x44/0xa9
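
To make it easier to see how much of the trace above sits inside the nvidia module, here is a minimal Python sketch that tallies call-trace frames by the module tag in square brackets. The function name `frames_by_module` and the label `"kernel"` for untagged frames are assumptions made for illustration; they are not part of any kernel or Brave tooling.

```python
import re
from collections import Counter

# Matches frames such as "? nvidia_close+0x149/0x2d0 [nvidia]".
# The trailing "[module]" tag is optional; core kernel frames have none.
FRAME_RE = re.compile(
    r"\?\s+(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)

def frames_by_module(trace_text):
    """Count call-trace frames per module; untagged frames count as 'kernel'."""
    counts = Counter()
    for line in trace_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            counts[match.group("module") or "kernel"] += 1
    return counts

# A few frames copied from the trace above.
sample = """\
? rm_cleanup_file_private+0xea/0x160 [nvidia]
? nvidia_close+0x149/0x2d0 [nvidia]
? nvidia_frontend_close+0x2f/0x50 [nvidia]
? __fput+0xcc/0x260
? task_work_run+0x8f/0xb0
"""

print(frames_by_module(sample))
```

A trace where the top frames all carry the same module tag is a hint, not proof, that the hang is inside that module's close path.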

@mu-aleph

This also just happened to me: I had one tab open and was closing Brave.

I got a kernel Oops and had to hard reboot:

$ uname -ra
Linux x 5.13.0-28-generic #31~20.04.1-Ubuntu SMP

$ lspci -vvvk | grep -A 2 -i "nvidia" | grep "GTX\|Kernel modules"
49:00.0 VGA compatible controller: NVIDIA Corporation GP102 [GeForce GTX 1080 Ti] (rev a1) (prog-if 00 [VGA controller])
Subsystem: Gigabyte Technology Co., Ltd GP102 [GeForce GTX 1080 Ti]
Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
4a:00.0 VGA compatible controller: NVIDIA Corporation GP102 [GeForce GTX 1080 Ti] (rev a1) (prog-if 00 [VGA controller])
Subsystem: Gigabyte Technology Co., Ltd GP102 [GeForce GTX 1080 Ti]
Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia

$ sudo grep -Eo "CPU: [0-9]" /var/log/syslog | sort | uniq -c
1 CPU: 0
5 CPU: 1
4 CPU: 3
27 CPU: 4
26 CPU: 5
1 CPU: 6
27 CPU: 7

$ sudo grep -Eo "Comm: [a-Z]* Tainted:" /var/log/syslog | sort | uniq -c
1 Comm: brave Tainted:
6 Comm: chrome Tainted:
1 Comm: code Tainted:
27 Comm: thermald Tainted:
26 Comm: VizCompositorTh Tainted:
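
The two grep tallies above can also be sketched in Python, which may be handy when the log is already in a string. This is only an approximation: the helper name `tally` is hypothetical, and the patterns are slightly looser than the grep ones (multi-digit CPU numbers and any non-space process name are accepted).

```python
import re
from collections import Counter

def tally(log_text):
    """Count 'CPU: N' and 'Comm: name Tainted:' occurrences in kernel log text."""
    cpus = Counter(re.findall(r"CPU: (\d+)", log_text))
    comms = Counter(re.findall(r"Comm: (\S+) Tainted:", log_text))
    return cpus, comms

# Two header lines taken from the traces in this issue.
sample = (
    "CPU: 6 PID: 75851 Comm: brave Tainted: P OEL 5.4.0-81-generic #91-Ubuntu\n"
    "CPU: 0 PID: 362562 Comm: brave Tainted: P W OE 5.13.0-28-generic #31~20.04.1-Ubuntu\n"
)

cpus, comms = tally(sample)
print(cpus)
print(comms)
```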

$ brave-browser --version
Brave Browser 98.1.35.101

brave-browser.desktop[362436]: [362474:362492:0217/060029.777537:ERROR:node_controller.cc(585)] Trying to re-add dropped peer C4EA2B8F904A0A12.44B631B1E3F08D06
brave-browser.desktop[362436]: [362476:4:0217/060029.777611:ERROR:node_controller.cc(585)] Trying to re-add dropped peer C4EA2B8F904A0A12.44B631B1E3F08D06
brave-browser.desktop[362436]: [362476:4:0217/060031.505188:ERROR:node_controller.cc(585)] Trying to re-add dropped peer B1ACC5EA2FEA96EC.E48E0A4E57BE17E8
brave-browser.desktop[362436]: [362476:4:0217/060031.520771:ERROR:node_controller.cc(585)] Trying to re-add dropped peer AB8666942D2D6933.459F01C437B3F8D6
systemd[2357]: gnome-launched-brave-browser.desktop-362430.scope: Succeeded.
kernel: [562631.165616] BUG: kernel NULL pointer dereference, address: 000000000000000b
kernel: [562631.165619] #PF: supervisor read access in kernel mode
kernel: [562631.165621] #PF: error_code(0x0000) - not-present page
kernel: [562631.165622] PGD 0 P4D 0
kernel: [562631.165623] Oops: 0000 [#1] SMP PTI
kernel: [562631.165625] CPU: 0 PID: 362562 Comm: brave Tainted: P W OE 5.13.0-28-generic #31~20.04.1-Ubuntu
kernel: [562631.165627] Hardware name: Gigabyte Technology Co., Ltd. Z270X-Gaming 9/Z270X-Gaming 9, BIOS F9f 07/22/2020
kernel: [562631.165628] RIP: 0010:_nv029004rm+0x35/0x90 [nvidia]
kernel: [562631.165913] Code: 57 10 31 c0 48 85 d2 74 2e 48 8b 4f 08 31 c0 48 85 c9 74 0d 48 63 41 14 48 89 d6 48 29 c6 48 89 f0 48 3b 57 18 48 89 07 74 1b <48> 8b 42 08 48 89 47 10 b8 01 00 00 00 48 83 c4 08 c3 66 0f 1f 84
kernel: [562631.165915] RSP: 0018:ffffb2364599bac0 EFLAGS: 00010206
kernel: [562631.165916] RAX: 0000000000000003 RBX: ffff92caebeab030 RCX: ffff92ce34c4ad78
kernel: [562631.165917] RDX: 0000000000000003 RSI: 0000000000000003 RDI: ffff92cd72bfad20
kernel: [562631.165918] RBP: ffff92cd72bfad20 R08: 0000000000000020 R09: ffff92cd72bfad28
kernel: [562631.165919] R10: 0000000000000000 R11: 0000000000000001 R12: ffff92ca87e8c498
kernel: [562631.165920] R13: 0000000000000000 R14: ffff92cd72bfad98 R15: ffff92caebeab030
kernel: [562631.165921] FS: 0000000000000000(0000) GS:ffff92d9bea00000(0000) knlGS:0000000000000000
kernel: [562631.165922] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: [562631.165923] CR2: 000000000000000b CR3: 0000000f1ce10004 CR4: 00000000003706f0
kernel: [562631.165924] Call Trace:
kernel: [562631.165925] <TASK>
kernel: [562631.165927] ? _nv035891rm+0xa8/0xe0 [nvidia]
kernel: [562631.166145] ? _nv014660rm+0x2ee/0x770 [nvidia]
kernel: [562631.166357] ? _nv037748rm+0xb3/0x150 [nvidia]
kernel: [562631.166570] ? _nv037747rm+0x297/0x4e0 [nvidia]
kernel: [562631.166784] ? _nv037742rm+0x60/0x70 [nvidia]
kernel: [562631.166994] ? _nv037743rm+0x7b/0xb0 [nvidia]
kernel: [562631.167208] ? _nv036103rm+0x40/0xe0 [nvidia]
kernel: [562631.167340] ? _nv000699rm+0x68/0x80 [nvidia]
kernel: [562631.167518] ? rm_cleanup_file_private+0xea/0x160 [nvidia]
kernel: [562631.167697] ? free_unref_page+0x59/0x80
kernel: [562631.167701] ? nvidia_close+0x15f/0x2e0 [nvidia]
kernel: [562631.167806] ? nvidia_frontend_close+0x2f/0x50 [nvidia]
kernel: [562631.167912] ? __fput+0x9f/0x250
kernel: [562631.167913] ? ____fput+0xe/0x10
kernel: [562631.167915] ? task_work_run+0x70/0xb0
kernel: [562631.167917] ? do_exit+0x37b/0xaf0
kernel: [562631.167919] ? do_group_exit+0x43/0xb0
kernel: [562631.167920] ? __x64_sys_exit_group+0x18/0x20
kernel: [562631.167921] ? do_syscall_64+0x61/0xb0
kernel: [562631.167924] ? syscall_exit_to_user_mode+0x27/0x50
kernel: [562631.167925] ? __x64_sys_close+0x12/0x40
kernel: [562631.167927] ? do_syscall_64+0x6e/0xb0
kernel: [562631.167929] ? syscall_exit_to_user_mode+0x27/0x50
kernel: [562631.167930] ? __x64_sys_close+0x12/0x40
kernel: [562631.167932] ? do_syscall_64+0x6e/0xb0
kernel: [562631.167933] ? do_syscall_64+0x6e/0xb0
kernel: [562631.167935] ? syscall_exit_to_user_mode+0x27/0x50
kernel: [562631.167936] ? __x64_sys_close+0x12/0x40
kernel: [562631.167938] ? do_syscall_64+0x6e/0xb0
kernel: [562631.167939] ? do_syscall_64+0x6e/0xb0
kernel: [562631.167941] ? do_syscall_64+0x6e/0xb0
kernel: [562631.167943] ? asm_sysvec_apic_timer_interrupt+0xa/0x20
kernel: [562631.167945] ? entry_SYSCALL_64_after_hwframe+0x44/0xae
kernel: [562631.167947] </TASK>
kernel: [562631.168009] CR2: 000000000000000b
kernel: [562631.168011] ---[ end trace 142caa8b05a8a754 ]---
kernel: [562631.294158] RIP: 0010:_nv029004rm+0x35/0x90 [nvidia]
kernel: [562631.294460] Code: 57 10 31 c0 48 85 d2 74 2e 48 8b 4f 08 31 c0 48 85 c9 74 0d 48 63 41 14 48 89 d6 48 29 c6 48 89 f0 48 3b 57 18 48 89 07 74 1b <48> 8b 42 08 48 89 47 10 b8 01 00 00 00 48 83 c4 08 c3 66 0f 1f 84
kernel: [562631.294462] RSP: 0018:ffffb2364599bac0 EFLAGS: 00010206
kernel: [562631.294464] RAX: 0000000000000003 RBX: ffff92caebeab030 RCX: ffff92ce34c4ad78
kernel: [562631.294465] RDX: 0000000000000003 RSI: 0000000000000003 RDI: ffff92cd72bfad20
kernel: [562631.294466] RBP: ffff92cd72bfad20 R08: 0000000000000020 R09: ffff92cd72bfad28
kernel: [562631.294467] R10: 0000000000000000 R11: 0000000000000001 R12: ffff92ca87e8c498
kernel: [562631.294468] R13: 0000000000000000 R14: ffff92cd72bfad98 R15: ffff92caebeab030
kernel: [562631.294469] FS: 0000000000000000(0000) GS:ffff92d9bea00000(0000) knlGS:0000000000000000
kernel: [562631.294470] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: [562631.294471] CR2: 000000000000000b CR3: 0000000170f86001 CR4: 00000000003706f0
kernel: [562631.294473] Fixing recursive fault but reboot is needed!

@iefremov
Contributor

stale
