Add periodictimer to net/MRP to enable retries #48

noelbk · 2013-09-16T17:33:58Z

Added periodic timer from [802.1Q-2011] to retry if MRP loses messages. MRP used to lose JoinIn messages and never retry if it sent messages before the interface became ready.

The periodic timer from the 802.1Q spec never turns off. It fires every second, causing MRP to bounce from state QA to AA and back. You might think that would stop when the registrar responds with an JoinIn, but there's no state in the MRP state table to ignore the periodic timer after getting a reply. The result is MRP sends a JoinIn message every second. This may not be desirable, but it's what the spec requires.

[802.1Q-2011]
http://standards.ieee.org/findstds/standard/802.1Q-2011.html

torvalds · 2013-09-16T19:51:22Z

This is not how to do kernel development. You need to talk to the
maintainers of the code in question, which in this case would be

./scripts/get_maintainer.pl -f net/802/mrp.c:

"David S. Miller" davem@davemloft.net (maintainer:NETWORKING
[GENERAL],commit_signer:4/4=100%)
David Ward david.ward@ll.mit.edu (commit_signer:2/4=50%)
Patrick McHardy kaber@trash.net (commit_signer:1/4=25%)
Stephen Hemminger stephen@networkplumber.org (commit_signer:1/4=25%)
Eric Dumazet edumazet@google.com (commit_signer:1/4=25%)
netdev@vger.kernel.org (open list:NETWORKING [GENERAL])

and I'm pretty sure that they don't want just some random github pull
request either. Patches, explanations, background. And then when
they trust you, you can use the proper git "request-pull" script to
generate a high-quality pull request that you send from a real email
address, not the random stuff github generates. That said, for a
single small commit, an actual emailed patch is probably the way to
go.

github people - is there really no way to turn off Pull requests like
this and have an explanation pointing to
Documentation/SubmittingPatches or something? The kernel doesn't take
github pull requests, and never will. They don't contain enough
information, and they don't have the proper context.

                 Linus

On Mon, Sep 16, 2013 at 1:34 PM, noelbk notifications@github.com wrote:

Added periodic timer from [802.1Q-2011] to retry if MRP loses messages. MRP
used to lose JoinIn messages and never retry if it sent messages before the
interface became ready.

The periodic timer from the 802.1Q spec never turns off. It fires every
second, causing MRP to bounce from state QA to AA and back. You might think
that would stop when the registrar responds with an JoinIn, but there's no
state in the MRP state table to ignore the periodic timer after getting a
reply. The result is MRP sends a JoinIn message every second. This may not
be desirable, but it's what the spec requires.

[802.1Q-2011]
http://standards.ieee.org/findstds/standard/802.1Q-2011.html

You can merge this Pull Request by running

git pull https://github.com/noelbk/linux master

Or view, comment on, or merge it at:

#48

Commit Summary

added periodictimer to mrp to allow retries when packets get lost
cleaned up debug prints before committing back to master
clean up before pull request

File Changes

M include/net/mrp.h (1)
M net/802/mrp.c (45)

Patch Links:

https://github.com/torvalds/linux/pull/48.patch
https://github.com/torvalds/linux/pull/48.diff

noelbk · 2013-09-16T20:00:18Z

Oops, sorry for wasting your time. I tried looking for a HOWTO to submit
this, but totally missed Documentation/SubmittingPatches. Duh.

Cheers,

Noel

On Mon, Sep 16, 2013 at 12:53 PM, Linus Torvalds
notifications@gitpro.ttaallkk.topwrote:

This is not how to do kernel development. You need to talk to the
maintainers of the code in question, which in this case would be

./scripts/get_maintainer.pl -f net/802/mrp.c:

"David S. Miller" davem@davemloft.net (maintainer:NETWORKING
[GENERAL],commit_signer:4/4=100%)
David Ward david.ward@ll.mit.edu (commit_signer:2/4=50%)
Patrick McHardy kaber@trash.net (commit_signer:1/4=25%)
Stephen Hemminger stephen@networkplumber.org (commit_signer:1/4=25%)
Eric Dumazet edumazet@google.com (commit_signer:1/4=25%)
netdev@vger.kernel.org (open list:NETWORKING [GENERAL])

and I'm pretty sure that they don't want just some random github pull
request either. Patches, explanations, background. And then when
they trust you, you can use the proper git "request-pull" script to
generate a high-quality pull request that you send from a real email
address, not the random stuff github generates. That said, for a
single small commit, an actual emailed patch is probably the way to
go.

github people - is there really no way to turn off Pull requests like
this and have an explanation pointing to
Documentation/SubmittingPatches or something? The kernel doesn't take
github pull requests, and never will. They don't contain enough
information, and they don't have the proper context.

Linus

On Mon, Sep 16, 2013 at 1:34 PM, noelbk notifications@github.com wrote:

Added periodic timer from [802.1Q-2011] to retry if MRP loses messages.
MRP
used to lose JoinIn messages and never retry if it sent messages before
the
interface became ready.

The periodic timer from the 802.1Q spec never turns off. It fires every
second, causing MRP to bounce from state QA to AA and back. You might
think
that would stop when the registrar responds with an JoinIn, but there's
no
state in the MRP state table to ignore the periodic timer after getting a
reply. The result is MRP sends a JoinIn message every second. This may
not
be desirable, but it's what the spec requires.

[802.1Q-2011]
http://standards.ieee.org/findstds/standard/802.1Q-2011.html

You can merge this Pull Request by running

git pull https://github.com/noelbk/linux master

Or view, comment on, or merge it at:

#48

Commit Summary

added periodictimer to mrp to allow retries when packets get lost
cleaned up debug prints before committing back to master
clean up before pull request

File Changes

M include/net/mrp.h (1)
M net/802/mrp.c (45)

Patch Links:

https://github.com/torvalds/linux/pull/48.patch
https://github.com/torvalds/linux/pull/48.diff

—
Reply to this email directly or view it on GitHubhttps://github.com//pull/48#issuecomment-24539030
.

As the new x86 CPU bootup printout format code maintainer, I am taking immediate action to improve and clean (and thus indulge my OCD) the reporting of the cores when coming up online. Fix padding to a right-hand alignment, cleanup code and bind reporting width to the max number of supported CPUs on the system, like this: [ 0.074509] smpboot: Booting Node 0, Processors: #1 #2 #3 #4 #5 torvalds#6 torvalds#7 OK [ 0.644008] smpboot: Booting Node 1, Processors: torvalds#8 torvalds#9 torvalds#10 torvalds#11 torvalds#12 torvalds#13 torvalds#14 torvalds#15 OK [ 1.245006] smpboot: Booting Node 2, Processors: torvalds#16 torvalds#17 torvalds#18 torvalds#19 torvalds#20 torvalds#21 torvalds#22 torvalds#23 OK [ 1.864005] smpboot: Booting Node 3, Processors: torvalds#24 torvalds#25 torvalds#26 torvalds#27 torvalds#28 torvalds#29 torvalds#30 torvalds#31 OK [ 2.489005] smpboot: Booting Node 4, Processors: torvalds#32 torvalds#33 torvalds#34 torvalds#35 torvalds#36 torvalds#37 torvalds#38 torvalds#39 OK [ 3.093005] smpboot: Booting Node 5, Processors: torvalds#40 torvalds#41 torvalds#42 torvalds#43 torvalds#44 torvalds#45 torvalds#46 torvalds#47 OK [ 3.698005] smpboot: Booting Node 6, Processors: torvalds#48 torvalds#49 torvalds#50 torvalds#51 #52 #53 torvalds#54 torvalds#55 OK [ 4.304005] smpboot: Booting Node 7, Processors: torvalds#56 torvalds#57 #58 torvalds#59 torvalds#60 torvalds#61 torvalds#62 torvalds#63 OK [ 4.961413] Brought up 64 CPUs and this: [ 0.072367] smpboot: Booting Node 0, Processors: #1 #2 #3 #4 #5 torvalds#6 torvalds#7 OK [ 0.686329] Brought up 8 CPUs Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Libin <huawei.libin@huawei.com> Cc: wangyijing@huawei.com Cc: fenghua.yu@intel.com Cc: guohanjun@huawei.com Cc: paul.gortmaker@windriver.com Link: http://lkml.kernel.org/r/20130927143554.GF4422@pd.tnic Signed-off-by: Ingo Molnar <mingo@kernel.org>

Turn it into (for example): [ 0.073380] x86: Booting SMP configuration: [ 0.074005] .... node #0, CPUs: #1 #2 #3 #4 #5 torvalds#6 torvalds#7 [ 0.603005] .... node #1, CPUs: torvalds#8 torvalds#9 torvalds#10 torvalds#11 torvalds#12 torvalds#13 torvalds#14 torvalds#15 [ 1.200005] .... node #2, CPUs: torvalds#16 torvalds#17 torvalds#18 torvalds#19 torvalds#20 torvalds#21 torvalds#22 torvalds#23 [ 1.796005] .... node #3, CPUs: torvalds#24 torvalds#25 torvalds#26 torvalds#27 torvalds#28 torvalds#29 torvalds#30 torvalds#31 [ 2.393005] .... node #4, CPUs: torvalds#32 torvalds#33 torvalds#34 torvalds#35 torvalds#36 torvalds#37 torvalds#38 torvalds#39 [ 2.996005] .... node #5, CPUs: torvalds#40 torvalds#41 torvalds#42 torvalds#43 torvalds#44 torvalds#45 torvalds#46 torvalds#47 [ 3.600005] .... node torvalds#6, CPUs: torvalds#48 torvalds#49 torvalds#50 torvalds#51 #52 #53 torvalds#54 torvalds#55 [ 4.202005] .... node torvalds#7, CPUs: torvalds#56 torvalds#57 #58 torvalds#59 torvalds#60 torvalds#61 torvalds#62 torvalds#63 [ 4.811005] .... node torvalds#8, CPUs: torvalds#64 torvalds#65 torvalds#66 torvalds#67 torvalds#68 torvalds#69 #70 torvalds#71 [ 5.421006] .... node torvalds#9, CPUs: torvalds#72 torvalds#73 torvalds#74 torvalds#75 torvalds#76 torvalds#77 torvalds#78 torvalds#79 [ 6.032005] .... node torvalds#10, CPUs: torvalds#80 torvalds#81 torvalds#82 torvalds#83 torvalds#84 torvalds#85 torvalds#86 torvalds#87 [ 6.648006] .... node torvalds#11, CPUs: torvalds#88 torvalds#89 torvalds#90 torvalds#91 torvalds#92 torvalds#93 torvalds#94 torvalds#95 [ 7.262005] .... node torvalds#12, CPUs: torvalds#96 torvalds#97 torvalds#98 torvalds#99 torvalds#100 torvalds#101 torvalds#102 torvalds#103 [ 7.865005] .... node torvalds#13, CPUs: torvalds#104 torvalds#105 torvalds#106 torvalds#107 torvalds#108 torvalds#109 torvalds#110 torvalds#111 [ 8.466005] .... node torvalds#14, CPUs: torvalds#112 torvalds#113 torvalds#114 torvalds#115 torvalds#116 torvalds#117 torvalds#118 torvalds#119 [ 9.073006] .... node torvalds#15, CPUs: torvalds#120 torvalds#121 torvalds#122 torvalds#123 torvalds#124 torvalds#125 torvalds#126 torvalds#127 [ 9.679901] x86: Booted up 16 nodes, 128 CPUs and drop useless elements. Change num_digits() to hpa's division-avoiding, cell-phone-typed version which he went at great lengths and pains to submit on a Saturday evening. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: huawei.libin@huawei.com Cc: wangyijing@huawei.com Cc: fenghua.yu@intel.com Cc: guohanjun@huawei.com Cc: paul.gortmaker@windriver.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20130930095624.GB16383@pd.tnic Signed-off-by: Ingo Molnar <mingo@kernel.org>

This reverts commit 9d046cc. Commit 9d046cc marks all state tables with __initdata, but the state table may be accessed when doing CPU online, which then causing system crash as below: [ 204.188841] BUG: unable to handle kernel paging request at ffffffff8227cce8 [ 204.196844] IP: [<ffffffff814aa1c0>] intel_idle_cpu_init+0x40/0x130 [ 204.203996] PGD 1e11067 PUD 1e12063 PMD 455859063 PTE 800000000227c062 [ 204.211638] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC [ 204.216975] Modules linked in: x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd gpio_ich microcode joydev sb_edac edac_core ipmi_si lpc_ich ipmi_msghandler lp tpm_tis parport wmi mac_hid acpi_pad hid_generic ixgbe isci usbhid dca hid libsas ptp ahci libahci scsi_transport_sas megaraid_sas pps_core mdio [ 204.262815] CPU: 11 PID: 1489 Comm: bash Not tainted 3.13.0-rc7+ torvalds#48 [ 204.269993] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRIVTIN1.86B.0047.L09.1312061514 12/06/2013 [ 204.281646] task: ffff8804303a24a0 ti: ffff880440fac000 task.ti: ffff880440fac000 [ 204.290311] RIP: 0010:[<ffffffff814aa1c0>] [<ffffffff814aa1c0>] intel_idle_cpu_init+0x40/0x130 [ 204.300184] RSP: 0018:ffff880440fadd28 EFLAGS: 00010286 [ 204.306192] RAX: ffffffff8227cca0 RBX: ffffe8fff1a03400 RCX: 0000000000000007 [ 204.314244] RDX: ffff88045f400000 RSI: 0000000000000009 RDI: 0000000000001120 [ 204.322296] RBP: ffff880440fadd38 R08: 0000000000000000 R09: 0000000000000001 [ 204.330411] R10: 0000000000000001 R11: 0000000000000000 R12: 000000000000001e [ 204.338482] R13: 00000000ffffffdb R14: 0000000000000001 R15: 0000000000000000 [ 204.346743] FS: 00007f64f7b0c740(0000) GS:ffff88045ce00000(0000) knlGS:0000000000000000 [ 204.355919] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 204.362449] CR2: ffffffff8227cce8 CR3: 0000000444ab0000 CR4: 00000000001407e0 [ 204.370520] Stack: [ 204.372853] 000000000000001e ffffffff81f10240 ffff880440fadd50 ffffffff814aa307 [ 204.381519] ffffffff81ea80e0 ffff880440fadda0 ffffffff8185a230 0000000000000000 [ 204.390196] 000000000000001e 0000000000000002 0000000000000002 0000000000000000 [ 204.398856] Call Trace: [ 204.401683] [<ffffffff814aa307>] cpu_hotplug_notify+0x57/0x70 [ 204.408638] [<ffffffff8185a230>] notifier_call_chain+0x100/0x150 [ 204.415553] [<ffffffff810a7dae>] __raw_notifier_call_chain+0xe/0x10 [ 204.422772] [<ffffffff81072163>] cpu_notify+0x23/0x50 [ 204.428616] [<ffffffff810723b2>] _cpu_up+0x132/0x1a0 [ 204.434361] [<ffffffff8107249d>] cpu_up+0x7d/0xa0 [ 204.439819] [<ffffffff81836c9c>] cpu_subsys_online+0x3c/0x90 [ 204.446345] [<ffffffff81554625>] device_online+0x45/0xa0 [ 204.452471] [<ffffffff815546ce>] online_store+0x4e/0x80 [ 204.458511] [<ffffffff815519a8>] dev_attr_store+0x18/0x30 [ 204.464744] [<ffffffff812a68f1>] sysfs_write_file+0x151/0x1c0 [ 204.471681] [<ffffffff81217ef1>] vfs_write+0xe1/0x160 [ 204.477524] [<ffffffff8121889c>] SyS_write+0x4c/0x90 [ 204.483270] [<ffffffff8185f2ed>] system_call_fastpath+0x1a/0x1f [ 204.490081] Code: 41 54 41 89 fc 8b 3d 48 25 85 01 53 48 8b 1d 30 25 85 01 48 03 1c c5 40 90 fb 81 48 8b 05 19 25 85 01 c7 43 0c 01 00 00 00 66 90 <48> 83 78 48 00 74 4f 41 83 c0 01 41 39 f0 7e 10 48 c7 c7 38 79 [ 204.515723] RIP [<ffffffff814aa1c0>] intel_idle_cpu_init+0x40/0x130 [ 204.522996] RSP <ffff880440fadd28> [ 204.526976] CR2: ffffffff8227cce8 [ 204.530766] ---[ end trace 336f56cc3d1cfc8c ]--- Fixes: 9d046cc (intel_idle: mark states tables with __initdata tag) Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

…atch-fixes WARNING: Missing a blank line after declarations torvalds#48: FILE: fs/notify/fanotify/fanotify_user.c:495: + __u32 tmask = fsn_mark->mask & ~mask; + if (flags & FAN_MARK_ONDIR) WARNING: Missing a blank line after declarations torvalds#67: FILE: fs/notify/fanotify/fanotify_user.c:579: + __u32 tmask = fsn_mark->mask | mask; + if (flags & FAN_MARK_ONDIR) total: 0 errors, 2 warnings, 54 lines checked ./patches/fanotify-dont-set-fan_ondir-implicitly-on-a-marks-ignored-mask.patch has style problems, please review. If any of these errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. Please run checkpatch prior to sending patches Cc: Lino Sanfilippo <LinoSanfilippo@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Sumeone needs to buy a tab key. WARNING: please, no spaces at the start of a line torvalds#29: FILE: security/tomoyo/util.c:951: + struct file *exe_file;$ WARNING: please, no spaces at the start of a line torvalds#30: FILE: security/tomoyo/util.c:952: + const char *cp;$ WARNING: please, no spaces at the start of a line torvalds#31: FILE: security/tomoyo/util.c:953: + struct mm_struct *mm = current->mm;$ WARNING: please, no spaces at the start of a line torvalds#40: FILE: security/tomoyo/util.c:955: + if (!mm)$ WARNING: suspect code indent for conditional statements (7, 15) torvalds#40: FILE: security/tomoyo/util.c:955: + if (!mm) + return NULL; WARNING: please, no spaces at the start of a line torvalds#42: FILE: security/tomoyo/util.c:957: + exe_file = get_mm_exe_file(mm);$ WARNING: please, no spaces at the start of a line torvalds#43: FILE: security/tomoyo/util.c:958: + if (!exe_file)$ WARNING: suspect code indent for conditional statements (7, 15) torvalds#43: FILE: security/tomoyo/util.c:958: + if (!exe_file) + return NULL; WARNING: please, no spaces at the start of a line torvalds#46: FILE: security/tomoyo/util.c:961: + cp = tomoyo_realpath_from_path(&exe_file->f_path);$ WARNING: please, no spaces at the start of a line torvalds#47: FILE: security/tomoyo/util.c:962: + fput(exe_file);$ WARNING: please, no spaces at the start of a line torvalds#48: FILE: security/tomoyo/util.c:963: + return cp;$ total: 0 errors, 11 warnings, 28 lines checked ./patches/tomoyo-reduce-mmap_sem-hold-for-mm-exe_file.patch has style problems, please review. If any of these errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. Please run checkpatch prior to sending patches Cc: Davidlohr Bueso <dbueso@suse.de> Cc: James Morris <jmorris@namei.org> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Commit 297d716 ("power_supply: Change ownership from driver to core") inverted the logic in battery_notify(). As an effect already present battery was re-added on each system suspend or hibernation. WARNING: CPU: 0 PID: 303 at ../fs/sysfs/dir.c:31 sysfs_warn_dup+0x68/0x80() sysfs: cannot create duplicate filename '/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/PNP0C0A:00/power_supply/BAT0' CPU: 0 PID: 303 Comm: rtcwake Not tainted 4.0.0-ARCH-02621-g07e6253af953 #48 Call Trace: sysfs_create_dir_ns+0x8d/0xa0 kobject_add_internal+0xb6/0x370 kobject_add+0x6f/0xd0 device_add+0x120/0x6c0 __power_supply_register+0x145/0x290 power_supply_register_no_ws+0x10/0x20 sysfs_add_battery+0x84/0xc5 [battery] battery_notify+0x45/0x6b [battery] notifier_call_chain+0x4f/0x80 __blocking_notifier_call_chain+0x4b/0x70 blocking_notifier_call_chain+0x16/0x20 pm_notifier_call_chain+0x1a/0x40 pm_suspend+0x3ed/0x4e0 Signed-off-by: Krzysztof Kozlowski <k.kozlowski.k@gmail.com> Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-By: Sebastian Reichel <sre@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Ci20 v3.18 Add audio over hdmi + mono sound fixes

the returned buffer of register_sysctl() is stored into net_header variable, but net_header is not used after, and compiler maybe optimise the variable out, and lead kmemleak reported the below warning comm "swapper/0", pid 1, jiffies 4294937448 (age 267.270s) hex dump (first 32 bytes): 90 38 8b 01 c0 ff ff ff 00 00 00 00 01 00 00 00 .8.............. 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffc00020f134>] create_object+0x10c/0x2a0 [<ffffffc00070ff44>] kmemleak_alloc+0x54/0xa0 [<ffffffc0001fe378>] __kmalloc+0x1f8/0x4f8 [<ffffffc00028e984>] __register_sysctl_table+0x64/0x5a0 [<ffffffc00028eef0>] register_sysctl+0x30/0x40 [<ffffffc00099c304>] net_sysctl_init+0x20/0x58 [<ffffffc000994dd8>] sock_init+0x10/0xb0 [<ffffffc0000842e0>] do_one_initcall+0x90/0x1b8 [<ffffffc000966bac>] kernel_init_freeable+0x218/0x2f0 [<ffffffc00070ed6c>] kernel_init+0x1c/0xe8 [<ffffffc000083bfc>] ret_from_fork+0xc/0x50 [<ffffffffffffffff>] 0xffffffffffffffff <<end check kmemleak>> Before fix, the objdump result on ARM64: 0000000000000000 <net_sysctl_init>: 0: a9be7bfd stp x29, x30, [sp,#-32]! 4: 90000001 adrp x1, 0 <net_sysctl_init> 8: 90000000 adrp x0, 0 <net_sysctl_init> c: 910003fd mov x29, sp 10: 91000021 add x1, x1, #0x0 14: 91000000 add x0, x0, #0x0 18: a90153f3 stp x19, x20, [sp,torvalds#16] 1c: 12800174 mov w20, #0xfffffff4 // #-12 20: 94000000 bl 0 <register_sysctl> 24: b4000120 cbz x0, 48 <net_sysctl_init+0x48> 28: 90000013 adrp x19, 0 <net_sysctl_init> 2c: 91000273 add x19, x19, #0x0 30: 9101a260 add x0, x19, #0x68 34: 94000000 bl 0 <register_pernet_subsys> 38: 2a0003f4 mov w20, w0 3c: 35000060 cbnz w0, 48 <net_sysctl_init+0x48> 40: aa1303e0 mov x0, x19 44: 94000000 bl 0 <register_sysctl_root> 48: 2a1403e0 mov w0, w20 4c: a94153f3 ldp x19, x20, [sp,torvalds#16] 50: a8c27bfd ldp x29, x30, [sp],torvalds#32 54: d65f03c0 ret After: 0000000000000000 <net_sysctl_init>: 0: a9bd7bfd stp x29, x30, [sp,#-48]! 4: 90000000 adrp x0, 0 <net_sysctl_init> 8: 910003fd mov x29, sp c: a90153f3 stp x19, x20, [sp,torvalds#16] 10: 90000013 adrp x19, 0 <net_sysctl_init> 14: 91000000 add x0, x0, #0x0 18: 91000273 add x19, x19, #0x0 1c: f90013f5 str x21, [sp,torvalds#32] 20: aa1303e1 mov x1, x19 24: 12800175 mov w21, #0xfffffff4 // #-12 28: 94000000 bl 0 <register_sysctl> 2c: f9002260 str x0, [x19,torvalds#64] 30: b40001c0 cbz x0, 68 <net_sysctl_init+0x68> 34: 90000014 adrp x20, 0 <net_sysctl_init> 38: 91000294 add x20, x20, #0x0 3c: 9101a280 add x0, x20, #0x68 40: 94000000 bl 0 <register_pernet_subsys> 44: 2a0003f5 mov w21, w0 48: 35000080 cbnz w0, 58 <net_sysctl_init+0x58> 4c: aa1403e0 mov x0, x20 50: 94000000 bl 0 <register_sysctl_root> 54: 14000005 b 68 <net_sysctl_init+0x68> 58: f9402260 ldr x0, [x19,torvalds#64] 5c: 94000000 bl 0 <unregister_sysctl_table> 60: f9402260 ldr x0, [x19,torvalds#64] 64: 94000000 bl 0 <kfree> 68: 2a1503e0 mov w0, w21 6c: f94013f5 ldr x21, [sp,torvalds#32] 70: a94153f3 ldp x19, x20, [sp,torvalds#16] 74: a8c37bfd ldp x29, x30, [sp],torvalds#48 78: d65f03c0 ret Add the possible error handle to free the net_header to remove the kmemleak warning Signed-off-by: Li RongQing <roy.qing.li@gmail.com>

the returned buffer of register_sysctl() is stored into net_header variable, but net_header is not used after, and compiler maybe optimise the variable out, and lead kmemleak reported the below warning comm "swapper/0", pid 1, jiffies 4294937448 (age 267.270s) hex dump (first 32 bytes): 90 38 8b 01 c0 ff ff ff 00 00 00 00 01 00 00 00 .8.............. 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffc00020f134>] create_object+0x10c/0x2a0 [<ffffffc00070ff44>] kmemleak_alloc+0x54/0xa0 [<ffffffc0001fe378>] __kmalloc+0x1f8/0x4f8 [<ffffffc00028e984>] __register_sysctl_table+0x64/0x5a0 [<ffffffc00028eef0>] register_sysctl+0x30/0x40 [<ffffffc00099c304>] net_sysctl_init+0x20/0x58 [<ffffffc000994dd8>] sock_init+0x10/0xb0 [<ffffffc0000842e0>] do_one_initcall+0x90/0x1b8 [<ffffffc000966bac>] kernel_init_freeable+0x218/0x2f0 [<ffffffc00070ed6c>] kernel_init+0x1c/0xe8 [<ffffffc000083bfc>] ret_from_fork+0xc/0x50 [<ffffffffffffffff>] 0xffffffffffffffff <<end check kmemleak>> Before fix, the objdump result on ARM64: 0000000000000000 <net_sysctl_init>: 0: a9be7bfd stp x29, x30, [sp,#-32]! 4: 90000001 adrp x1, 0 <net_sysctl_init> 8: 90000000 adrp x0, 0 <net_sysctl_init> c: 910003fd mov x29, sp 10: 91000021 add x1, x1, #0x0 14: 91000000 add x0, x0, #0x0 18: a90153f3 stp x19, x20, [sp,torvalds#16] 1c: 12800174 mov w20, #0xfffffff4 // #-12 20: 94000000 bl 0 <register_sysctl> 24: b4000120 cbz x0, 48 <net_sysctl_init+0x48> 28: 90000013 adrp x19, 0 <net_sysctl_init> 2c: 91000273 add x19, x19, #0x0 30: 9101a260 add x0, x19, #0x68 34: 94000000 bl 0 <register_pernet_subsys> 38: 2a0003f4 mov w20, w0 3c: 35000060 cbnz w0, 48 <net_sysctl_init+0x48> 40: aa1303e0 mov x0, x19 44: 94000000 bl 0 <register_sysctl_root> 48: 2a1403e0 mov w0, w20 4c: a94153f3 ldp x19, x20, [sp,torvalds#16] 50: a8c27bfd ldp x29, x30, [sp],torvalds#32 54: d65f03c0 ret After: 0000000000000000 <net_sysctl_init>: 0: a9bd7bfd stp x29, x30, [sp,#-48]! 4: 90000000 adrp x0, 0 <net_sysctl_init> 8: 910003fd mov x29, sp c: a90153f3 stp x19, x20, [sp,torvalds#16] 10: 90000013 adrp x19, 0 <net_sysctl_init> 14: 91000000 add x0, x0, #0x0 18: 91000273 add x19, x19, #0x0 1c: f90013f5 str x21, [sp,torvalds#32] 20: aa1303e1 mov x1, x19 24: 12800175 mov w21, #0xfffffff4 // #-12 28: 94000000 bl 0 <register_sysctl> 2c: f9002260 str x0, [x19,torvalds#64] 30: b40001a0 cbz x0, 64 <net_sysctl_init+0x64> 34: 90000014 adrp x20, 0 <net_sysctl_init> 38: 91000294 add x20, x20, #0x0 3c: 9101a280 add x0, x20, #0x68 40: 94000000 bl 0 <register_pernet_subsys> 44: 2a0003f5 mov w21, w0 48: 35000080 cbnz w0, 58 <net_sysctl_init+0x58> 4c: aa1403e0 mov x0, x20 50: 94000000 bl 0 <register_sysctl_root> 54: 14000004 b 64 <net_sysctl_init+0x64> 58: f9402260 ldr x0, [x19,torvalds#64] 5c: 94000000 bl 0 <unregister_sysctl_table> 60: f900227f str xzr, [x19,torvalds#64] 64: 2a1503e0 mov w0, w21 68: f94013f5 ldr x21, [sp,torvalds#32] 6c: a94153f3 ldp x19, x20, [sp,torvalds#16] 70: a8c37bfd ldp x29, x30, [sp],torvalds#48 74: d65f03c0 ret Add the possible error handle to free the net_header to remove the kmemleak warning Signed-off-by: Li RongQing <roy.qing.li@gmail.com>

the returned buffer of register_sysctl() is stored into net_header variable, but net_header is not used after, and compiler maybe optimise the variable out, and lead kmemleak reported the below warning comm "swapper/0", pid 1, jiffies 4294937448 (age 267.270s) hex dump (first 32 bytes): 90 38 8b 01 c0 ff ff ff 00 00 00 00 01 00 00 00 .8.............. 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffc00020f134>] create_object+0x10c/0x2a0 [<ffffffc00070ff44>] kmemleak_alloc+0x54/0xa0 [<ffffffc0001fe378>] __kmalloc+0x1f8/0x4f8 [<ffffffc00028e984>] __register_sysctl_table+0x64/0x5a0 [<ffffffc00028eef0>] register_sysctl+0x30/0x40 [<ffffffc00099c304>] net_sysctl_init+0x20/0x58 [<ffffffc000994dd8>] sock_init+0x10/0xb0 [<ffffffc0000842e0>] do_one_initcall+0x90/0x1b8 [<ffffffc000966bac>] kernel_init_freeable+0x218/0x2f0 [<ffffffc00070ed6c>] kernel_init+0x1c/0xe8 [<ffffffc000083bfc>] ret_from_fork+0xc/0x50 [<ffffffffffffffff>] 0xffffffffffffffff <<end check kmemleak>> Before fix, the objdump result on ARM64: 0000000000000000 <net_sysctl_init>: 0: a9be7bfd stp x29, x30, [sp,#-32]! 4: 90000001 adrp x1, 0 <net_sysctl_init> 8: 90000000 adrp x0, 0 <net_sysctl_init> c: 910003fd mov x29, sp 10: 91000021 add x1, x1, #0x0 14: 91000000 add x0, x0, #0x0 18: a90153f3 stp x19, x20, [sp,torvalds#16] 1c: 12800174 mov w20, #0xfffffff4 // #-12 20: 94000000 bl 0 <register_sysctl> 24: b4000120 cbz x0, 48 <net_sysctl_init+0x48> 28: 90000013 adrp x19, 0 <net_sysctl_init> 2c: 91000273 add x19, x19, #0x0 30: 9101a260 add x0, x19, #0x68 34: 94000000 bl 0 <register_pernet_subsys> 38: 2a0003f4 mov w20, w0 3c: 35000060 cbnz w0, 48 <net_sysctl_init+0x48> 40: aa1303e0 mov x0, x19 44: 94000000 bl 0 <register_sysctl_root> 48: 2a1403e0 mov w0, w20 4c: a94153f3 ldp x19, x20, [sp,torvalds#16] 50: a8c27bfd ldp x29, x30, [sp],torvalds#32 54: d65f03c0 ret After: 0000000000000000 <net_sysctl_init>: 0: a9bd7bfd stp x29, x30, [sp,#-48]! 4: 90000000 adrp x0, 0 <net_sysctl_init> 8: 910003fd mov x29, sp c: a90153f3 stp x19, x20, [sp,torvalds#16] 10: 90000013 adrp x19, 0 <net_sysctl_init> 14: 91000000 add x0, x0, #0x0 18: 91000273 add x19, x19, #0x0 1c: f90013f5 str x21, [sp,torvalds#32] 20: aa1303e1 mov x1, x19 24: 12800175 mov w21, #0xfffffff4 // #-12 28: 94000000 bl 0 <register_sysctl> 2c: f9002260 str x0, [x19,torvalds#64] 30: b40001a0 cbz x0, 64 <net_sysctl_init+0x64> 34: 90000014 adrp x20, 0 <net_sysctl_init> 38: 91000294 add x20, x20, #0x0 3c: 9101a280 add x0, x20, #0x68 40: 94000000 bl 0 <register_pernet_subsys> 44: 2a0003f5 mov w21, w0 48: 35000080 cbnz w0, 58 <net_sysctl_init+0x58> 4c: aa1403e0 mov x0, x20 50: 94000000 bl 0 <register_sysctl_root> 54: 14000004 b 64 <net_sysctl_init+0x64> 58: f9402260 ldr x0, [x19,torvalds#64] 5c: 94000000 bl 0 <unregister_sysctl_table> 60: f900227f str xzr, [x19,torvalds#64] 64: 2a1503e0 mov w0, w21 68: f94013f5 ldr x21, [sp,torvalds#32] 6c: a94153f3 ldp x19, x20, [sp,torvalds#16] 70: a8c37bfd ldp x29, x30, [sp],torvalds#48 74: d65f03c0 ret Add the possible error handle to free the net_header to remove the kmemleak warning Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>

I get the splat below when modprobing/rmmoding EDAC drivers. It happens because bus->name is invalid after bus_unregister() has run. The Code: section below corresponds to: .loc 1 1108 0 movq 672(%rbx), %rax # mci_1(D)->bus, mci_1(D)->bus .loc 1 1109 0 popq %rbx # .loc 1 1108 0 movq (%rax), %rdi # _7->name, jmp kfree # and %rax has some funky stuff 2030203020312030 which looks a lot like something walked over it. Fix that by saving the name ptr before doing stuff to string it points to. general protection fault: 0000 [#1] SMP Modules linked in: ... CPU: 4 PID: 10318 Comm: modprobe Tainted: G I EN 3.12.51-11-default+ torvalds#48 Hardware name: HP ProLiant DL380 G7, BIOS P67 05/05/2011 task: ffff880311320280 ti: ffff88030da3e000 task.ti: ffff88030da3e000 RIP: 0010:[<ffffffffa019da92>] [<ffffffffa019da92>] edac_unregister_sysfs+0x22/0x30 [edac_core] RSP: 0018:ffff88030da3fe28 EFLAGS: 00010292 RAX: 2030203020312030 RBX: ffff880311b4e000 RCX: 000000000000095c RDX: 0000000000000001 RSI: ffff880327bb9600 RDI: 0000000000000286 RBP: ffff880311b4e750 R08: 0000000000000000 R09: ffffffff81296110 R10: 0000000000000400 R11: 0000000000000000 R12: ffff88030ba1ac68 R13: 0000000000000001 R14: 00000000011b02f0 R15: 0000000000000000 FS: 00007fc9bf8f5700(0000) GS:ffff8801a7c40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000403c90 CR3: 000000019ebdf000 CR4: 00000000000007e0 Stack: Call Trace: i7core_unregister_mci.isra.9 i7core_remove pci_device_remove __device_release_driver driver_detach bus_remove_driver pci_unregister_driver i7core_exit SyS_delete_module system_call_fastpath 0x7fc9bf426536 Code: 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 53 48 89 fb e8 52 2a 1f e1 48 8b bb a0 02 00 00 e8 46 59 1f e1 48 8b 83 a0 02 00 00 5b <48> 8b 38 e9 26 9a fe e0 66 0f 1f 44 00 00 66 66 66 66 90 48 8b RIP [<ffffffffa019da92>] edac_unregister_sysfs+0x22/0x30 [edac_core] RSP <ffff88030da3fe28> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: <stable@vger.kernel.org> # v3.6.. Fixes: 7a623c0 ("edac: rewrite the sysfs code to use struct device")

GIT 4e757c85baca72dda85112d53a2d8e799a640f48 commit 4675390a9e7183bf45590e84a183e22e32c485a7 Author: Geert Uytterhoeven <geert@linux-m68k.org> Date: Mon Dec 7 10:09:06 2015 +0100 ethernet: aurora: AURORA_NB8800 should depend on HAS_DMA If NO_DMA=y: ERROR: "dma_map_single" [drivers/net/ethernet/aurora/nb8800.ko] undefined! ERROR: "dma_unmap_page" [drivers/net/ethernet/aurora/nb8800.ko] undefined! ERROR: "dma_sync_single_for_cpu" [drivers/net/ethernet/aurora/nb8800.ko] undefined! ERROR: "dma_unmap_single" [drivers/net/ethernet/aurora/nb8800.ko] undefined! ERROR: "dma_alloc_coherent" [drivers/net/ethernet/aurora/nb8800.ko] undefined! ERROR: "dma_mapping_error" [drivers/net/ethernet/aurora/nb8800.ko] undefined! ERROR: "dma_map_page" [drivers/net/ethernet/aurora/nb8800.ko] undefined! ERROR: "dma_free_coherent" [drivers/net/ethernet/aurora/nb8800.ko] undefined! Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Mans Rullgard <mans@mansr.com> Signed-off-by: David S. Miller <davem@davemloft.net> commit 668dda06d48fc16a5b40e6a32057bd18589e3f95 Author: Sunil Goutham <sgoutham@cavium.com> Date: Mon Dec 7 10:30:33 2015 +0530 net, thunderx: Remove unnecessary rcv buffer start address management Since we have moved on to using allocated pages to carve receive buffers instead of netdev_alloc_skb() there is no need to store any pointers for later retrieval. Earlier we had to store skb and skb->data pointers which later are used to handover received packet to network stack. This will avoid an unnecessary cache miss as well. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> commit b45ceb406e4fd3045180b8d70bff60b1d43c7ff4 Author: Yury Norov <yury.norov@auriga.com> Date: Mon Dec 7 10:30:32 2015 +0530 net: thunderx: nicvf_queues: nivc_*_intr: remove duplication The same switch-case repeates for nivc_*_intr functions. In this patch it is moved to a helper nicvf_int_type_to_mask(). By the way: - Unneeded write to NICVF register dropped if int_type is unknown. - netdev_dbg() is used instead of netdev_err(). Signed-off-by: Yury Norov <yury.norov@auriga.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Acked-by: Vadim Lomovtsev <Vadim.Lomovtsev@caiumnetworks.com> Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> commit 2dc487b6ebe8593d694034680389579edc7c1bd6 Author: Andrew Lunn <andrew@lunn.ch> Date: Wed Dec 2 00:33:32 2015 +0100 ARM: mvebu: update v5 defconfig for Orion5x machines Now that Orion5x is part of the multiarch kernel, add it to mvebu_v5_defconfig. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> commit 5ff2a92d4e131a846fdb0aa91b50c46efc7d94c9 Author: Andrew Lunn <andrew@lunn.ch> Date: Wed Dec 2 00:33:31 2015 +0100 ARM: mvebu: Reenable DSA in mvebu_v5_defconfig DSA now depends on switchdev. Enable it, and re-enable DSA and its drivers, which were removed when mvebu_v5_defconfig was regenerated. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> commit f549707af5650162434a8627c95482a829555450 Author: Borislav Petkov <bp@suse.de> Date: Mon Nov 30 19:02:01 2015 +0100 EDAC: Rework workqueue handling Hide the EDAC workqueue pointer in a separate compilation unit and add accessors for the workqueue manipulations needed. Remove edac_pci_reset_delay_period() which wasn't used by anything. It seems it got added without a user with 91b99041c1d5 ("drivers/edac: updated PCI monitoring") Signed-off-by: Borislav Petkov <bp@suse.de> commit 56fb0cc3b7b6a02b90f2e3457702f014259b1e24 Author: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Date: Tue Nov 17 04:38:15 2015 +0300 video: fbdev: rivafb: unlock chip before probiding EDID At least NV3 requires for chip to be unlocked before it is possible to access I2C registers. Without it, it is not possible to read EDID. Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> commit c5fec49f4208f67ef4574af956091093b11c566a Author: Arnd Bergmann <arnd@arndb.de> Date: Fri Nov 20 22:48:36 2015 +0100 fbdev: sm712fb: avoid unused function warnings The sm712fb framebuffer driver encloses the power-management functions in #ifdef CONFIG_PM, but the smtcfb_pci_suspend/resume functions are only really used when CONFIG_PM_SLEEP is also set, as a frequent gcc warning shows: fbdev/sm712fb.c:1549:12: warning: 'smtcfb_pci_suspend' defined but not used fbdev/sm712fb.c:1572:12: warning: 'smtcfb_pci_resume' defined but not used The driver also avoids using the SIMPLE_DEV_PM_OPS macro when CONFIG_PM is unset, which is redundant. This changes the driver to remove the #ifdef and instead mark the functions as __maybe_unused, which is a nicer anyway, as it provides build testing for all the code in all configurations and is harder to get wrong. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> commit 0f08cf9f7bdf80ebe1ec366de43cd251e39d1e2d Author: Arnd Bergmann <arnd@arndb.de> Date: Fri Nov 20 22:47:41 2015 +0100 fbdev: auo_k190x: avoid unused function warnings The auo_k190x framebuffer driver encloses the power-management functions in #ifdef CONFIG_PM, but the auok190x_suspend/resume functions are only really used when CONFIG_PM_SLEEP is also set, as a frequent gcc warning shows: drivers/video/fbdev/auo_k190x.c:859:12: warning: 'auok190x_suspend' defined but not used drivers/video/fbdev/auo_k190x.c:899:12: warning: 'auok190x_resume' defined but not used This changes the driver to remove the #ifdef and instead mark the functions as __maybe_unused, which is a nicer anyway, as it provides build testing for all the code in all configurations and is harder to get wrong. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> commit 2a48fd5f89910ed6ed45a953f886de249b496b6c Author: Tejun Heo <tj@kernel.org> Date: Mon Dec 7 10:58:57 2015 -0500 workqueue: warn if memory reclaim tries to flush !WQ_MEM_RECLAIM workqueue Task or work item involved in memory reclaim trying to flush a non-WQ_MEM_RECLAIM workqueue or one of its work items can lead to deadlock. Trigger WARN_ONCE() if such conditions are detected. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> commit 1add6cb735c5a8e9f411d16680b60c0b6a66de49 Author: Arnd Bergmann <arnd@arndb.de> Date: Fri Nov 27 15:33:11 2015 +0100 fbdev: sis: enforce selection of at least one backend The sis framebuffer driver complains with a compile-time warning if neither the FB_SIS_300 nor FB_SIS_315 symbols are selected: drivers/video/fbdev/sis/sis_main.c:61:2: warning: #warning Neither CONFIG_FB_SIS_300 nor CONFIG_FB_SIS_315 is se This is reasonable because it doesn't work in that case, but it's also annoying for randconfig builds and is one of the most common warnings I'm seeing on ARM now. This changes the Kconfig logic to prevent the silly configuration, by always selecting the FB_SIS_300 variant if the other one is not set. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> commit bfc98c9c2e0ced00b79771bb65f1a79fd02041dd Author: Dan Carpenter <dan.carpenter@oracle.com> Date: Fri Dec 4 16:14:58 2015 +0300 OMAPDSS: DSS: fix a warning message The WARN() macro has to take a condition. The current code will just print the stack trace and the function name instead of the intended warning message. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> commit b79f09261174f0782f6d6b7b3fe48eb4b0ec9d06 Author: Geert Uytterhoeven <geert+renesas@glider.be> Date: Fri Dec 4 17:01:43 2015 +0100 fbdev: Remove unused SH-Mobile HDMI driver As of commit 44d88c754e57a6d9 ("ARM: shmobile: Remove legacy SoC code for R-Mobile A1"), the SH-Mobile HDMI driver is no longer used. In theory it could still be used on R-Mobile A1 SoCs, but that requires adding DT support to the driver, which is not planned. Remove the driver, it can be resurrected from git history when needed. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> commit e11362bb25d97ea1cbe9e3b1e5f3d32aa4e75e13 Author: Yuan Sun <sunyuan3@huawei.com> Date: Mon Dec 7 10:28:46 2015 -0500 Subject: cgroup: Fix incomplete dd command in blkio documentation Signed-off-by: Yuan Sun <sunyuan3@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org> commit 2286bbcb6bd3823311279f04a7962941f94f8c58 Author: Michael S. Tsirkin <mst@redhat.com> Date: Sun Dec 6 13:31:30 2015 +0200 virtio: drop heuristics on ring full Old hypervisors used tricks for selecting signalling, but modern ones use event index to detect ring full. Drop heuristics in this case. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> commit a05c3fb05e2e46e46eeca5aff2d387d63f58665a Author: Michael S. Tsirkin <mst@redhat.com> Date: Sun Dec 6 13:30:59 2015 +0200 virtio_ring: check used idx correctly on interrupt squash ino RING_POLL Signed-off-by: Michael S. Tsirkin <mst@redhat.com> commit 5d04cdb6377659237591e932086128b22e2047e8 Author: Michael S. Tsirkin <mst@redhat.com> Date: Sun Nov 29 12:44:43 2015 +0200 vring_bench: simple vring benchmark with poll support Integrate the benchmark that Rusty wrote. Add ring poll support. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> commit 7b064765a864030d8f65ad6d2a274b39591418f2 Author: Michael S. Tsirkin <mst@redhat.com> Date: Mon Nov 30 11:13:18 2015 +0200 virtio: skip avail/used index reads This adds a new vring feature bit: when enabled, host and guest poll the available/used ring directly instead of looking at the index field first. To guarantee it is possible to detect updates, the high bits (above vring.num - 1) in the ring head ID value are modified to match the index bits - these change on each wrap-around. Writer also XORs this with 0x8000 such that rings can be zero-initialized. Reader is modified to ignore these high bits when looking up descriptors. The point is to reduce the number of cacheline misses for both reads and writes. I see a performance improvement of about 20% on multithreaded benchmarks (e.g. virtio-test), but regression of about 2% on vring_bench. I think this has to do with the fact that complete_multi_user is implemented suboptimally. TODO: investigate single-threaded regression look at more aggressive ring layout changes better name for a feature flag split the patch to make it easier to review This is on top of the following patches in my tree: virtio_ring: Shadow available ring flags & index vhost: replace % with & on data path tools/virtio: fix byteswap logic tools/virtio: move list macro stubs Signed-off-by: Michael S. Tsirkin <mst@redhat.com> commit 8ce47633e652f2df54b7495a716bc4b5f4574da1 Author: Borislav Petkov <bp@suse.de> Date: Mon Nov 30 15:07:28 2015 +0100 EDAC: Make edac_device workqueue setup/teardown functions static They're not used anywhere else. Signed-off-by: Borislav Petkov <bp@suse.de> commit 4f2568f5cb475529aa2894adc7c7912517c83cb0 Author: Andreas Werner <andreas.werner@men.de> Date: Fri Dec 4 18:14:14 2015 +0100 ata/sata_fsl.c: add ATA_FLAG_NO_LOG_PAGE to blacklist the controller for log page reads Every attempt to issue a read log page command lockup the controller. The command is currently sent if the sata device includes the devlsp feature to read out the timing data. This attempt to read the data, locks up the controller and the device is not recognzied correctly (failed to set xfermode) and cannot be accessed. This was found on Freescale P1013/P1022 and T4240 CPUs using a ATP IG mSATA 4GB with the devslp feature. fsl-sata ff718000.sata: Sata FSL Platform/CSB Driver init [ 1.254195] scsi0 : sata_fsl [ 1.256004] ata1: SATA max UDMA/133 irq 74 [ 1.370666] fsl-gianfar ethernet.3: enabled errata workarounds, flags: 0x4 [ 1.470671] fsl-gianfar ethernet.4: enabled errata workarounds, flags: 0x4 [ 1.775584] ata1: Signature Update detected @ 504 msecs [ 1.947594] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [ 1.948366] ata1.00: ATA-8: ATP IG mSATA, 20150311, max UDMA/133 [ 1.948371] ata1.00: 7732368 sectors, multi 0: LBA [ 1.948843] ata1.00: failed to get Identify Device Data, Emask 0x1 [ 1.948857] ata1.00: failed to set xfermode (err_mask=0x40) [ 7.467557] ata1: Signature Update detected @ 504 msecs [ 7.639560] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [ 7.651320] ata1.00: failed to get Identify Device Data, Emask 0x1 [ 7.651360] ata1.00: failed to set xfermode (err_mask=0x40) [ 7.655628] ata1: limiting SATA link speed to 1.5 Gbps [ 7.659458] ata1.00: limiting speed to UDMA/133:PIO3 [ 13.163554] ata1: Signature Update detected @ 504 msecs [ 13.335558] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [ 13.347298] ata1.00: failed to get Identify Device Data, Emask 0x1 [ 13.347334] ata1.00: failed to set xfermode (err_mask=0x40) [ 13.351601] ata1.00: disabled [ 13.353278] ata1: exception Emask 0x50 SAct 0x0 SErr 0x800 action 0x6 frozen t4 [ 13.359281] ata1: SError: { HostInt } [ 13.361644] ata1: hard resetting link Signed-off-by: Andreas Werner <andreas.werner@men.de> Signed-off-by: Tejun Heo <tj@kernel.org> commit ea013a9b205b47b1fcbc72522146fad560af0712 Author: Andreas Werner <andreas.werner@men.de> Date: Fri Dec 4 18:12:49 2015 +0100 libata-eh.c: Introduce new ata port flag for controller which lockup on read log page Some controller lockup on a ata_read_log_page. Add new ata port flag ATA_FLAG_NO_LOG_PAGE which can used to blacklist a controller. If this flag is set, any attempt to read a log page returns an error without actually issuing the command. Signed-off-by: Andreas Werner <andreas.werner@men.de> Signed-off-by: Tejun Heo <tj@kernel.org> commit f2000f11fc88b8030962ea6aa6e06dfc3df8bb10 Author: Borislav Petkov <bp@suse.de> Date: Mon Nov 30 14:20:41 2015 +0100 EDAC: Remove edac_get_sysfs_subsys() error handling It cannot fail now. We either load EDAC core after having successfully initialized edac_subsys or we don't. Signed-off-by: Borislav Petkov <bp@suse.de> commit ab77579321508e4234213c71d0a187bca67e550b Author: Borislav Petkov <bp@suse.de> Date: Mon Nov 30 14:15:31 2015 +0100 EDAC: Unexport and make edac_subsys static ... and use the accessor instead. Signed-off-by: Borislav Petkov <bp@suse.de> commit f893180b79f6ada44068e4fe764eb2de70ee6bea Author: Dan Williams <dan.j.williams@intel.com> Date: Sat Dec 5 16:18:44 2015 -0800 ahci: compile out msi/msix infrastructure Quoting Arnd: The AHCI driver is used for some on-chip devices that do not use PCI for probing, and it can be built even when CONFIG_PCI is disabled, but that now results in a build failure: ata/libahci.c: In function 'ahci_host_activate_multi_irqs': ata/libahci.c:2475:4: error: invalid use of undefined type 'struct msix_entry' ata/libahci.c:2475:21: error: dereferencing pointer to incomplete type 'struct msix_entry' Add ifdef CONFIG_PCI_MSI infrastructure to compile out the multi-msi and multi-msix code. Reported-by: Arnd Bergmann <arnd@arndb.de> Tested--by: Arnd Bergmann <arnd@arndb.de> [arnd: fix up pci enabled case] Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com> Fixes: d684a90d38e2 ("ahci: per-port msix support") Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Tejun Heo <tj@kernel.org> commit 7e22c0024cf89404407f19955eab39b6d66de7b6 Author: Heiner Kallweit <hkallweit1@gmail.com> Date: Sun Dec 6 21:56:33 2015 +0100 ata: core: fix irq description on AHCI single irq systems On my machine with single irq AHCI just the PCI id is printed as description in /proc/interrupts. I found a related discussion from beginning of this year: http://www.gossamer-threads.com/lists/linux/kernel/2117335 Seems like 4f37b504768c ("libata: Use dev_name() for request_irq() to distinguish devices") tried to fix displaying a proper interrupt description for one scenario but broke it for another one. The mentioned discussion ended in the current situation being considered as broken but w/o a patch to fix it. The following patch is based on a proposal in this mail thread. Now the interrupt is properly described as: PCI-MSI 512000-edge ahci[0000:00:1f.2] By combining both values also the scenario that commit 4f37b504768c ("libata: Use dev_name() for request_irq() to distinguish devices") refers to should still be fine. There it should look like this now: ahci[20100000.ide] Using managed memory allocation ensures that the irq description lives at least as long as the interrupt. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> commit 8d8fcba6d1eabcb11ea0a6027d150a7f2cd0e019 Author: Borislav Petkov <bp@suse.de> Date: Fri Nov 27 11:40:43 2015 +0100 EDAC: Rip out the edac_subsys reference counting This was really dumb - reference counting for the main EDAC sysfs object. While we could've simply registered it as the first thing in the module init path and then hand it around to what needs it. Do that and rip out all the code around it, thus simplifying the whole handling significantly. Move the edac_subsys node back to edac_module.c. Signed-off-by: Borislav Petkov <bp@suse.de> commit 6d1a2adef782d26113d4f18a617ccb33c4774d54 Author: Alexey Brodkin <abrodkin@synopsys.com> Date: Mon Dec 7 14:21:37 2015 +0300 ARC: [axs10x] cap ethernet phy to 100 Mbit/sec Current ARC SDP boards cannot reliably handle 1Gbit Ethernet connections due to limitations in hardware. To make sure networking is stable on the board we're limiting phy to 100 Mbit. Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com> commit 4c835b57b8de88aef8446867701034128a8a3522 Author: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Date: Wed Nov 18 19:07:15 2015 +0100 fixdep: constify strrcmp arguments strrcmp only performs read access to the memory addressed by its arguments so make them const pointers. Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Signed-off-by: Michal Marek <mmarek@suse.com> commit b46ae2f3e2882fad71630a6b7c5ea23fa8bc9b84 Author: Borislav Petkov <bp@suse.de> Date: Fri Nov 27 10:38:38 2015 +0100 EDAC: Robustify workqueues destruction EDAC workqueue destruction is really fragile. We cancel delayed work but if it is still running and requeues itself, we still go ahead and destroy the workqueue and the queued work explodes when workqueue core attempts to run it. Make the destruction more robust by switching op_state to offline so that requeuing stops. Cancel any pending work *synchronously* too. EDAC i7core: Driver loaded. general protection fault: 0000 [#1] SMP CPU 12 Modules linked in: Supported: Yes Pid: 0, comm: kworker/0:1 Tainted: G IE 3.0.101-0-default #1 HP ProLiant DL380 G7 RIP: 0010:[<ffffffff8107dcd7>] [<ffffffff8107dcd7>] __queue_work+0x17/0x3f0 < ... regs ...> Process kworker/0:1 (pid: 0, threadinfo ffff88019def6000, task ffff88019def4600) Stack: ... Call Trace: call_timer_fn run_timer_softirq __do_softirq call_softirq do_softirq irq_exit smp_apic_timer_interrupt apic_timer_interrupt intel_idle cpuidle_idle_call cpu_idle Code: ... RIP __queue_work RSP <...> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> commit 138500d1da4ec2a24025891ddc345151189ece5e Author: Borislav Petkov <bp@suse.de> Date: Tue Dec 1 15:52:36 2015 +0100 EDAC, mc_sysfs: Fix freeing bus' name I get the splat below when modprobing/rmmoding EDAC drivers. It happens because bus->name is invalid after bus_unregister() has run. The Code: section below corresponds to: .loc 1 1108 0 movq 672(%rbx), %rax # mci_1(D)->bus, mci_1(D)->bus .loc 1 1109 0 popq %rbx # .loc 1 1108 0 movq (%rax), %rdi # _7->name, jmp kfree # and %rax has some funky stuff 2030203020312030 which looks a lot like something walked over it. Fix that by saving the name ptr before doing stuff to string it points to. general protection fault: 0000 [#1] SMP Modules linked in: ... CPU: 4 PID: 10318 Comm: modprobe Tainted: G I EN 3.12.51-11-default+ #48 Hardware name: HP ProLiant DL380 G7, BIOS P67 05/05/2011 task: ffff880311320280 ti: ffff88030da3e000 task.ti: ffff88030da3e000 RIP: 0010:[<ffffffffa019da92>] [<ffffffffa019da92>] edac_unregister_sysfs+0x22/0x30 [edac_core] RSP: 0018:ffff88030da3fe28 EFLAGS: 00010292 RAX: 2030203020312030 RBX: ffff880311b4e000 RCX: 000000000000095c RDX: 0000000000000001 RSI: ffff880327bb9600 RDI: 0000000000000286 RBP: ffff880311b4e750 R08: 0000000000000000 R09: ffffffff81296110 R10: 0000000000000400 R11: 0000000000000000 R12: ffff88030ba1ac68 R13: 0000000000000001 R14: 00000000011b02f0 R15: 0000000000000000 FS: 00007fc9bf8f5700(0000) GS:ffff8801a7c40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000403c90 CR3: 000000019ebdf000 CR4: 00000000000007e0 Stack: Call Trace: i7core_unregister_mci.isra.9 i7core_remove pci_device_remove __device_release_driver driver_detach bus_remove_driver pci_unregister_driver i7core_exit SyS_delete_module system_call_fastpath 0x7fc9bf426536 Code: 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 53 48 89 fb e8 52 2a 1f e1 48 8b bb a0 02 00 00 e8 46 59 1f e1 48 8b 83 a0 02 00 00 5b <48> 8b 38 e9 26 9a fe e0 66 0f 1f 44 00 00 66 66 66 66 90 48 8b RIP [<ffffffffa019da92>] edac_unregister_sysfs+0x22/0x30 [edac_core] RSP <ffff88030da3fe28> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: <stable@vger.kernel.org> # v3.6.. Fixes: 7a623c039075 ("edac: rewrite the sysfs code to use struct device") commit 02f6ff90400d055f08b0ba0b5f0707630b6faed7 Author: David Henningsson <david.henningsson@canonical.com> Date: Mon Dec 7 11:29:31 2015 +0100 ALSA: hda - Add inverted dmic for Packard Bell DOTS On the internal mic of the Packard Bell DOTS, one channel has an inverted signal. Add a quirk to fix this up. Cc: stable@vger.kernel.org BugLink: https://bugs.launchpad.net/bugs/1523232 Signed-off-by: David Henningsson <david.henningsson@canonical.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> commit 1b894521e60c1b91db1e8ba1278660e5c89f1b5f Author: Ilan Peer <ilan.peer@intel.com> Date: Sun Dec 6 21:19:15 2015 +0200 mac80211: handle HW ROC expired properly In case of HW ROC, when the driver reports that the ROC expired, it is not sufficient to purge the ROCs based on the remaining time, as it possible that the device finished the ROC session before the actual requested duration. To handle such cases, in case of ROC expired notification from the driver, complete all the ROCs which are marked with hw_begun, regardless of the remaining duration. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> commit 0d014ff344abc9c8e56cf1870ab3a144d2e2e37a Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Date: Thu Nov 19 16:07:17 2015 +0100 drm/i915: Remove double wait_for_vblank on broadwell. wait_vblank is already set in intel_plane_atomic_calc_changes for broadwell, waiting for a double vblank is overkill. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1447945645-32005-5-git-send-email-maarten.lankhorst@linux.intel.com commit b900111459e2f4a538697f75b63478f3a6acec3c Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Date: Thu Nov 19 16:07:16 2015 +0100 drm/i915/skl: Update watermarks before the crtc is disabled. On skylake some of the registers are only writable when the correct power wells are enabled. Because of this watermarks have to be updated before the crtc turns off, or you get unclaimed register read and write warnings. This patch needs to be modified slightly to apply to -fixes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92181 Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: stable@vger.kernel.org Cc: Matt Roper <matthew.d.roper@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1447945645-32005-4-git-send-email-maarten.lankhorst@linux.intel.com Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com> commit 526c0ab13672fa2ec0fddb7efa9a7797e11fddea Author: Damien Riegel <damien.riegel@savoirfairelinux.com> Date: Mon Nov 30 10:59:47 2015 -0500 mfd: syscon: Add a DT property to set value width Currently syscon has a fixed configuration of 32 bits for register and values widths. In some cases, it would be desirable to be able to customize the value width. For example, certain boards (like the ones manufactured by Technologic Systems) have a FPGA that is memory-mapped, but its registers are only 16-bit wide. This patch adds an optional "reg-io-width" DT binding for syscon that allows to change the width for the data bus (i.e. val_bits). If this property is provided, it will also set the register stride to reg-io-width's value. If not provided, the default configuration is used. Signed-off-by: Damien Riegel <damien.riegel@savoirfairelinux.com> Acked-by: Rob Herring <robh@kernel.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Lee Jones <lee.jones@linaro.org> commit 92826fcdfc147a7d16766e987c12a9dfe1860c3f Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Date: Thu Dec 3 13:49:13 2015 +0100 drm/i915: Calculate watermark related members in the crtc_state, v4. This removes pre/post_wm_update from intel_crtc->atomic, and creates atomic state for it in intel_crtc. Changes since v1: - Rebase on top of wm changes. Changes since v2: - Split disable_cxsr into a separate patch. Changes since v3: - Move some of the changes to intel_wm_need_update. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/56603A49.5000507@linux.intel.com Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> commit ab1d3a0e5a44f5b1a8d1f811e925c8519b56fba4 Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Date: Thu Nov 19 16:07:14 2015 +0100 drm/i915: Move disable_cxsr to the crtc_state. intel_crtc->atomic will be removed later on, move this member to intel_crtc_state. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1447945645-32005-2-git-send-email-maarten.lankhorst@linux.intel.com Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com> commit c10368897e104c008c610915a218f0fe5fa4ec96 Author: Ravindra Lokhande <rlokhande@nvidia.com> Date: Mon Dec 7 12:08:31 2015 +0530 ALSA: compress: add support for 32bit calls in a 64bit kernel Compress offload does not support ioctl calls from a 32bit userspace in a 64 bit kernel. This patch adds support for ioctls from a 32bit userspace in a 64bit kernel Signed-off-by: Ravindra Lokhande <rlokhande@nvidia.com> Acked-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> commit ee65ad0e2a9e5c1a61f86be9365c17f4fddb3818 Author: Philipp Zabel <p.zabel@pengutronix.de> Date: Wed Nov 18 18:12:25 2015 +0100 backlight: pwm_bl: Avoid backlight flicker when probed from DT If the driver is probed from the device tree, and there is a phandle property set on it, and the enable GPIO is already configured as output, and the backlight is currently disabled, keep it disabled. If all these conditions are met, assume there will be some other driver that can enable the backlight at the appropriate time. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Lee Jones <lee.jones@linaro.org> commit 7c23b7c1996597dd9d60bb282fb5fa1be6ebd18b Author: Lu, Han <han.lu@intel.com> Date: Mon Dec 7 15:59:13 2015 +0800 ALSA: hda - Fix playback noise with 24/32 bit sample size on BXT In BXT-P A0, HD-Audio DMA requests is later than expected, and makes an audio stream sensitive to system latencies when 24/32 bits are playing. Adjusting threshold of DMA fifo to force the DMA request sooner to improve latency tolerance at the expense of power. v2: move Intel specific code to hda_intel.c Signed-off-by: Lu, Han <han.lu@intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> commit a4d8a0fe4500b87817eebdb363c116922de87453 Author: Zeng Zhaoxiu <zhaoxiu.zeng@gmail.com> Date: Sun Dec 6 18:26:30 2015 +0800 i915: Replace "hweight8(dev_priv->info.subslice_7eu[i]) != 1" with "!is_power_of_2(dev_priv->info.subslice_7eu[i])" Signed-off-by: Zeng Zhaoxiu <zhaoxiu.zeng@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1449397590-14292-1-git-send-email-zhaoxiu.zeng@gmail.com commit f00be687f5fd2b626aa975dc4513cad525823c05 Author: Simon Horman <horms+renesas@verge.net.au> Date: Thu Nov 26 13:26:20 2015 +0900 ARM: shmobile: r8a7793: remove deprecated #gpio-range-cells Commit a1bc260bb5f5d9 ("gpio: clean up gpio-ranges documentation") declares the above property deprecated. That was more than 2 years ago. Remove it, so it doesn't get copied around needlessly. Based on similar work for the r8a7791 and r8a7794 by Wolfram Sang. Cc: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Simon Horman <horms+renesas@verge.net.au> Reported-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: Wolfram Sang <wsa+renesas@sang-engineering.com> commit 46ece349aa54f6e55b5f8d30bbecbaf2884ba869 Author: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Date: Thu Dec 3 01:23:03 2015 +0300 ARM: shmobile: r8a7791: add EtherAVB DT support Define the generic R8A7791 part of the EtherAVB device node. Based on the commit f25d6b977240 ("ARM: shmobile: r8a7790: add EtherAVB DT support"). Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: Simon Horman <horms+renesas@verge.net.au> commit eaa870b3055384092d8fc075bca3a3a819f73c43 Author: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Date: Thu Dec 3 01:21:49 2015 +0300 ARM: shmobile: r8a7791: add EtherAVB clock Add the EtherAVB clock to the R8A7791 device tree. Based on the commit 63d2d750c902 ("ARM: shmobile: r8a7790: add EtherAVB clocks"). Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: Simon Horman <horms+renesas@verge.net.au> commit 64874280889e7c0b2c9266705363627d4c92cf01 Author: Rainer Weikusat <rweikusat@mobileactivedefense.com> Date: Sun Dec 6 21:11:38 2015 +0000 af_unix: fix unix_dgram_recvmsg entry locking The current unix_dgram_recvsmg code acquires the u->readlock mutex in order to protect access to the peek offset prior to calling __skb_recv_datagram for actually receiving data. This implies that a blocking reader will go to sleep with this mutex held if there's presently no data to return to userspace. Two non-desirable side effects of this are that a later non-blocking read call on the same socket will block on the ->readlock mutex until the earlier blocking call releases it (or the readers is interrupted) and that later blocking read calls will wait longer than the effective socket read timeout says they should: The timeout will only start 'ticking' once such a reader hits the schedule_timeout in wait_for_more_packets (core.c) while the time it already had to wait until it could acquire the mutex is unaccounted for. The patch avoids both by using the __skb_try_recv_datagram and __skb_wait_for_more packets functions created by the first patch to implement a unix_dgram_recvmsg read loop which releases the readlock mutex prior to going to sleep and reacquires it as needed afterwards. Non-blocking readers will thus immediately return with -EAGAIN if there's no data available regardless of any concurrent blocking readers and all blocking readers will end up sleeping via schedule_timeout, thus honouring the configured socket receive timeout. Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com> Signed-off-by: David S. Miller <davem@davemloft.net> commit ea3793ee29d3621faf857fa8ef5425e9ff9a756d Author: Rainer Weikusat <rweikusat@mobileactivedefense.com> Date: Sun Dec 6 21:11:34 2015 +0000 core: enable more fine-grained datagram reception control The __skb_recv_datagram routine in core/ datagram.c provides a general skb reception factility supposed to be utilized by protocol modules providing datagram sockets. It encompasses both the actual recvmsg code and a surrounding 'sleep until data is available' loop. This is inconvenient if a protocol module has to use additional locking in order to maintain some per-socket state the generic datagram socket code is unaware of (as the af_unix code does). The patch below moves the recvmsg proper code into a new __skb_try_recv_datagram routine which doesn't sleep and renames wait_for_more_packets to __skb_wait_for_more_packets, both routines being exported interfaces. The original __skb_recv_datagram routine is reimplemented on top of these two functions such that its user-visible behaviour remains unchanged. Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com> Signed-off-by: David S. Miller <davem@davemloft.net> commit dc15c60e39a3092df845f9bfc29b9c40cc63934d Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Date: Sun Dec 6 20:20:14 2015 -0800 rcutorture: Don't keep empty console.log.diags files Currently, an error-free run produces an empty console.log.diags file. This can be annoying when using "vi */console.log.diags" to see a full summary of the errors. This commit therefore removes any empty files during the analysis process. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> commit 0485e1b35866693ff3ec975ca4c3c8ea8db63678 Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Date: Sun Dec 6 20:18:37 2015 -0800 rcutorture: Add checks for rcutorture writer starvation This commit adds checks for rcutorture writer starvation, so that instances will be added to the test summary. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> commit 7bf9ae016efc0cf08263fbee5ac708c23b90792e Author: Andrew Lunn <andrew@lunn.ch> Date: Mon Dec 7 04:38:58 2015 +0100 PHY: DP83867: Remove looking in parent device for OF properties Device tree properties for a phy device are expected to be in the phy node. The current code for the DP83867 also tries to look in the parent node. The devices binding documentation does not mention this, no current device tree file makes use of this, and it is not behaviour we want. So remove looking in the parent device. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> commit 404814af69d4732276319b90886b81fb2884ae1b Author: Bjørn Mork <bjorn@mork.no> Date: Sun Dec 6 22:47:15 2015 +0100 net: cdc_ncm: add "ndp_to_end" sysfs attribute Adding a writable sysfs attribute for the "NDP to end" quirk flag. This makes it easier for end users to test new devices for this firmware bug. We've been lucky so far, but we should not depend on reporters capable of rebuilding the driver. Cc: Enrico Mioso <mrkiko.rs@gmail.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> commit e57968a10bc1b3da6d2b8b0bdbbe4c5a43de2ad1 Author: Moni Shoua <monis@mellanox.com> Date: Sun Dec 6 18:07:43 2015 +0200 net/mlx4_core: Support the HA mode for SRIOV VFs too When the mlx4 driver runs in HA mode, and all VFs are single ported ones, we make their single port Highly-Available. This is done by taking advantage of the HA mode properties (following bonding changes with programming the port V2P map, etc) and adding the missing parts which are unique to SRIOV such as mirroring VF steering rules on both ports. Due to limits on the MAC and VLAN table this mode is enabled only when number of total VFs is under 64. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> commit f1b4e12a9ab6cc08b1a047bf021d90c7e07b1517 Author: Or Gerlitz <ogerlitz@mellanox.com> Date: Sun Dec 6 18:07:42 2015 +0200 IB/mlx4: Use the VF base-port when demuxing mad from wire Under HA mode, it's possible that the VF registered its GID (and expects to get mads through the PV scheme) on a port which is different from the one this mad arrived on, due to HA fail over. Therefore, if the gid is not matched on the port that the packet arrived on, check for a match on the other port if HA mode is active -- and if a match is found on the other port, continue processing the mad using that other port. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net> commit 5f61385d2ebc2bd62bc389c7da0d8d2f263be1eb Author: Moni Shoua <monis@mellanox.com> Date: Sun Dec 6 18:07:41 2015 +0200 net/mlx4_core: Keep VLAN/MAC tables mirrored in multifunc HA mode Due to HW limitations, indexes to MAC and VLAN tables are always taken from the table of the actual port. So, if a resource holds an index to a table, it may refer to different values during the lifetime of the resource, unless the tables are mirrored. Also, even when driver is not in HA mode the policy of allocating an index to these tables is such to make sure, as much as possible, that when the time comes the mirroring will be successful. This means that in multifunction mode the allocation of a free index in a port's table tries to make sure that the same index in the other's port table is also free. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> commit 78efed275117b189f06f8937798eab5cba53ed18 Author: Moni Shoua <monis@mellanox.com> Date: Sun Dec 6 18:07:40 2015 +0200 net/mlx4_core: Support mirroring VF DMFS rules on both ports Under HA mode, steering rules set by VFs should be mirrored on both ports of the device so packets will be accepted no matter on which port they arrived. Since getting into HA mode is done dynamically when the user bonds mlx4 Ethernet netdevs, we keep hold of the VF DMFS rule mbox with the port value flipped (1->2,2->1) and execute the mirroring when getting into HA mode. Later, when going out of HA mode, we unset the mirrored rules. In that context note that mirrored rules cannot be removed explicitly. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> commit 8d80d04a521d4acf949a449403040c38ec648f57 Author: Moni Shoua <monis@mellanox.com> Date: Sun Dec 6 18:07:39 2015 +0200 net/mlx4_core: Use both physical ports to dispatch link state events to VF Under HA mode, the link down event should be sent to VFs only if both ports are down. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> commit e34305c85f86112838cb332696513bed8e04a211 Author: Or Gerlitz <ogerlitz@mellanox.com> Date: Sun Dec 6 18:07:38 2015 +0200 net/mlx4_core: Use both physical ports to set the VF link state In HA mode, the link state for VFs for which the policy is "auto" (i.e. follow the physical link state) should be ORed from both ports. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net> commit 0d76d6e8b2507983a2cae4c09880798079007421 Author: Julia Lawall <julia.lawall@lip6.fr> Date: Sun Dec 6 06:56:23 2015 +0100 VSOCK: fix returnvar.cocci warnings Remove unneeded variable used to store return value. Generated by: scripts/coccinelle/misc/returnvar.cocci CC: Asias He <asias@redhat.com> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Julia Lawall <julia.lawall@lip6.fr> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> commit eeaef728978bc4da94d04d5e44bb81221279c418 Author: Loic Poulain <loic.poulain@intel.com> Date: Sun Dec 6 16:18:34 2015 +0100 Bluetooth: btintel: Create common Intel Version Read function The Intel Version Read command is used to retrieve information about hardware and firmware version/revision of Intel Bluetooth controllers. This is an Intel generic command used in USB and UART drivers. Signed-off-by: Loic Poulain <loic.poulain@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> commit 326fcfa5acca446b3f71e99f6d19881145556e5c Author: Felix Fietkau <nbd@openwrt.org> Date: Sat Dec 5 13:58:11 2015 +0100 net: remove unnecessary semicolon in netdev_alloc_pcpu_stats() This semicolon causes a build error if the function call is wrapped in parentheses. Fixes: aabc92bbe3cf ("net: add __netdev_alloc_pcpu_stats() to indicate gfp flags") Reported-by: Imre Kaloz <kaloz@openwrt.org> Signed-off-by: Felix Fietkau <nbd@openwrt.org> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net> commit 8a0d19c5ed417c78d03f4e0fa7215e58c40896d8 Author: lucien <lucien.xin@gmail.com> Date: Sat Dec 5 15:35:36 2015 +0800 sctp: start t5 timer only when peer rwnd is 0 and local state is SHUTDOWN_PENDING when A sends a data to B, then A close() and enter into SHUTDOWN_PENDING state, if B neither claim his rwnd is 0 nor send SACK for this data, A will keep retransmitting this data until t5 timeout, Max.Retrans times can't work anymore, which is bad. if B's rwnd is not 0, it should send abort after Max.Retrans times, only when B's rwnd == 0 and A's retransmitting beyonds Max.Retrans times, A will start t5 timer, which is also commit f8d960524328 ("sctp: Enforce retransmission limit during shutdown") means, but it lacks the condition peer rwnd == 0. so fix it by adding a bit (zero_window_announced) in peer to record if the last rwnd is 0. If it was, zero_window_announced will be set. and use this bit to decide if start t5 timer when local.state is SHUTDOWN_PENDING. Fixes: commit f8d960524328 ("sctp: Enforce retransmission limit during shutdown") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> commit 6c730080e663b1d629f8aa89348291fbcdc46cd9 Author: Bjørn Mork <bjorn@mork.no> Date: Sun Dec 6 21:25:50 2015 +0100 net: qmi_wwan: should hold RTNL while changing netdev type The notifier calls were thrown in as a last-minute fix for an imagined "this device could be part of a bridge" problem. That revealed a certain lack of locking. Not to mention testing... Avoid this splat: RTNL: assertion failed at net/core/dev.c (1639) CPU: 0 PID: 4293 Comm: bash Not tainted 4.4.0-rc3+ #358 Hardware name: LENOVO 2776LEG/2776LEG, BIOS 6EET55WW (3.15 ) 12/19/2011 0000000000000000 ffff8800ad253d60 ffffffff8122f7cf ffff8800ad253d98 ffff8800ad253d88 ffffffff813833ab 0000000000000002 ffff880230f48560 ffff880230a12900 ffff8800ad253da0 ffffffff813833da 0000000000000002 Call Trace: [<ffffffff8122f7cf>] dump_stack+0x4b/0x63 [<ffffffff813833ab>] call_netdevice_notifiers_info+0x3d/0x59 [<ffffffff813833da>] call_netdevice_notifiers+0x13/0x15 [<ffffffffa09be227>] raw_ip_store+0x81/0x193 [qmi_wwan] [<ffffffff8131e149>] dev_attr_store+0x20/0x22 [<ffffffff811d858b>] sysfs_kf_write+0x49/0x50 [<ffffffff811d8027>] kernfs_fop_write+0x10a/0x151 [<ffffffff8117249a>] __vfs_write+0x26/0xa5 [<ffffffff81085ed4>] ? percpu_down_read+0x53/0x7f [<ffffffff81174c9e>] ? __sb_start_write+0x5f/0xb0 [<ffffffff81174c9e>] ? __sb_start_write+0x5f/0xb0 [<ffffffff81172c37>] vfs_write+0xa3/0xe7 [<ffffffff811734ad>] SyS_write+0x50/0x7e [<ffffffff8145c517>] entry_SYSCALL_64_fastpath+0x12/0x6f Fixes: 32f7adf633b9 ("net: qmi_wwan: support "raw IP" mode") Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> commit dc085392a3a760b5788db368c1b2c116be08b201 Author: Inki Dae <inki.dae@samsung.com> Date: Thu Dec 3 14:35:23 2015 +0900 drm/exynos: dsi: modify a error type when getting a node failed This patch makes it to return -EINVAL instead of -ENXIO when getting a port or remote node failed. Signed-off-by: Inki Dae <inki.dae@samsung.com> Reviewed-by: Javier Martinez Canillas <javier@osg.samsung.com> commit 5e4500d8dba16d88b528cf037566b84747ec23f0 Author: Viresh Kumar <viresh.kumar@linaro.org> Date: Thu Dec 3 09:37:52 2015 +0530 cpufreq: governor: initialize/destroy timer_mutex with 'shared' timer_mutex is required to be initialized only while memory for 'shared' is allocated and in a similar way it is required to be destroyed only when memory for 'shared' is freed. There is no need to do the same every time we start/stop the governor. Move code to initialize/destroy timer_mutex to the relevant places. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> commit affde5d06af1e39c2929e36a063e3912f02fc58f Author: Viresh Kumar <viresh.kumar@linaro.org> Date: Thu Dec 3 09:37:51 2015 +0530 cpufreq: governor: Pass policy as argument to ->gov_dbs_timer() Pass 'policy' as argument to ->gov_dbs_timer() instead of cdbs and dbs_data. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> commit e68fe18c5b5442baca162ccf3b273326e6132a51 Author: Viresh Kumar <viresh.kumar@linaro.org> Date: Thu Dec 3 09:37:50 2015 +0530 cpufreq: ondemand: Work is guaranteed to be pending We are guaranteed to have works scheduled for policy->cpus, as the policy isn't stopped yet. And so there is no need to check that again. Drop it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> commit e128c864070055e062f6c90c64c03aad18452ac3 Author: Viresh Kumar <viresh.kumar@linaro.org> Date: Thu Dec 3 09:37:49 2015 +0530 cpufreq: ondemand: Update sampling rate only for concerned policies We are comparing policy->governor against cpufreq_gov_ondemand to make sure that we update sampling rate only for the concerned CPUs. But that isn't enough. In case of governor_per_policy, there can be multiple instances of ondemand governor and we will always end up updating all of them with current code. What we rather need to do, is to compare dbs_data with poilcy->governor_data, which will match only for the policies governed by dbs_data. This code is also racy as the governor might be getting stopped at that time and we may end up scheduling work for a policy, which we have just disabled. Fix that by protecting the entire function with &od_dbs_cdata.mutex, which will prevent against races with policy START/STOP/etc. After these locks are in place, we can safely get the policy via per-cpu dbs_info. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> commit 7b06a6d7bff563d82ddf8769617632f26793a83e Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Date: Sat Dec 5 01:54:47 2015 +0100 MAINTAINERS: Add an entry for the PM core Add a MAINTAINERS entry for the PM core with myself as the maintainer and linux-pm as the mailing list. This actually documents the current state of things. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit ce657b1cddf1f88c56ae683efa7130341c92808b Author: Russell King <rmk+kernel@arm.linux.org.uk> Date: Tue Nov 17 12:08:01 2015 +0000 component: add support for releasing match data The component helper treats the void match data pointer as an opaque object which needs no further management. When device nodes being passed, this is not true: the caller should pass its refcount to the component helper, and there should be a way to drop the refcount when the matching information is destroyed. This patch provides a per-match release method in addition to the match method to solve this issue. Rather than using component_match_add(), users should use component_match_add_release() which takes an additional function pointer for releasing this reference. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> commit ffc30b74fd6d01588bd3fdebc3b1acc0857e6fc8 Author: Russell King <rmk+kernel@arm.linux.org.uk> Date: Fri Apr 18 23:05:53 2014 +0100 component: track components via array rather than list Since we now have an array which defines each component, maintain the components to be bound in the array rather than a separate list. We also need duplicate tracking so we can eliminate multiple bind calls for the same component: we preserve the list-based component order in that the first match which adds the component determines its position. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> commit 29f1c7fd61a31e0335ce41d4b2788959ad7c468d Author: Russell King <rmk+kernel@arm.linux.org.uk> Date: Wed Apr 23 10:46:11 2014 +0100 component: move check for unbound master into try_to_bring_up_masters() Clean up the code a little; we don't need to check that the master is unbound for every invocation of try_to_bring_up_master(), so let's move it to where it's really needed - try_to_bring_up_masters(), where we may encounter already bound masters. Reviewed-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> commit fae9e2e07af07baabb8c26a31b3f7d8fdf89809e Author: Russell King <rmk+kernel@arm.linux.org.uk> Date: Fri Apr 18 22:10:32 2014 +0100 component: remove old add_components method Now that drivers create an array of component matches at probe time, we can retire the old methods. This involves removing the add_components master method, and removing component_master_add_child() from public view. We also remove component_add_master() as that interface is no longer useful. Acked-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> commit 072151bb01547816d82ec06565de5559d679fbeb Author: Jordan Hargrave <jharg93@gmail.com> Date: Mon Dec 7 10:07:15 2015 +1100 firmware: dmi_scan: Save SMBIOS Type 9 System Slots Save SMBIOS Type 9 System Slots during DMI scan. PCI address of onboard devices was already saved but not for slots. Signed-off-by: Jordan Hargrave <jordan_hargrave@dell.com> Signed-off-by: Jean Delvare <jdelvare@suse.de> commit c625c7eb2eb48f21aaab5bc762e063f770eee479 Author: Jean Delvare <jdelvare@suse.de> Date: Mon Dec 7 10:07:15 2015 +1100 firmware: dmi_scan: Fix dmi_find_device description The description of dmi_find_device was apparently copied from a similar function in a different subsystem, but the parameter names were not adjusted as needed. Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: Andrey Panin <pazke@donpac.ru> commit d40ad8376edb951737111330e8fba087e5c1d831 Author: Jean Delvare <jdelvare@suse.de> Date: Mon Dec 7 10:07:15 2015 +1100 firmware: dmi_scan: Clarify dmi_save_extended_devices Get rid of the arbitrary 5-byte pointer offset, it served no purpose and made it harder to match the code with the SMBIOS specification. Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: Jordan Hargrave <jordan_hargrave@dell.com> Cc: Narendra K <narendra_k@dell.com> commit af94859f09787d2b782743812c61032ddfd6f987 Author: Jean Delvare <jdelvare@suse.de> Date: Mon Dec 7 10:07:14 2015 +1100 firmware: dmi_scan: Optimize dmi_save_extended_devices Calling dmi_string_nosave isn't cheap, so avoid calling it twice in a row for the same string. Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: Jordan Hargrave <jordan_hargrave@dell.com> Cc: Narendra K <narendra_k@dell.com> commit 54046148cf6826a5a3f61e571b131e0224580d53 Author: Helge Deller <deller@gmx.de> Date: Sun Dec 6 21:56:26 2015 +0100 parisc: Wire up mlock2 syscall Signed-off-by: Helge Deller <deller@gmx.de> commit 73c5e2661b712bb63408555037e6ac38b39e04dc Author: Bjorn Helgaas <bhelgaas@google.com> Date: Tue Dec 1 10:41:47 2015 -0600 parisc: Remove unused pcibios_init_bus() There are no callers of pcibios_init_bus(), so remove it. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Helge Deller <deller@gmx.de> commit f316cb0a60c29cb00a42d6da8d42c48fe067901b Author: Mikulas Patocka <mpatocka@redhat.com> Date: Mon Nov 30 14:47:46 2015 -0500 parisc iommu: fix panic due to trying to allocate too large region When using the Promise TX2+ SATA controller on PA-RISC, the system often crashes with kernel panic, for example just writing data with the dd utility will make it crash. Kernel panic - not syncing: drivers/parisc/sba_iommu.c: I/O MMU @ 000000000000a000 is out of mapping resources CPU: 0 PID: 18442 Comm: mkspadfs Not tainted 4.4.0-rc2 #2 Backtrace: [<000000004021497c>] show_stack+0x14/0x20 [<0000000040410bf0>] dump_stack+0x88/0x100 [<000000004023978c>] panic+0x124/0x360 [<0000000040452c18>] sba_alloc_range+0x698/0x6a0 [<0000000040453150>] sba_map_sg+0x260/0x5b8 [<000000000c18dbb4>] ata_qc_issue+0x264/0x4a8 [libata] [<000000000c19535c>] ata_scsi_translate+0xe4/0x220 [libata] [<000000000c19a93c>] ata_scsi_queuecmd+0xbc/0x320 [libata] [<0000000040499bbc>] scsi_dispatch_cmd+0xfc/0x130 [<000000004049da34>] scsi_request_fn+0x6e4/0x970 [<00000000403e95a8>] __blk_run_queue+0x40/0x60 [<00000000403e9d8c>] blk_run_queue+0x3c/0x68 [<000000004049a534>] scsi_run_queue+0x2a4/0x360 [<000000004049be68>] scsi_end_request+0x1a8/0x238 [<000000004049de84>] scsi_io_completion+0xfc/0x688 [<0000000040493c74>] scsi_finish_command+0x17c/0x1d0 The cause of the crash is not exhaustion of the IOMMU space, there is plenty of free pages. The function sba_alloc_range is called with size 0x11000, thus the pages_needed variable is 0x11. The function sba_search_bitmap is called with bits_wanted 0x11 and boundary size is 0x10 (because dma_get_seg_boundary(dev) returns 0xffff). The function sba_search_bitmap attempts to allocate 17 pages that must not cross 16-page boundary - it can't satisfy this requirement (iommu_is_span_boundary always returns true) and fails even if there are many free entries in the IOMMU space. How did it happen that we try to allocate 17 pages that don't cross 16-page boundary? The cause is in the function iommu_coalesce_chunks. This function tries to coalesce adjacent entries in the scatterlist. The function does several checks if it may coalesce one entry with the next, one of those checks is this: if (startsg->length + dma_len > max_seg_size) break; When it finishes coalescing adjacent entries, it allocates the mapping: sg_dma_len(contig_sg) = dma_len; dma_len = ALIGN(dma_len + dma_offset, IOVP_SIZE); sg_dma_address(contig_sg) = PIDE_FLAG | (iommu_alloc_range(ioc, dev, dma_len) << IOVP_SHIFT) | dma_offset; It is possible that (startsg->length + dma_len > max_seg_size) is false (we are just near the 0x10000 max_seg_size boundary), so the funcion decides to coalesce this entry with the next entry. When the coalescing succeeds, the function performs dma_len = ALIGN(dma_len + dma_offset, IOVP_SIZE); And now, because of non-zero dma_offset, dma_len is greater than 0x10000. iommu_alloc_range (a pointer to sba_alloc_range) is called and it attempts to allocate 17 pages for a device that must not cross 16-page boundary. To fix the bug, we must make sure that dma_len after addition of dma_offset and alignment doesn't cross the segment boundary. I.e. change if (startsg->length + dma_len > max_seg_size) break; to if (ALIGN(dma_len + dma_offset + startsg->length, IOVP_SIZE) > max_seg_size) break; This patch makes this change (it precalculates max_seg_boundary at the beginning of the function iommu_coalesce_chunks). I also added a check that the mapping length doesn't exceed dma_get_seg_boundary(dev) (it is not needed for Promise TX2+ SATA, but it may be needed for other devices that have dma_get_seg_boundary lower than dma_get_max_seg_size). Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Helge Deller <deller@gmx.de> commit 9fa547b2b7c37507fb04c93fa2d0ccddf7d4d234 Author: Helge Deller <deller@gmx.de> Date: Thu Nov 26 21:14:02 2015 +0100 parisc: protect huge pte changes with spinlocks Protect all changes of huge page pte entries with purge_tlb_start() and purge_tlb_end() spinlocks. Signed-off-by: Helge Deller <deller@gmx.de> commit 63424eeaab1e61af0c3eecc0f85fb9ee4877f219 Author: Helge Deller <deller@gmx.de> Date: Wed Nov 25 15:39:38 2015 +0100 parisc: Disable tlb flush optimization with huge pages It seems calling flush_tlb_all() doesn't reliable flush the tlb on all CPUs. Disable it when used with huge pages. Signed-off-by: Helge Deller <deller@gmx.de> commit d17fad0a588f399dcda8c6c0c77be0527d85fdd5 Author: Helge Deller <deller@gmx.de> Date: Sun Dec 6 21:25:20 2015 +0100 parisc: Disable huge pages on Mako machines Mako-based machines (PA8800 and PA8900 CPUs) don't allow aliasing on non-equaivalent addresses. Signed-off-by: Helge Deller <deller@gmx.de> commit 7fdc06ab013ec847cd25a66208f403e124e6371b Author: Masahiro Yamada <yamada.masahiro@socionext.com> Date: Tue Nov 24 22:10:26 2015 +0900 of/irq: optimize device node matching loop in of_irq_init() Currently, of_irq_init() iterates over interrupt controller nodes with for_each_matching_node(), and then gets each init function with of_match_node() later. This routine can be optimized with for_each_matching_node_and_match(). It allows to get the interrupt controller node and its init function at the same time, saving __of_match_node() callings. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Rob Herring <robh@kernel.org> commit 5e2bfc010f63b3ede1746c53052064fa8fcd7622 Author: Liviu Dudau <Liviu.Dudau@arm.com> Date: Wed Dec 2 11:35:39 2015 +0000 dt-bindings: tda998x: Document the required 'port' node. All the users of the tda998x driver are component based and bind the driver via the device graph method described in Documentation/devicetree/bindings/graph.txt. Add the fact that the 'port' node is required to the bindings. Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> commit 8b570dc9f7b634e853866ce40097c0342ac5bb81 Author: lucien <lucien.xin@gmail.com> Date: Sat Dec 5 15:19:27 2015 +0800 sctp: only drop the reference on the datamsg after sending a msg If the chunks are enqueued successfully but sctp_cmd_interpreter() return err to sctp_sendmsg() (mainly because of no mem), the chunks will get re-queued, but we are dropping the reference and freeing them. The fix is to just drop the reference on the datamsg just as it had succeeded, as: - if the chunks weren't queued, this is enough to get them freed. - if they were queued, they will get freed when they finally get out or discarded. Signed-off-by: Xin Long <lucien.xin@gmail.com> Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> commit 69b5777f2e5779bb987d4a25a33401d5ac257c14 Author: lucien <lucien.xin@gmail.com> Date: Sat Dec 5 15:15:17 2015 +0800 sctp: hold the chunks only after the chunk is enqueued in outq When a msg is sent, sctp will hold the chunks of this msg and then try to enqueue them. But if the chunks are not enqueued in sctp_outq_tail() because of the invalid state, sctp_cmd_interpreter() may still return success to sctp_sendmsg() after calling sctp_outq_flush(), these chunks will become orphans and will leak. So we fix them by moving sctp_chunk_hold() to sctp_outq_tail(), where we are sure that the chunk is going to get queued. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> commit aa1c25666a0c5a37073e06c8308ae93916b1e6df Author: Guenter Roeck <linux@roeck-us.net> Date:…

I get the splat below when modprobing/rmmoding EDAC drivers. It happens because bus->name is invalid after bus_unregister() has run. The Code: section below corresponds to: .loc 1 1108 0 movq 672(%rbx), %rax # mci_1(D)->bus, mci_1(D)->bus .loc 1 1109 0 popq %rbx # .loc 1 1108 0 movq (%rax), %rdi # _7->name, jmp kfree # and %rax has some funky stuff 2030203020312030 which looks a lot like something walked over it. Fix that by saving the name ptr before doing stuff to string it points to. general protection fault: 0000 [#1] SMP Modules linked in: ... CPU: 4 PID: 10318 Comm: modprobe Tainted: G I EN 3.12.51-11-default+ torvalds#48 Hardware name: HP ProLiant DL380 G7, BIOS P67 05/05/2011 task: ffff880311320280 ti: ffff88030da3e000 task.ti: ffff88030da3e000 RIP: 0010:[<ffffffffa019da92>] [<ffffffffa019da92>] edac_unregister_sysfs+0x22/0x30 [edac_core] RSP: 0018:ffff88030da3fe28 EFLAGS: 00010292 RAX: 2030203020312030 RBX: ffff880311b4e000 RCX: 000000000000095c RDX: 0000000000000001 RSI: ffff880327bb9600 RDI: 0000000000000286 RBP: ffff880311b4e750 R08: 0000000000000000 R09: ffffffff81296110 R10: 0000000000000400 R11: 0000000000000000 R12: ffff88030ba1ac68 R13: 0000000000000001 R14: 00000000011b02f0 R15: 0000000000000000 FS: 00007fc9bf8f5700(0000) GS:ffff8801a7c40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000403c90 CR3: 000000019ebdf000 CR4: 00000000000007e0 Stack: Call Trace: i7core_unregister_mci.isra.9 i7core_remove pci_device_remove __device_release_driver driver_detach bus_remove_driver pci_unregister_driver i7core_exit SyS_delete_module system_call_fastpath 0x7fc9bf426536 Code: 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 53 48 89 fb e8 52 2a 1f e1 48 8b bb a0 02 00 00 e8 46 59 1f e1 48 8b 83 a0 02 00 00 5b <48> 8b 38 e9 26 9a fe e0 66 0f 1f 44 00 00 66 66 66 66 90 48 8b RIP [<ffffffffa019da92>] edac_unregister_sysfs+0x22/0x30 [edac_core] RSP <ffff88030da3fe28> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: <stable@vger.kernel.org> # v3.6.. Fixes: 7a623c0 ("edac: rewrite the sysfs code to use struct device")

This driver registers for extcon events as part of its probe, but never unregisters them in case of error in the probe path. There were multiple issues noticed due to this missing error handling. One of them is random crashes if the regulators are not ready yet by the time probe is invoked. Ivan's previous attempt [1] to fix this issue, did not really address all the failure cases like regualtor failures. [1] https://lkml.org/lkml/2015/9/7/62 Without this patch the kernel would carsh with log: ... Unable to handle kernel paging request at virtual address 17d78410 pgd = ffffffc001a5c000 [17d78410] *pgd=00000000b6806003, *pud=00000000b6806003, *pmd=0000000000000000 Internal error: Oops: 96000005 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 6 Comm: kworker/u8:0 Not tainted 4.4.0+ torvalds#48 Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT) Workqueue: deferwq deferred_probe_work_func task: ffffffc03686e900 ti: ffffffc0368b0000 task.ti: ffffffc0368b0000 PC is at raw_notifier_chain_register+0x1c/0x44 LR is at extcon_register_notifier+0x88/0xc8 pc : [<ffffffc0000da43c>] lr : [<ffffffc000606298>] pstate: 80000085 sp : ffffffc0368b3a70 x29: ffffffc0368b3a70 x28: ffffffc03680c310 x27: ffffffc035518000 x26: ffffffc035518000 x25: ffffffc03bfa20e0 x24: ffffffc035580a18 x23: 0000000000000000 x22: ffffffc035518458 x21: ffffffc0355e9a60 x20: ffffffc035518000 x19: 0000000000000000 x18: 0000000000000028 x17: 0000000000000003 x16: ffffffc0018153c8 x15: 0000000000000001 x14: ffffffc03686f0f8 x13: ffffffc03686f0f8 x12: 0000000000000003 x11: 0000000000000001 x10: 0000000000000001 x9 : ffffffc03686f0f8 x8 : 0000e3872014c1a1 x7 : 0000000000000028 x6 : 0000000000000000 x5 : 0000000000000001 x4 : 0000000000000000 x3 : 00000000354fb170 x2 : 0000000017d78400 x1 : ffffffc0355e9a60 x0 : ffffffc0354fb268 Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>

This driver registers for extcon events as part of its probe, but never unregisters them in case of error in the probe path. There were multiple issues noticed due to this missing error handling. One of them is random crashes if the regulators are not ready yet by the time probe is invoked. Ivan's previous attempt [1] to fix this issue, did not really address all the failure cases like regualtor/get_irq failures. [1] https://lkml.org/lkml/2015/9/7/62 Without this patch the kernel would carsh with log: ... Unable to handle kernel paging request at virtual address 17d78410 pgd = ffffffc001a5c000 [17d78410] *pgd=00000000b6806003, *pud=00000000b6806003, *pmd=0000000000000000 Internal error: Oops: 96000005 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 6 Comm: kworker/u8:0 Not tainted 4.4.0+ torvalds#48 Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT) Workqueue: deferwq deferred_probe_work_func task: ffffffc03686e900 ti: ffffffc0368b0000 task.ti: ffffffc0368b0000 PC is at raw_notifier_chain_register+0x1c/0x44 LR is at extcon_register_notifier+0x88/0xc8 pc : [<ffffffc0000da43c>] lr : [<ffffffc000606298>] pstate: 80000085 sp : ffffffc0368b3a70 x29: ffffffc0368b3a70 x28: ffffffc03680c310 x27: ffffffc035518000 x26: ffffffc035518000 x25: ffffffc03bfa20e0 x24: ffffffc035580a18 x23: 0000000000000000 x22: ffffffc035518458 x21: ffffffc0355e9a60 x20: ffffffc035518000 x19: 0000000000000000 x18: 0000000000000028 x17: 0000000000000003 x16: ffffffc0018153c8 x15: 0000000000000001 x14: ffffffc03686f0f8 x13: ffffffc03686f0f8 x12: 0000000000000003 x11: 0000000000000001 x10: 0000000000000001 x9 : ffffffc03686f0f8 x8 : 0000e3872014c1a1 x7 : 0000000000000028 x6 : 0000000000000000 x5 : 0000000000000001 x4 : 0000000000000000 x3 : 00000000354fb170 x2 : 0000000017d78400 x1 : ffffffc0355e9a60 x0 : ffffffc0354fb268 Fixes: 591fc11 ("usb: phy: msm: Use extcon framework for VBUS and ID detection") CC: Stable <stable@vger.kernel.org> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>

the returned buffer of register_sysctl() is stored into net_header variable, but net_header is not used after, and compiler maybe optimise the variable out, and lead kmemleak reported the below warning comm "swapper/0", pid 1, jiffies 4294937448 (age 267.270s) hex dump (first 32 bytes): 90 38 8b 01 c0 ff ff ff 00 00 00 00 01 00 00 00 .8.............. 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffc00020f134>] create_object+0x10c/0x2a0 [<ffffffc00070ff44>] kmemleak_alloc+0x54/0xa0 [<ffffffc0001fe378>] __kmalloc+0x1f8/0x4f8 [<ffffffc00028e984>] __register_sysctl_table+0x64/0x5a0 [<ffffffc00028eef0>] register_sysctl+0x30/0x40 [<ffffffc00099c304>] net_sysctl_init+0x20/0x58 [<ffffffc000994dd8>] sock_init+0x10/0xb0 [<ffffffc0000842e0>] do_one_initcall+0x90/0x1b8 [<ffffffc000966bac>] kernel_init_freeable+0x218/0x2f0 [<ffffffc00070ed6c>] kernel_init+0x1c/0xe8 [<ffffffc000083bfc>] ret_from_fork+0xc/0x50 [<ffffffffffffffff>] 0xffffffffffffffff <<end check kmemleak>> Before fix, the objdump result on ARM64: 0000000000000000 <net_sysctl_init>: 0: a9be7bfd stp x29, x30, [sp,#-32]! 4: 90000001 adrp x1, 0 <net_sysctl_init> 8: 90000000 adrp x0, 0 <net_sysctl_init> c: 910003fd mov x29, sp 10: 91000021 add x1, x1, #0x0 14: 91000000 add x0, x0, #0x0 18: a90153f3 stp x19, x20, [sp,openbmc#16] 1c: 12800174 mov w20, #0xfffffff4 // #-12 20: 94000000 bl 0 <register_sysctl> 24: b4000120 cbz x0, 48 <net_sysctl_init+0x48> 28: 90000013 adrp x19, 0 <net_sysctl_init> 2c: 91000273 add x19, x19, #0x0 30: 9101a260 add x0, x19, #0x68 34: 94000000 bl 0 <register_pernet_subsys> 38: 2a0003f4 mov w20, w0 3c: 35000060 cbnz w0, 48 <net_sysctl_init+0x48> 40: aa1303e0 mov x0, x19 44: 94000000 bl 0 <register_sysctl_root> 48: 2a1403e0 mov w0, w20 4c: a94153f3 ldp x19, x20, [sp,openbmc#16] 50: a8c27bfd ldp x29, x30, [sp],openbmc#32 54: d65f03c0 ret After: 0000000000000000 <net_sysctl_init>: 0: a9bd7bfd stp x29, x30, [sp,#-48]! 4: 90000000 adrp x0, 0 <net_sysctl_init> 8: 910003fd mov x29, sp c: a90153f3 stp x19, x20, [sp,openbmc#16] 10: 90000013 adrp x19, 0 <net_sysctl_init> 14: 91000000 add x0, x0, #0x0 18: 91000273 add x19, x19, #0x0 1c: f90013f5 str x21, [sp,openbmc#32] 20: aa1303e1 mov x1, x19 24: 12800175 mov w21, #0xfffffff4 // #-12 28: 94000000 bl 0 <register_sysctl> 2c: f9002260 str x0, [x19,openbmc#64] 30: b40001a0 cbz x0, 64 <net_sysctl_init+0x64> 34: 90000014 adrp x20, 0 <net_sysctl_init> 38: 91000294 add x20, x20, #0x0 3c: 9101a280 add x0, x20, #0x68 40: 94000000 bl 0 <register_pernet_subsys> 44: 2a0003f5 mov w21, w0 48: 35000080 cbnz w0, 58 <net_sysctl_init+0x58> 4c: aa1403e0 mov x0, x20 50: 94000000 bl 0 <register_sysctl_root> 54: 14000004 b 64 <net_sysctl_init+0x64> 58: f9402260 ldr x0, [x19,openbmc#64] 5c: 94000000 bl 0 <unregister_sysctl_table> 60: f900227f str xzr, [x19,openbmc#64] 64: 2a1503e0 mov w0, w21 68: f94013f5 ldr x21, [sp,openbmc#32] 6c: a94153f3 ldp x19, x20, [sp,openbmc#16] 70: a8c37bfd ldp x29, x30, [sp],openbmc#48 74: d65f03c0 ret Add the possible error handle to free the net_header to remove the kmemleak warning Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>

This driver registers for extcon events as part of its probe, but never unregisters them in case of error in the probe path. There were multiple issues noticed due to this missing error handling. One of them is random crashes if the regulators are not ready yet by the time probe is invoked. Ivan's previous attempt [1] to fix this issue, did not really address all the failure cases like regualtor/get_irq failures. [1] https://lkml.org/lkml/2015/9/7/62 Without this patch the kernel would carsh with log: ... Unable to handle kernel paging request at virtual address 17d78410 pgd = ffffffc001a5c000 [17d78410] *pgd=00000000b6806003, *pud=00000000b6806003, *pmd=0000000000000000 Internal error: Oops: 96000005 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 6 Comm: kworker/u8:0 Not tainted 4.4.0+ torvalds#48 Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT) Workqueue: deferwq deferred_probe_work_func task: ffffffc03686e900 ti: ffffffc0368b0000 task.ti: ffffffc0368b0000 PC is at raw_notifier_chain_register+0x1c/0x44 LR is at extcon_register_notifier+0x88/0xc8 pc : [<ffffffc0000da43c>] lr : [<ffffffc000606298>] pstate: 80000085 sp : ffffffc0368b3a70 x29: ffffffc0368b3a70 x28: ffffffc03680c310 x27: ffffffc035518000 x26: ffffffc035518000 x25: ffffffc03bfa20e0 x24: ffffffc035580a18 x23: 0000000000000000 x22: ffffffc035518458 x21: ffffffc0355e9a60 x20: ffffffc035518000 x19: 0000000000000000 x18: 0000000000000028 x17: 0000000000000003 x16: ffffffc0018153c8 x15: 0000000000000001 x14: ffffffc03686f0f8 x13: ffffffc03686f0f8 x12: 0000000000000003 x11: 0000000000000001 x10: 0000000000000001 x9 : ffffffc03686f0f8 x8 : 0000e3872014c1a1 x7 : 0000000000000028 x6 : 0000000000000000 x5 : 0000000000000001 x4 : 0000000000000000 x3 : 00000000354fb170 x2 : 0000000017d78400 x1 : ffffffc0355e9a60 x0 : ffffffc0354fb268 Fixes: 591fc11 ("usb: phy: msm: Use extcon framework for VBUS and ID detection") CC: Stable <stable@vger.kernel.org> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Felipe Balbi <balbi@kernel.org>

This driver registers for extcon events as part of its probe, but never unregisters them in case of error in the probe path. There were multiple issues noticed due to this missing error handling. One of them is random crashes if the regulators are not ready yet by the time probe is invoked. Ivan's previous attempt [1] to fix this issue, did not really address all the failure cases like regualtor failures. [1] https://lkml.org/lkml/2015/9/7/62 Without this patch the kernel would carsh with log: ... Unable to handle kernel paging request at virtual address 17d78410 pgd = ffffffc001a5c000 [17d78410] *pgd=00000000b6806003, *pud=00000000b6806003, *pmd=0000000000000000 Internal error: Oops: 96000005 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 6 Comm: kworker/u8:0 Not tainted 4.4.0+ torvalds#48 Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT) Workqueue: deferwq deferred_probe_work_func task: ffffffc03686e900 ti: ffffffc0368b0000 task.ti: ffffffc0368b0000 PC is at raw_notifier_chain_register+0x1c/0x44 LR is at extcon_register_notifier+0x88/0xc8 pc : [<ffffffc0000da43c>] lr : [<ffffffc000606298>] pstate: 80000085 sp : ffffffc0368b3a70 x29: ffffffc0368b3a70 x28: ffffffc03680c310 x27: ffffffc035518000 x26: ffffffc035518000 x25: ffffffc03bfa20e0 x24: ffffffc035580a18 x23: 0000000000000000 x22: ffffffc035518458 x21: ffffffc0355e9a60 x20: ffffffc035518000 x19: 0000000000000000 x18: 0000000000000028 x17: 0000000000000003 x16: ffffffc0018153c8 x15: 0000000000000001 x14: ffffffc03686f0f8 x13: ffffffc03686f0f8 x12: 0000000000000003 x11: 0000000000000001 x10: 0000000000000001 x9 : ffffffc03686f0f8 x8 : 0000e3872014c1a1 x7 : 0000000000000028 x6 : 0000000000000000 x5 : 0000000000000001 x4 : 0000000000000000 x3 : 00000000354fb170 x2 : 0000000017d78400 x1 : ffffffc0355e9a60 x0 : ffffffc0354fb268 Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>

[ Upstream commit 12e2696 ] I get the splat below when modprobing/rmmoding EDAC drivers. It happens because bus->name is invalid after bus_unregister() has run. The Code: section below corresponds to: .loc 1 1108 0 movq 672(%rbx), %rax # mci_1(D)->bus, mci_1(D)->bus .loc 1 1109 0 popq %rbx # .loc 1 1108 0 movq (%rax), %rdi # _7->name, jmp kfree # and %rax has some funky stuff 2030203020312030 which looks a lot like something walked over it. Fix that by saving the name ptr before doing stuff to string it points to. general protection fault: 0000 [#1] SMP Modules linked in: ... CPU: 4 PID: 10318 Comm: modprobe Tainted: G I EN 3.12.51-11-default+ torvalds#48 Hardware name: HP ProLiant DL380 G7, BIOS P67 05/05/2011 task: ffff880311320280 ti: ffff88030da3e000 task.ti: ffff88030da3e000 RIP: 0010:[<ffffffffa019da92>] [<ffffffffa019da92>] edac_unregister_sysfs+0x22/0x30 [edac_core] RSP: 0018:ffff88030da3fe28 EFLAGS: 00010292 RAX: 2030203020312030 RBX: ffff880311b4e000 RCX: 000000000000095c RDX: 0000000000000001 RSI: ffff880327bb9600 RDI: 0000000000000286 RBP: ffff880311b4e750 R08: 0000000000000000 R09: ffffffff81296110 R10: 0000000000000400 R11: 0000000000000000 R12: ffff88030ba1ac68 R13: 0000000000000001 R14: 00000000011b02f0 R15: 0000000000000000 FS: 00007fc9bf8f5700(0000) GS:ffff8801a7c40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000403c90 CR3: 000000019ebdf000 CR4: 00000000000007e0 Stack: Call Trace: i7core_unregister_mci.isra.9 i7core_remove pci_device_remove __device_release_driver driver_detach bus_remove_driver pci_unregister_driver i7core_exit SyS_delete_module system_call_fastpath 0x7fc9bf426536 Code: 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 53 48 89 fb e8 52 2a 1f e1 48 8b bb a0 02 00 00 e8 46 59 1f e1 48 8b 83 a0 02 00 00 5b <48> 8b 38 e9 26 9a fe e0 66 0f 1f 44 00 00 66 66 66 66 90 48 8b RIP [<ffffffffa019da92>] edac_unregister_sysfs+0x22/0x30 [edac_core] RSP <ffff88030da3fe28> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: <stable@vger.kernel.org> # v3.6.. Fixes: 7a623c0 ("edac: rewrite the sysfs code to use struct device") Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

commit a38a08d upstream. This driver registers for extcon events as part of its probe, but never unregisters them in case of error in the probe path. There were multiple issues noticed due to this missing error handling. One of them is random crashes if the regulators are not ready yet by the time probe is invoked. Ivan's previous attempt [1] to fix this issue, did not really address all the failure cases like regualtor/get_irq failures. [1] https://lkml.org/lkml/2015/9/7/62 Without this patch the kernel would carsh with log: ... Unable to handle kernel paging request at virtual address 17d78410 pgd = ffffffc001a5c000 [17d78410] *pgd=00000000b6806003, *pud=00000000b6806003, *pmd=0000000000000000 Internal error: Oops: 96000005 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 6 Comm: kworker/u8:0 Not tainted 4.4.0+ torvalds#48 Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT) Workqueue: deferwq deferred_probe_work_func task: ffffffc03686e900 ti: ffffffc0368b0000 task.ti: ffffffc0368b0000 PC is at raw_notifier_chain_register+0x1c/0x44 LR is at extcon_register_notifier+0x88/0xc8 pc : [<ffffffc0000da43c>] lr : [<ffffffc000606298>] pstate: 80000085 sp : ffffffc0368b3a70 x29: ffffffc0368b3a70 x28: ffffffc03680c310 x27: ffffffc035518000 x26: ffffffc035518000 x25: ffffffc03bfa20e0 x24: ffffffc035580a18 x23: 0000000000000000 x22: ffffffc035518458 x21: ffffffc0355e9a60 x20: ffffffc035518000 x19: 0000000000000000 x18: 0000000000000028 x17: 0000000000000003 x16: ffffffc0018153c8 x15: 0000000000000001 x14: ffffffc03686f0f8 x13: ffffffc03686f0f8 x12: 0000000000000003 x11: 0000000000000001 x10: 0000000000000001 x9 : ffffffc03686f0f8 x8 : 0000e3872014c1a1 x7 : 0000000000000028 x6 : 0000000000000000 x5 : 0000000000000001 x4 : 0000000000000000 x3 : 00000000354fb170 x2 : 0000000017d78400 x1 : ffffffc0355e9a60 x0 : ffffffc0354fb268 Fixes: 591fc11 ("usb: phy: msm: Use extcon framework for VBUS and ID detection") Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Felipe Balbi <balbi@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit a38a08d upstream. This driver registers for extcon events as part of its probe, but never unregisters them in case of error in the probe path. There were multiple issues noticed due to this missing error handling. One of them is random crashes if the regulators are not ready yet by the time probe is invoked. Ivan's previous attempt [1] to fix this issue, did not really address all the failure cases like regualtor/get_irq failures. [1] https://lkml.org/lkml/2015/9/7/62 Without this patch the kernel would carsh with log: ... Unable to handle kernel paging request at virtual address 17d78410 pgd = ffffffc001a5c000 [17d78410] *pgd=00000000b6806003, *pud=00000000b6806003, *pmd=0000000000000000 Internal error: Oops: 96000005 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 6 Comm: kworker/u8:0 Not tainted 4.4.0+ #48 Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT) Workqueue: deferwq deferred_probe_work_func task: ffffffc03686e900 ti: ffffffc0368b0000 task.ti: ffffffc0368b0000 PC is at raw_notifier_chain_register+0x1c/0x44 LR is at extcon_register_notifier+0x88/0xc8 pc : [<ffffffc0000da43c>] lr : [<ffffffc000606298>] pstate: 80000085 sp : ffffffc0368b3a70 x29: ffffffc0368b3a70 x28: ffffffc03680c310 x27: ffffffc035518000 x26: ffffffc035518000 x25: ffffffc03bfa20e0 x24: ffffffc035580a18 x23: 0000000000000000 x22: ffffffc035518458 x21: ffffffc0355e9a60 x20: ffffffc035518000 x19: 0000000000000000 x18: 0000000000000028 x17: 0000000000000003 x16: ffffffc0018153c8 x15: 0000000000000001 x14: ffffffc03686f0f8 x13: ffffffc03686f0f8 x12: 0000000000000003 x11: 0000000000000001 x10: 0000000000000001 x9 : ffffffc03686f0f8 x8 : 0000e3872014c1a1 x7 : 0000000000000028 x6 : 0000000000000000 x5 : 0000000000000001 x4 : 0000000000000000 x3 : 00000000354fb170 x2 : 0000000017d78400 x1 : ffffffc0355e9a60 x0 : ffffffc0354fb268 Fixes: 591fc11 ("usb: phy: msm: Use extcon framework for VBUS and ID detection") Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Felipe Balbi <balbi@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

BugLink: http://bugs.launchpad.net/bugs/1540532 commit 12e2696 upstream. I get the splat below when modprobing/rmmoding EDAC drivers. It happens because bus->name is invalid after bus_unregister() has run. The Code: section below corresponds to: .loc 1 1108 0 movq 672(%rbx), %rax # mci_1(D)->bus, mci_1(D)->bus .loc 1 1109 0 popq %rbx # .loc 1 1108 0 movq (%rax), %rdi # _7->name, jmp kfree # and %rax has some funky stuff 2030203020312030 which looks a lot like something walked over it. Fix that by saving the name ptr before doing stuff to string it points to. general protection fault: 0000 [#1] SMP Modules linked in: ... CPU: 4 PID: 10318 Comm: modprobe Tainted: G I EN 3.12.51-11-default+ torvalds#48 Hardware name: HP ProLiant DL380 G7, BIOS P67 05/05/2011 task: ffff880311320280 ti: ffff88030da3e000 task.ti: ffff88030da3e000 RIP: 0010:[<ffffffffa019da92>] [<ffffffffa019da92>] edac_unregister_sysfs+0x22/0x30 [edac_core] RSP: 0018:ffff88030da3fe28 EFLAGS: 00010292 RAX: 2030203020312030 RBX: ffff880311b4e000 RCX: 000000000000095c RDX: 0000000000000001 RSI: ffff880327bb9600 RDI: 0000000000000286 RBP: ffff880311b4e750 R08: 0000000000000000 R09: ffffffff81296110 R10: 0000000000000400 R11: 0000000000000000 R12: ffff88030ba1ac68 R13: 0000000000000001 R14: 00000000011b02f0 R15: 0000000000000000 FS: 00007fc9bf8f5700(0000) GS:ffff8801a7c40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000403c90 CR3: 000000019ebdf000 CR4: 00000000000007e0 Stack: Call Trace: i7core_unregister_mci.isra.9 i7core_remove pci_device_remove __device_release_driver driver_detach bus_remove_driver pci_unregister_driver i7core_exit SyS_delete_module system_call_fastpath 0x7fc9bf426536 Code: 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 53 48 89 fb e8 52 2a 1f e1 48 8b bb a0 02 00 00 e8 46 59 1f e1 48 8b 83 a0 02 00 00 5b <48> 8b 38 e9 26 9a fe e0 66 0f 1f 44 00 00 66 66 66 66 90 48 8b RIP [<ffffffffa019da92>] edac_unregister_sysfs+0x22/0x30 [edac_core] RSP <ffff88030da3fe28> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Fixes: 7a623c0 ("edac: rewrite the sysfs code to use struct device") Signed-off-by: Kamal Mostafa <kamal@canonical.com>

commit 12e2696 upstream. I get the splat below when modprobing/rmmoding EDAC drivers. It happens because bus->name is invalid after bus_unregister() has run. The Code: section below corresponds to: .loc 1 1108 0 movq 672(%rbx), %rax # mci_1(D)->bus, mci_1(D)->bus .loc 1 1109 0 popq %rbx # .loc 1 1108 0 movq (%rax), %rdi # _7->name, jmp kfree # and %rax has some funky stuff 2030203020312030 which looks a lot like something walked over it. Fix that by saving the name ptr before doing stuff to string it points to. general protection fault: 0000 [#1] SMP Modules linked in: ... CPU: 4 PID: 10318 Comm: modprobe Tainted: G I EN 3.12.51-11-default+ torvalds#48 Hardware name: HP ProLiant DL380 G7, BIOS P67 05/05/2011 task: ffff880311320280 ti: ffff88030da3e000 task.ti: ffff88030da3e000 RIP: 0010:[<ffffffffa019da92>] [<ffffffffa019da92>] edac_unregister_sysfs+0x22/0x30 [edac_core] RSP: 0018:ffff88030da3fe28 EFLAGS: 00010292 RAX: 2030203020312030 RBX: ffff880311b4e000 RCX: 000000000000095c RDX: 0000000000000001 RSI: ffff880327bb9600 RDI: 0000000000000286 RBP: ffff880311b4e750 R08: 0000000000000000 R09: ffffffff81296110 R10: 0000000000000400 R11: 0000000000000000 R12: ffff88030ba1ac68 R13: 0000000000000001 R14: 00000000011b02f0 R15: 0000000000000000 FS: 00007fc9bf8f5700(0000) GS:ffff8801a7c40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000403c90 CR3: 000000019ebdf000 CR4: 00000000000007e0 Stack: Call Trace: i7core_unregister_mci.isra.9 i7core_remove pci_device_remove __device_release_driver driver_detach bus_remove_driver pci_unregister_driver i7core_exit SyS_delete_module system_call_fastpath 0x7fc9bf426536 Code: 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 53 48 89 fb e8 52 2a 1f e1 48 8b bb a0 02 00 00 e8 46 59 1f e1 48 8b 83 a0 02 00 00 5b <48> 8b 38 e9 26 9a fe e0 66 0f 1f 44 00 00 66 66 66 66 90 48 8b RIP [<ffffffffa019da92>] edac_unregister_sysfs+0x22/0x30 [edac_core] RSP <ffff88030da3fe28> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Fixes: 7a623c0 ("edac: rewrite the sysfs code to use struct device") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 12e2696 upstream. I get the splat below when modprobing/rmmoding EDAC drivers. It happens because bus->name is invalid after bus_unregister() has run. The Code: section below corresponds to: .loc 1 1108 0 movq 672(%rbx), %rax # mci_1(D)->bus, mci_1(D)->bus .loc 1 1109 0 popq %rbx # .loc 1 1108 0 movq (%rax), %rdi # _7->name, jmp kfree # and %rax has some funky stuff 2030203020312030 which looks a lot like something walked over it. Fix that by saving the name ptr before doing stuff to string it points to. general protection fault: 0000 [#1] SMP Modules linked in: ... CPU: 4 PID: 10318 Comm: modprobe Tainted: G I EN 3.12.51-11-default+ torvalds#48 Hardware name: HP ProLiant DL380 G7, BIOS P67 05/05/2011 task: ffff880311320280 ti: ffff88030da3e000 task.ti: ffff88030da3e000 RIP: 0010:[<ffffffffa019da92>] [<ffffffffa019da92>] edac_unregister_sysfs+0x22/0x30 [edac_core] RSP: 0018:ffff88030da3fe28 EFLAGS: 00010292 RAX: 2030203020312030 RBX: ffff880311b4e000 RCX: 000000000000095c RDX: 0000000000000001 RSI: ffff880327bb9600 RDI: 0000000000000286 RBP: ffff880311b4e750 R08: 0000000000000000 R09: ffffffff81296110 R10: 0000000000000400 R11: 0000000000000000 R12: ffff88030ba1ac68 R13: 0000000000000001 R14: 00000000011b02f0 R15: 0000000000000000 FS: 00007fc9bf8f5700(0000) GS:ffff8801a7c40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000403c90 CR3: 000000019ebdf000 CR4: 00000000000007e0 Stack: Call Trace: i7core_unregister_mci.isra.9 i7core_remove pci_device_remove __device_release_driver driver_detach bus_remove_driver pci_unregister_driver i7core_exit SyS_delete_module system_call_fastpath 0x7fc9bf426536 Code: 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 53 48 89 fb e8 52 2a 1f e1 48 8b bb a0 02 00 00 e8 46 59 1f e1 48 8b 83 a0 02 00 00 5b <48> 8b 38 e9 26 9a fe e0 66 0f 1f 44 00 00 66 66 66 66 90 48 8b RIP [<ffffffffa019da92>] edac_unregister_sysfs+0x22/0x30 [edac_core] RSP <ffff88030da3fe28> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Fixes: 7a623c0 ("edac: rewrite the sysfs code to use struct device") Signed-off-by: Jiri Slaby <jslaby@suse.cz>

The toplogy update is performed by the AP via smp_callin() after the BSP has called do_wait_cpu_initialized(), setting the AP's bit in cpu_callout_mask to allow it to proceed. In preparation to enable further parallelism of AP bringup, add locking to serialize the update even if multiple APs are (in future) permitted to proceed through the next stages of bringup in parallel. Without such ordering (and with that future extra parallelism), confusion ensues: [ 1.360149] x86: Booting SMP configuration: [ 1.360221] .... node #0, CPUs: #1 #2 #3 #4 #5 torvalds#6 torvalds#7 torvalds#8 torvalds#9 torvalds#10 torvalds#11 torvalds#12 torvalds#13 torvalds#14 torvalds#15 torvalds#16 torvalds#17 torvalds#18 torvalds#19 torvalds#20 torvalds#21 torvalds#22 torvalds#23 [ 1.366225] .... node #1, CPUs: torvalds#24 torvalds#25 torvalds#26 torvalds#27 torvalds#28 torvalds#29 torvalds#30 torvalds#31 torvalds#32 torvalds#33 torvalds#34 torvalds#35 torvalds#36 torvalds#37 torvalds#38 torvalds#39 torvalds#40 torvalds#41 torvalds#42 torvalds#43 torvalds#44 torvalds#45 torvalds#46 torvalds#47 [ 1.370219] .... node #0, CPUs: torvalds#48 torvalds#49 torvalds#50 torvalds#51 #52 #53 torvalds#54 torvalds#55 torvalds#56 torvalds#57 #58 torvalds#59 torvalds#60 torvalds#61 torvalds#62 torvalds#63 torvalds#64 torvalds#65 torvalds#66 torvalds#67 torvalds#68 torvalds#69 #70 torvalds#71 [ 1.378226] .... node #1, CPUs: torvalds#72 torvalds#73 torvalds#74 torvalds#75 torvalds#76 torvalds#77 torvalds#78 torvalds#79 torvalds#80 torvalds#81 torvalds#82 torvalds#83 torvalds#84 torvalds#85 torvalds#86 torvalds#87 torvalds#88 torvalds#89 torvalds#90 torvalds#91 torvalds#92 torvalds#93 torvalds#94 torvalds#95 [ 1.382037] Brought 96 CPUs to x86/cpu:kick in 72232606 cycles [ 0.104104] smpboot: CPU 26 Converting physical 0 to logical die 1 [ 0.104104] smpboot: CPU 27 Converting physical 1 to logical package 2 [ 0.104104] smpboot: CPU 24 Converting physical 1 to logical package 3 [ 0.104104] smpboot: CPU 27 Converting physical 0 to logical die 2 [ 0.104104] smpboot: CPU 25 Converting physical 1 to logical package 4 [ 1.385609] Brought 96 CPUs to x86/cpu:wait-init in 9269218 cycles [ 1.395285] Brought CPUs online in 28930764 cycles [ 1.395469] smp: Brought up 2 nodes, 96 CPUs [ 1.395689] smpboot: Max logical packages: 2 [ 1.396222] smpboot: Total of 96 processors activated (576000.00 BogoMIPS) [Usama Arif: fixed rebase conflict] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Usama Arif <usama.arif@bytedance.com>

A soft lockup issue will be triggered when do fio test on a 448-core server, such as the following warning: [ 519.743511] watchdog: BUG: soft lockup - CPU#305 stuck for 144s! [fio:11226] [ 519.744834] CPU: 305 PID: 11226 Comm: fio Kdump: loaded Tainted: G L 6.3.0 torvalds#48 [ 519.744850] Hardware name: Lenovo ThinkSystem SR645 V3 MB,Genoa,DDR5,Oahu,1U/SB27B31174, BIOS KAE111F-2.10 04/17/2023 [ 519.744867] RIP: 0010:__do_softirq+0x93/0x2d3 [ 519.744901] Code: c3 c0 5a ff c7 44 24 08 0a 00 00 00 48 c7 c7 45 6b c5 a0 e8 1f ac fe ff 65 66 c7 05 dd 76 a8 5f 00 00 fb 49 c7 c2 c0 60 20 a1 <eb> 70 48 98 48 c7 c7 f8 fc c5 a0 4d 8d 3c c2 4c 89 fd 48 81 ed c0 [ 519.744916] RSP: 0018:ff6e423a8b584fa0 EFLAGS: 00000202 [ 519.744944] RAX: 0000000000000131 RBX: ff3143f856cd3f80 RCX: 0000000000000000 [ 519.744959] RDX: 000000562c5e4652 RSI: ffffffffa0c56b45 RDI: ffffffffa0c35ab3 [ 519.744974] RBP: 0000000000000000 R08: 0000000000000b01 R09: 0000000000031600 [ 519.744988] R10: ffffffffa12060c0 R11: 00000054ea7837c0 R12: 0000000000000080 [ 519.745003] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 519.745017] FS: 00007fa41bb9e000(0000) GS:ff3143f7af840000(0000) knlGS:0000000000000000 [ 519.745032] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 519.745046] CR2: 00007fa3fc05b000 CR3: 00000008d6f56004 CR4: 0000000000771ee0 [ 519.745060] PKRU: 55555554 [ 519.745075] Call Trace: [ 519.745095] <IRQ> [ 519.745160] ? ftrace_regs_caller_end+0x61/0x61 [ 519.745185] irq_exit_rcu+0xcd/0x100 [ 519.745217] sysvec_apic_timer_interrupt+0xa2/0xd0 [ 519.745251] </IRQ> [ 519.745265] <TASK> [ 519.745292] asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 519.745318] RIP: 0010:_raw_spin_unlock_irqrestore+0x1d/0x40 [ 519.745339] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa e8 77 9b 0b 20 c6 07 00 0f 1f 00 f7 c6 00 02 00 00 74 01 fb bf 01 00 00 00 <e8> be 65 58 ff 65 8b 05 9f 81 a8 5f 85 c0 74 05 c3 cc cc cc cc 0f [ 519.745354] RSP: 0018:ff6e423b244279f0 EFLAGS: 00000206 [ 519.745381] RAX: 000000098b346200 RBX: ff3143f0f0fccc00 RCX: 0000000000000286 [ 519.745396] RDX: 0000000000000700 RSI: 0000000000000286 RDI: 0000000000000001 [ 519.745410] RBP: ff3143f835e5c600 R08: 000000058bb02000 R09: 0000000000000000 [ 519.745424] R10: 00000107d08ab6ea R11: 0000000000000800 R12: 0000000000000820 [ 519.745438] R13: ff3143f835e5c610 R14: ff6e423b24427a50 R15: ff3143f90b346200 [ 519.745576] dma_pool_alloc+0x184/0x200 [ 519.745647] nvme_prep_rq.part.58+0x416/0x840 [nvme] [ 519.745760] nvme_queue_rq+0x7b/0x210 [nvme] [ 519.745821] __blk_mq_try_issue_directly+0x153/0x1c0 [ 519.745906] blk_mq_try_issue_directly+0x15/0x50 [ 519.745935] blk_mq_submit_bio+0x4c4/0x510 [ 519.746047] submit_bio_noacct_nocheck+0x331/0x370 [ 519.746135] blkdev_direct_IO.part.22+0x138/0x5b0 [ 519.746165] ? __fsnotify_parent+0x119/0x350 [ 519.746274] blkdev_read_iter+0xe3/0x170 [ 519.746325] aio_read+0xf6/0x1b0 [ 519.746423] ? 0xffffffffc066309b [ 519.746567] ? io_submit_one+0x18c/0xd60 [ 519.746587] ? aio_read+0x5/0x1b0 [ 519.746608] io_submit_one+0x18c/0xd60 [ 519.746672] ? __x64_sys_io_submit+0x4/0x180 [ 519.746794] ? __x64_sys_io_submit+0x93/0x180 [ 519.746815] __x64_sys_io_submit+0x93/0x180 [ 519.746838] ? __pfx___x64_sys_io_submit+0x10/0x10 [ 519.746923] do_syscall_64+0x3b/0x90 [ 519.746958] entry_SYSCALL_64_after_hwframe+0x72/0xdc [ 519.746983] RIP: 0033:0x7fa41b83ee5d [ 519.747009] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 93 af 1b 00 f7 d8 64 89 01 48 [ 519.747024] RSP: 002b:00007fff698e8cc8 EFLAGS: 00000246 ORIG_RAX: 00000000000000d1 [ 519.747053] RAX: ffffffffffffffda RBX: 00007fa41bb9df70 RCX: 00007fa41b83ee5d [ 519.747067] RDX: 000055a53098ba68 RSI: 0000000000000001 RDI: 00007fa3fc05b000 [ 519.747082] RBP: 00007fa3fc05b000 R08: 0000000000000000 R09: 0000000000000048 [ 519.747095] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001 [ 519.747110] R13: 0000000000000000 R14: 000055a53098ba68 R15: 000055a530989440 [ 519.747245] </TASK> and we have collected the corresponding ftrace log, 515.948558 | 305) | nvme_irq [nvme]() { +--1503 lines: 515.948560 | 305) | blk_mq_complete_request_remote() {--- 515.948995 | 305) | nvme_pci_complete_batch [nvme]() { 515.948996 | 305) | nvme_unmap_data [nvme]() { +-- 3 lines: 515.948996 | 305) | dma_unmap_sg_attrs() {--- 515.948997 | 305) | dma_pool_free() { 515.948997 | 305) | _raw_spin_lock_irqsave() { 515.948997 | 305) 0.190 us | preempt_count_add(); 515.948999 | 305) * 14893.70 us | native_queued_spin_lock_slowpath(); 515.963893 | 305) * 14895.59 us | } 515.963929 | 305) | _raw_spin_unlock_irqrestore() { 515.963930 | 305) 0.160 us | preempt_count_sub(); 515.963930 | 305) 1.011 us | } 515.963931 | 305) * 14933.71 us | } +-- 9 lines: 515.963931 | 305) | mempool_free() {--- 515.963933 | 305) * 14937.45 us | } 515.963933 | 305) 0.160 us | nvme_complete_batch_req [nvme_core](); 515.963934 | 305) | nvme_unmap_data [nvme]() { +-- 3 lines: 515.963934 | 305) | dma_unmap_sg_attrs() {--- 515.963935 | 305) | dma_pool_free() { 515.963935 | 305) | _raw_spin_lock_irqsave() { 515.963936 | 305) 0.170 us | preempt_count_add(); 515.963936 | 305) * 19865.82 us | native_queued_spin_lock_slowpath(); 515.983803 | 305) * 19867.21 us | } 515.983833 | 305) | _raw_spin_unlock_irqrestore() { 515.983833 | 305) 0.200 us | preempt_count_sub(); 515.983834 | 305) 1.312 us | } 515.983834 | 305) * 19898.71 us | } +--- 11 lines: 515.983834 | 305) | mempool_free() {--- 515.983837 | 305) * 19902.73 us | } 515.983837 | 305) 0.170 us | nvme_complete_batch_req [nvme_core](); +--40055 lines: 515.983837 | 305) | nvme_unmap_data [nvme]() {--- 520.816867 | 305) $ 4867871 us | } 520.816867 | 305) $ 4868309 us | } 520.816868 | 305) $ 4868310 us | } According to the above two logs, we can know the nvme_irq() cost too much time, in the above case, about 4.8 second. And we can also know that the main bottlenecks is in the competition for the spin lock pool->lock. In order to avoid the lockup log, we can enable the nvme irq threading by adding nvme.use_threaded_interrupts=1 in cmdline and with preempt=full, but for the kernel with preempt=off, the nvme.use_threaded_interrupts=1 will fail. So, we can add cond_resched() in the loops of the nvme_complete_batch(). Reviewed-by: Adrian Huang <ahuang12@lenovo.com> Signed-off-by: Jiwei Sun <sunjw10@lenovo.com>

Occasionally, with './test_progs -j' on my vm, I will hit the following failure: test_cgrp_local_storage:PASS:join_cgroup /cgrp_local_storage 0 nsec test_cgroup_iter_sleepable:PASS:skel_open 0 nsec test_cgroup_iter_sleepable:PASS:skel_load 0 nsec test_cgroup_iter_sleepable:PASS:attach_iter 0 nsec test_cgroup_iter_sleepable:PASS:iter_create 0 nsec test_cgroup_iter_sleepable:FAIL:cgroup_id unexpected cgroup_id: actual 1 != expected 2812 torvalds#48/5 cgrp_local_storage/cgroup_iter_sleepable:FAIL torvalds#48 cgrp_local_storage:FAIL Finally, I decided to do some investigation since the test is introduced by myself. It turns out the reason is due to cgroup_fd with value 0. In cgroup_iter, a cgroup_fd of value 0 means the root cgroup. /* from cgroup_iter.c */ if (fd) cgrp = cgroup_v1v2_get_from_fd(fd); else if (id) cgrp = cgroup_get_from_id(id); else /* walk the entire hierarchy by default. */ cgrp = cgroup_get_from_path("/"); That is why we got cgroup_id 1 instead of expected 2812. Why we got a cgroup_fd 0? Nobody should really touch 'stdin' (fd 0) in test_progs. I traced 'close' syscall with stack trace and found the root cause, which is a bug in bpf_obj_pinning.c. Basically, the code closed fd 0 although it should not. Fixing the bug in bpf_obj_pinning.c also resolved the above cgroup_iter_sleepable subtest failure. Fixes: 3b22f98 ("selftests/bpf: Add path_fd-based BPF_OBJ_PIN and BPF_OBJ_GET tests") Signed-off-by: Yonghong Song <yonghong.song@linux.dev>

If you submit a testing request to syzkaller it may reply with: syzbot has tested the proposed patch and the reproducer did not \ trigger any issue: Reported-and-tested-by: syzbot+f817490f5bd20541b90a@syzkaller.appspotmail.com Checkpatch is ok with Reported-by tags, but complains about Reported-and-tested-by tags: WARNING: Possible unwrapped commit description (prefer a maximum 75 \ chars per line) torvalds#48: Reported-and-tested-by: syzbot+f817490f5bd20541b90a@syzkaller.appspotmail.com Adding the new tag to signature_tags removes this warning. Signed-off-by: Andrew Kanner <andrew.kanner@gmail.com>

Occasionally, with './test_progs -j' on my vm, I will hit the following failure: test_cgrp_local_storage:PASS:join_cgroup /cgrp_local_storage 0 nsec test_cgroup_iter_sleepable:PASS:skel_open 0 nsec test_cgroup_iter_sleepable:PASS:skel_load 0 nsec test_cgroup_iter_sleepable:PASS:attach_iter 0 nsec test_cgroup_iter_sleepable:PASS:iter_create 0 nsec test_cgroup_iter_sleepable:FAIL:cgroup_id unexpected cgroup_id: actual 1 != expected 2812 torvalds#48/5 cgrp_local_storage/cgroup_iter_sleepable:FAIL torvalds#48 cgrp_local_storage:FAIL Finally, I decided to do some investigation since the test is introduced by myself. It turns out the reason is due to cgroup_fd with value 0. In cgroup_iter, a cgroup_fd of value 0 means the root cgroup. /* from cgroup_iter.c */ if (fd) cgrp = cgroup_v1v2_get_from_fd(fd); else if (id) cgrp = cgroup_get_from_id(id); else /* walk the entire hierarchy by default. */ cgrp = cgroup_get_from_path("/"); That is why we got cgroup_id 1 instead of expected 2812. Why we got a cgroup_fd 0? Nobody should really touch 'stdin' (fd 0) in test_progs. I traced 'close' syscall with stack trace and found the root cause, which is a bug in bpf_obj_pinning.c. Basically, the code closed fd 0 although it should not. Fixing the bug in bpf_obj_pinning.c also resolved the above cgroup_iter_sleepable subtest failure. Fixes: 3b22f98 ("selftests/bpf: Add path_fd-based BPF_OBJ_PIN and BPF_OBJ_GET tests") Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230827150551.1743497-1-yonghong.song@linux.dev

[ Upstream commit 5439cfa ] Occasionally, with './test_progs -j' on my vm, I will hit the following failure: test_cgrp_local_storage:PASS:join_cgroup /cgrp_local_storage 0 nsec test_cgroup_iter_sleepable:PASS:skel_open 0 nsec test_cgroup_iter_sleepable:PASS:skel_load 0 nsec test_cgroup_iter_sleepable:PASS:attach_iter 0 nsec test_cgroup_iter_sleepable:PASS:iter_create 0 nsec test_cgroup_iter_sleepable:FAIL:cgroup_id unexpected cgroup_id: actual 1 != expected 2812 torvalds#48/5 cgrp_local_storage/cgroup_iter_sleepable:FAIL torvalds#48 cgrp_local_storage:FAIL Finally, I decided to do some investigation since the test is introduced by myself. It turns out the reason is due to cgroup_fd with value 0. In cgroup_iter, a cgroup_fd of value 0 means the root cgroup. /* from cgroup_iter.c */ if (fd) cgrp = cgroup_v1v2_get_from_fd(fd); else if (id) cgrp = cgroup_get_from_id(id); else /* walk the entire hierarchy by default. */ cgrp = cgroup_get_from_path("/"); That is why we got cgroup_id 1 instead of expected 2812. Why we got a cgroup_fd 0? Nobody should really touch 'stdin' (fd 0) in test_progs. I traced 'close' syscall with stack trace and found the root cause, which is a bug in bpf_obj_pinning.c. Basically, the code closed fd 0 although it should not. Fixing the bug in bpf_obj_pinning.c also resolved the above cgroup_iter_sleepable subtest failure. Fixes: 3b22f98 ("selftests/bpf: Add path_fd-based BPF_OBJ_PIN and BPF_OBJ_GET tests") Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230827150551.1743497-1-yonghong.song@linux.dev Signed-off-by: Sasha Levin <sashal@kernel.org>

If you submit a testing request to syzkaller it may reply with: > syzbot has tested the proposed patch and the reproducer did not \ > trigger any issue: > > Reported-and-tested-by: syzbot+f817490f5bd20541b90a@syzkaller.appspotmail.com Checkpatch is ok with Reported-by tags, but complains about Reported-and-tested-by tags: > WARNING: Possible unwrapped commit description (prefer a maximum 75 \ > chars per line) > torvalds#48: > Reported-and-tested-by: syzbot+f817490f5bd20541b90a@syzkaller.appspotmail.com Adding the new tag to signature_tags removes this warning. Signed-off-by: Andrew Kanner <andrew.kanner@gmail.com>

Expanding the test coverage from cgroup2 to include cgroup1. The result as follows, Already existing test cases for cgroup2: torvalds#48/1 cgrp_local_storage/tp_btf:OK torvalds#48/2 cgrp_local_storage/attach_cgroup:OK torvalds#48/3 cgrp_local_storage/recursion:OK torvalds#48/4 cgrp_local_storage/negative:OK torvalds#48/5 cgrp_local_storage/cgroup_iter_sleepable:OK torvalds#48/6 cgrp_local_storage/yes_rcu_lock:OK torvalds#48/7 cgrp_local_storage/no_rcu_lock:OK Expanded test cases for cgroup1: torvalds#48/8 cgrp_local_storage/cgrp1_tp_btf:OK torvalds#48/9 cgrp_local_storage/cgrp1_recursion:OK torvalds#48/10 cgrp_local_storage/cgrp1_negative:OK torvalds#48/11 cgrp_local_storage/cgrp1_iter_sleepable:OK torvalds#48/12 cgrp_local_storage/cgrp1_yes_rcu_lock:OK torvalds#48/13 cgrp_local_storage/cgrp1_no_rcu_lock:OK Summary: torvalds#48 cgrp_local_storage:OK Summary: 1/13 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Yafang Shao <laoar.shao@gmail.com>

Expanding the test coverage from cgroup2 to include cgroup1. The result as follows, Already existing test cases for cgroup2: torvalds#48/1 cgrp_local_storage/tp_btf:OK torvalds#48/2 cgrp_local_storage/attach_cgroup:OK torvalds#48/3 cgrp_local_storage/recursion:OK torvalds#48/4 cgrp_local_storage/negative:OK torvalds#48/5 cgrp_local_storage/cgroup_iter_sleepable:OK torvalds#48/6 cgrp_local_storage/yes_rcu_lock:OK torvalds#48/7 cgrp_local_storage/no_rcu_lock:OK Expanded test cases for cgroup1: torvalds#48/8 cgrp_local_storage/cgrp1_tp_btf:OK torvalds#48/9 cgrp_local_storage/cgrp1_recursion:OK torvalds#48/10 cgrp_local_storage/cgrp1_negative:OK torvalds#48/11 cgrp_local_storage/cgrp1_iter_sleepable:OK torvalds#48/12 cgrp_local_storage/cgrp1_yes_rcu_lock:OK torvalds#48/13 cgrp_local_storage/cgrp1_no_rcu_lock:OK Summary: torvalds#48 cgrp_local_storage:OK Summary: 1/13 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev>

The `cgrp_local_storage` test triggers a kernel panic like: # ./test_progs -t cgrp_local_storage Can't find bpf_testmod.ko kernel module: -2 WARNING! Selftests relying on bpf_testmod.ko will be skipped. [ 550.930632] CPU 1 Unable to handle kernel paging request at virtual address 0000000000000080, era == ffff80000200be34, ra == ffff80000200be00 [ 550.931781] Oops[#1]: [ 550.931966] CPU: 1 PID: 1303 Comm: test_progs Not tainted 6.7.0-rc2-loong-devel-g2f56bb0d2327 torvalds#35 a896aca3f4164f09cc346f89f2e09832e07be5f6 [ 550.932215] Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 2/2/2022 [ 550.932403] pc ffff80000200be34 ra ffff80000200be00 tp 9000000108350000 sp 9000000108353dc0 [ 550.932545] a0 0000000000000000 a1 0000000000000517 a2 0000000000000118 a3 00007ffffbb15558 [ 550.932682] a4 00007ffffbb15620 a5 90000001004e7700 a6 0000000000000021 a7 0000000000000118 [ 550.932824] t0 ffff80000200bdc0 t1 0000000000000517 t2 0000000000000517 t3 00007ffff1c06ee0 [ 550.932961] t4 0000555578ae04d0 t5 fffffffffffffff8 t6 0000000000000004 t7 0000000000000020 [ 550.933097] t8 0000000000000040 u0 00000000000007b8 s9 9000000108353e00 s0 90000001004e7700 [ 550.933241] s1 9000000004005000 s2 0000000000000001 s3 0000000000000000 s4 0000555555eb2ec8 [ 550.933379] s5 00007ffffbb15bb8 s6 00007ffff1dafd60 s7 000055555663f610 s8 00007ffff1db0050 [ 550.933520] ra: ffff80000200be00 bpf_prog_98f1b9e767be2a84_on_enter+0x40/0x200 [ 550.933911] ERA: ffff80000200be34 bpf_prog_98f1b9e767be2a84_on_enter+0x74/0x200 [ 550.934105] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE) [ 550.934596] PRMD: 00000004 (PPLV0 +PIE -PWE) [ 550.934712] EUEN: 00000003 (+FPE +SXE -ASXE -BTE) [ 550.934836] ECFG: 00071c1c (LIE=2-4,10-12 VS=7) [ 550.934976] ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0) [ 550.935097] BADV: 0000000000000080 [ 550.935181] PRID: 0014c010 (Loongson-64bit, Loongson-3A5000) [ 550.935291] Modules linked in: [ 550.935391] Process test_progs (pid: 1303, threadinfo=000000006c3b1c41, task=0000000061f84a55) [ 550.935643] Stack : 00007ffffbb15bb8 0000555555eb2ec8 0000000000000000 0000000000000001 [ 550.935844] 9000000004005000 ffff80001b864000 00007ffffbb15450 90000000029aa034 [ 550.935990] 0000000000000000 9000000108353ec0 0000000000000118 d07d9dfb09721a09 [ 550.936175] 0000000000000001 0000000000000000 9000000108353ec0 0000000000000118 [ 550.936314] 9000000101d46ad0 900000000290abf0 000055555663f610 0000000000000000 [ 550.936479] 0000000000000003 9000000108353ec0 00007ffffbb15450 90000000029d7288 [ 550.936635] 00007ffff1dafd60 000055555663f610 0000000000000000 0000000000000003 [ 550.936779] 9000000108353ec0 90000000035dd1f0 00007ffff1dafd58 9000000002841c5c [ 550.936939] 0000000000000119 0000555555eea5a8 00007ffff1d78780 00007ffffbb153e0 [ 550.937083] ffffffffffffffda 00007ffffbb15518 0000000000000040 00007ffffbb15558 [ 550.937224] ... [ 550.937299] Call Trace: [ 550.937521] [<ffff80000200be34>] bpf_prog_98f1b9e767be2a84_on_enter+0x74/0x200 [ 550.937910] [<90000000029aa034>] bpf_trace_run2+0x90/0x154 [ 550.938105] [<900000000290abf0>] syscall_trace_enter.isra.0+0x1cc/0x200 [ 550.938224] [<90000000035dd1f0>] do_syscall+0x48/0x94 [ 550.938319] [<9000000002841c5c>] handle_syscall+0xbc/0x158 [ 550.938477] [ 550.938607] Code: 580009ae 50016000 262402e4 <28c20085> 14092084 03a00084 16000024 03240084 00150006 [ 550.938851] [ 550.939021] ---[ end trace 0000000000000000 ]--- Further investigation shows that this panic is triggered by memory load operations: ptr = bpf_cgrp_storage_get(&map_a, task->cgroups->dfl_cgrp, 0, BPF_LOCAL_STORAGE_GET_F_CREATE); The expression `task->cgroups->dfl_cgrp` involves two memory load. Since the field offset fits in imm12 or imm14, we use ldd or ldptrd instructions. But both instructions have the side effect that it will signed-extended the imm operand. Finally, we got the wrong addresses and panics is inevitable. Use a generic ldxd instruction to avoid this kind of issues. With this change, we have: # ./test_progs -t cgrp_local_storage Can't find bpf_testmod.ko kernel module: -2 WARNING! Selftests relying on bpf_testmod.ko will be skipped. test_cgrp_local_storage:PASS:join_cgroup /cgrp_local_storage 0 nsec torvalds#48/1 cgrp_local_storage/tp_btf:OK test_attach_cgroup:PASS:skel_open 0 nsec test_attach_cgroup:PASS:prog_attach 0 nsec test_attach_cgroup:PASS:prog_attach 0 nsec libbpf: prog 'update_cookie_tracing': failed to attach: ERROR: strerror_r(-524)=22 test_attach_cgroup:FAIL:prog_attach unexpected error: -524 torvalds#48/2 cgrp_local_storage/attach_cgroup:FAIL test_recursion:PASS:skel_open_and_load 0 nsec libbpf: prog 'on_lookup': failed to attach: ERROR: strerror_r(-524)=22 libbpf: prog 'on_lookup': failed to auto-attach: -524 test_recursion:FAIL:skel_attach unexpected error: -524 (errno 524) torvalds#48/3 cgrp_local_storage/recursion:FAIL torvalds#48/4 cgrp_local_storage/negative:OK torvalds#48/5 cgrp_local_storage/cgroup_iter_sleepable:OK test_yes_rcu_lock:PASS:skel_open 0 nsec test_yes_rcu_lock:PASS:skel_load 0 nsec libbpf: prog 'yes_rcu_lock': failed to attach: ERROR: strerror_r(-524)=22 libbpf: prog 'yes_rcu_lock': failed to auto-attach: -524 test_yes_rcu_lock:FAIL:skel_attach unexpected error: -524 (errno 524) torvalds#48/6 cgrp_local_storage/yes_rcu_lock:FAIL torvalds#48/7 cgrp_local_storage/no_rcu_lock:OK torvalds#48 cgrp_local_storage:FAIL All error logs: test_cgrp_local_storage:PASS:join_cgroup /cgrp_local_storage 0 nsec test_attach_cgroup:PASS:skel_open 0 nsec test_attach_cgroup:PASS:prog_attach 0 nsec test_attach_cgroup:PASS:prog_attach 0 nsec libbpf: prog 'update_cookie_tracing': failed to attach: ERROR: strerror_r(-524)=22 test_attach_cgroup:FAIL:prog_attach unexpected error: -524 torvalds#48/2 cgrp_local_storage/attach_cgroup:FAIL test_recursion:PASS:skel_open_and_load 0 nsec libbpf: prog 'on_lookup': failed to attach: ERROR: strerror_r(-524)=22 libbpf: prog 'on_lookup': failed to auto-attach: -524 test_recursion:FAIL:skel_attach unexpected error: -524 (errno 524) torvalds#48/3 cgrp_local_storage/recursion:FAIL test_yes_rcu_lock:PASS:skel_open 0 nsec test_yes_rcu_lock:PASS:skel_load 0 nsec libbpf: prog 'yes_rcu_lock': failed to attach: ERROR: strerror_r(-524)=22 libbpf: prog 'yes_rcu_lock': failed to auto-attach: -524 test_yes_rcu_lock:FAIL:skel_attach unexpected error: -524 (errno 524) torvalds#48/6 cgrp_local_storage/yes_rcu_lock:FAIL torvalds#48 cgrp_local_storage:FAIL Summary: 0/4 PASSED, 0 SKIPPED, 1 FAILED No panics any more (The test still failed because lack of BPF trampoline which I am actively working on). Fixes: 5dc6155 ("LoongArch: Add BPF JIT support") Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

Expanding the test coverage from cgroup2 to include cgroup1. The result as follows, Already existing test cases for cgroup2: torvalds#48/1 cgrp_local_storage/tp_btf:OK torvalds#48/2 cgrp_local_storage/attach_cgroup:OK torvalds#48/3 cgrp_local_storage/recursion:OK torvalds#48/4 cgrp_local_storage/negative:OK torvalds#48/5 cgrp_local_storage/cgroup_iter_sleepable:OK torvalds#48/6 cgrp_local_storage/yes_rcu_lock:OK torvalds#48/7 cgrp_local_storage/no_rcu_lock:OK Expanded test cases for cgroup1: torvalds#48/8 cgrp_local_storage/cgrp1_tp_btf:OK torvalds#48/9 cgrp_local_storage/cgrp1_recursion:OK torvalds#48/10 cgrp_local_storage/cgrp1_negative:OK torvalds#48/11 cgrp_local_storage/cgrp1_iter_sleepable:OK torvalds#48/12 cgrp_local_storage/cgrp1_yes_rcu_lock:OK torvalds#48/13 cgrp_local_storage/cgrp1_no_rcu_lock:OK Summary: torvalds#48 cgrp_local_storage:OK Summary: 1/13 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20231206115326.4295-4-laoar.shao@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

The `cgrp_local_storage` test triggers a kernel panic like: # ./test_progs -t cgrp_local_storage Can't find bpf_testmod.ko kernel module: -2 WARNING! Selftests relying on bpf_testmod.ko will be skipped. [ 550.930632] CPU 1 Unable to handle kernel paging request at virtual address 0000000000000080, era == ffff80000200be34, ra == ffff80000200be00 [ 550.931781] Oops[#1]: [ 550.931966] CPU: 1 PID: 1303 Comm: test_progs Not tainted 6.7.0-rc2-loong-devel-g2f56bb0d2327 #35 a896aca3f4164f09cc346f89f2e09832e07be5f6 [ 550.932215] Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 2/2/2022 [ 550.932403] pc ffff80000200be34 ra ffff80000200be00 tp 9000000108350000 sp 9000000108353dc0 [ 550.932545] a0 0000000000000000 a1 0000000000000517 a2 0000000000000118 a3 00007ffffbb15558 [ 550.932682] a4 00007ffffbb15620 a5 90000001004e7700 a6 0000000000000021 a7 0000000000000118 [ 550.932824] t0 ffff80000200bdc0 t1 0000000000000517 t2 0000000000000517 t3 00007ffff1c06ee0 [ 550.932961] t4 0000555578ae04d0 t5 fffffffffffffff8 t6 0000000000000004 t7 0000000000000020 [ 550.933097] t8 0000000000000040 u0 00000000000007b8 s9 9000000108353e00 s0 90000001004e7700 [ 550.933241] s1 9000000004005000 s2 0000000000000001 s3 0000000000000000 s4 0000555555eb2ec8 [ 550.933379] s5 00007ffffbb15bb8 s6 00007ffff1dafd60 s7 000055555663f610 s8 00007ffff1db0050 [ 550.933520] ra: ffff80000200be00 bpf_prog_98f1b9e767be2a84_on_enter+0x40/0x200 [ 550.933911] ERA: ffff80000200be34 bpf_prog_98f1b9e767be2a84_on_enter+0x74/0x200 [ 550.934105] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE) [ 550.934596] PRMD: 00000004 (PPLV0 +PIE -PWE) [ 550.934712] EUEN: 00000003 (+FPE +SXE -ASXE -BTE) [ 550.934836] ECFG: 00071c1c (LIE=2-4,10-12 VS=7) [ 550.934976] ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0) [ 550.935097] BADV: 0000000000000080 [ 550.935181] PRID: 0014c010 (Loongson-64bit, Loongson-3A5000) [ 550.935291] Modules linked in: [ 550.935391] Process test_progs (pid: 1303, threadinfo=000000006c3b1c41, task=0000000061f84a55) [ 550.935643] Stack : 00007ffffbb15bb8 0000555555eb2ec8 0000000000000000 0000000000000001 [ 550.935844] 9000000004005000 ffff80001b864000 00007ffffbb15450 90000000029aa034 [ 550.935990] 0000000000000000 9000000108353ec0 0000000000000118 d07d9dfb09721a09 [ 550.936175] 0000000000000001 0000000000000000 9000000108353ec0 0000000000000118 [ 550.936314] 9000000101d46ad0 900000000290abf0 000055555663f610 0000000000000000 [ 550.936479] 0000000000000003 9000000108353ec0 00007ffffbb15450 90000000029d7288 [ 550.936635] 00007ffff1dafd60 000055555663f610 0000000000000000 0000000000000003 [ 550.936779] 9000000108353ec0 90000000035dd1f0 00007ffff1dafd58 9000000002841c5c [ 550.936939] 0000000000000119 0000555555eea5a8 00007ffff1d78780 00007ffffbb153e0 [ 550.937083] ffffffffffffffda 00007ffffbb15518 0000000000000040 00007ffffbb15558 [ 550.937224] ... [ 550.937299] Call Trace: [ 550.937521] [<ffff80000200be34>] bpf_prog_98f1b9e767be2a84_on_enter+0x74/0x200 [ 550.937910] [<90000000029aa034>] bpf_trace_run2+0x90/0x154 [ 550.938105] [<900000000290abf0>] syscall_trace_enter.isra.0+0x1cc/0x200 [ 550.938224] [<90000000035dd1f0>] do_syscall+0x48/0x94 [ 550.938319] [<9000000002841c5c>] handle_syscall+0xbc/0x158 [ 550.938477] [ 550.938607] Code: 580009ae 50016000 262402e4 <28c20085> 14092084 03a00084 16000024 03240084 00150006 [ 550.938851] [ 550.939021] ---[ end trace 0000000000000000 ]--- Further investigation shows that this panic is triggered by memory load operations: ptr = bpf_cgrp_storage_get(&map_a, task->cgroups->dfl_cgrp, 0, BPF_LOCAL_STORAGE_GET_F_CREATE); The expression `task->cgroups->dfl_cgrp` involves two memory load. Since the field offset fits in imm12 or imm14, we use ldd or ldptrd instructions. But both instructions have the side effect that it will signed-extended the imm operand. Finally, we got the wrong addresses and panics is inevitable. Use a generic ldxd instruction to avoid this kind of issues. With this change, we have: # ./test_progs -t cgrp_local_storage Can't find bpf_testmod.ko kernel module: -2 WARNING! Selftests relying on bpf_testmod.ko will be skipped. test_cgrp_local_storage:PASS:join_cgroup /cgrp_local_storage 0 nsec #48/1 cgrp_local_storage/tp_btf:OK test_attach_cgroup:PASS:skel_open 0 nsec test_attach_cgroup:PASS:prog_attach 0 nsec test_attach_cgroup:PASS:prog_attach 0 nsec libbpf: prog 'update_cookie_tracing': failed to attach: ERROR: strerror_r(-524)=22 test_attach_cgroup:FAIL:prog_attach unexpected error: -524 #48/2 cgrp_local_storage/attach_cgroup:FAIL test_recursion:PASS:skel_open_and_load 0 nsec libbpf: prog 'on_lookup': failed to attach: ERROR: strerror_r(-524)=22 libbpf: prog 'on_lookup': failed to auto-attach: -524 test_recursion:FAIL:skel_attach unexpected error: -524 (errno 524) #48/3 cgrp_local_storage/recursion:FAIL #48/4 cgrp_local_storage/negative:OK #48/5 cgrp_local_storage/cgroup_iter_sleepable:OK test_yes_rcu_lock:PASS:skel_open 0 nsec test_yes_rcu_lock:PASS:skel_load 0 nsec libbpf: prog 'yes_rcu_lock': failed to attach: ERROR: strerror_r(-524)=22 libbpf: prog 'yes_rcu_lock': failed to auto-attach: -524 test_yes_rcu_lock:FAIL:skel_attach unexpected error: -524 (errno 524) #48/6 cgrp_local_storage/yes_rcu_lock:FAIL #48/7 cgrp_local_storage/no_rcu_lock:OK #48 cgrp_local_storage:FAIL All error logs: test_cgrp_local_storage:PASS:join_cgroup /cgrp_local_storage 0 nsec test_attach_cgroup:PASS:skel_open 0 nsec test_attach_cgroup:PASS:prog_attach 0 nsec test_attach_cgroup:PASS:prog_attach 0 nsec libbpf: prog 'update_cookie_tracing': failed to attach: ERROR: strerror_r(-524)=22 test_attach_cgroup:FAIL:prog_attach unexpected error: -524 #48/2 cgrp_local_storage/attach_cgroup:FAIL test_recursion:PASS:skel_open_and_load 0 nsec libbpf: prog 'on_lookup': failed to attach: ERROR: strerror_r(-524)=22 libbpf: prog 'on_lookup': failed to auto-attach: -524 test_recursion:FAIL:skel_attach unexpected error: -524 (errno 524) #48/3 cgrp_local_storage/recursion:FAIL test_yes_rcu_lock:PASS:skel_open 0 nsec test_yes_rcu_lock:PASS:skel_load 0 nsec libbpf: prog 'yes_rcu_lock': failed to attach: ERROR: strerror_r(-524)=22 libbpf: prog 'yes_rcu_lock': failed to auto-attach: -524 test_yes_rcu_lock:FAIL:skel_attach unexpected error: -524 (errno 524) #48/6 cgrp_local_storage/yes_rcu_lock:FAIL #48 cgrp_local_storage:FAIL Summary: 0/4 PASSED, 0 SKIPPED, 1 FAILED No panics any more (The test still failed because lack of BPF trampoline which I am actively working on). Fixes: 5dc6155 ("LoongArch: Add BPF JIT support") Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

[ Upstream commit fe57575 ] The `cgrp_local_storage` test triggers a kernel panic like: # ./test_progs -t cgrp_local_storage Can't find bpf_testmod.ko kernel module: -2 WARNING! Selftests relying on bpf_testmod.ko will be skipped. [ 550.930632] CPU 1 Unable to handle kernel paging request at virtual address 0000000000000080, era == ffff80000200be34, ra == ffff80000200be00 [ 550.931781] Oops[#1]: [ 550.931966] CPU: 1 PID: 1303 Comm: test_progs Not tainted 6.7.0-rc2-loong-devel-g2f56bb0d2327 torvalds#35 a896aca3f4164f09cc346f89f2e09832e07be5f6 [ 550.932215] Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 2/2/2022 [ 550.932403] pc ffff80000200be34 ra ffff80000200be00 tp 9000000108350000 sp 9000000108353dc0 [ 550.932545] a0 0000000000000000 a1 0000000000000517 a2 0000000000000118 a3 00007ffffbb15558 [ 550.932682] a4 00007ffffbb15620 a5 90000001004e7700 a6 0000000000000021 a7 0000000000000118 [ 550.932824] t0 ffff80000200bdc0 t1 0000000000000517 t2 0000000000000517 t3 00007ffff1c06ee0 [ 550.932961] t4 0000555578ae04d0 t5 fffffffffffffff8 t6 0000000000000004 t7 0000000000000020 [ 550.933097] t8 0000000000000040 u0 00000000000007b8 s9 9000000108353e00 s0 90000001004e7700 [ 550.933241] s1 9000000004005000 s2 0000000000000001 s3 0000000000000000 s4 0000555555eb2ec8 [ 550.933379] s5 00007ffffbb15bb8 s6 00007ffff1dafd60 s7 000055555663f610 s8 00007ffff1db0050 [ 550.933520] ra: ffff80000200be00 bpf_prog_98f1b9e767be2a84_on_enter+0x40/0x200 [ 550.933911] ERA: ffff80000200be34 bpf_prog_98f1b9e767be2a84_on_enter+0x74/0x200 [ 550.934105] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE) [ 550.934596] PRMD: 00000004 (PPLV0 +PIE -PWE) [ 550.934712] EUEN: 00000003 (+FPE +SXE -ASXE -BTE) [ 550.934836] ECFG: 00071c1c (LIE=2-4,10-12 VS=7) [ 550.934976] ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0) [ 550.935097] BADV: 0000000000000080 [ 550.935181] PRID: 0014c010 (Loongson-64bit, Loongson-3A5000) [ 550.935291] Modules linked in: [ 550.935391] Process test_progs (pid: 1303, threadinfo=000000006c3b1c41, task=0000000061f84a55) [ 550.935643] Stack : 00007ffffbb15bb8 0000555555eb2ec8 0000000000000000 0000000000000001 [ 550.935844] 9000000004005000 ffff80001b864000 00007ffffbb15450 90000000029aa034 [ 550.935990] 0000000000000000 9000000108353ec0 0000000000000118 d07d9dfb09721a09 [ 550.936175] 0000000000000001 0000000000000000 9000000108353ec0 0000000000000118 [ 550.936314] 9000000101d46ad0 900000000290abf0 000055555663f610 0000000000000000 [ 550.936479] 0000000000000003 9000000108353ec0 00007ffffbb15450 90000000029d7288 [ 550.936635] 00007ffff1dafd60 000055555663f610 0000000000000000 0000000000000003 [ 550.936779] 9000000108353ec0 90000000035dd1f0 00007ffff1dafd58 9000000002841c5c [ 550.936939] 0000000000000119 0000555555eea5a8 00007ffff1d78780 00007ffffbb153e0 [ 550.937083] ffffffffffffffda 00007ffffbb15518 0000000000000040 00007ffffbb15558 [ 550.937224] ... [ 550.937299] Call Trace: [ 550.937521] [<ffff80000200be34>] bpf_prog_98f1b9e767be2a84_on_enter+0x74/0x200 [ 550.937910] [<90000000029aa034>] bpf_trace_run2+0x90/0x154 [ 550.938105] [<900000000290abf0>] syscall_trace_enter.isra.0+0x1cc/0x200 [ 550.938224] [<90000000035dd1f0>] do_syscall+0x48/0x94 [ 550.938319] [<9000000002841c5c>] handle_syscall+0xbc/0x158 [ 550.938477] [ 550.938607] Code: 580009ae 50016000 262402e4 <28c20085> 14092084 03a00084 16000024 03240084 00150006 [ 550.938851] [ 550.939021] ---[ end trace 0000000000000000 ]--- Further investigation shows that this panic is triggered by memory load operations: ptr = bpf_cgrp_storage_get(&map_a, task->cgroups->dfl_cgrp, 0, BPF_LOCAL_STORAGE_GET_F_CREATE); The expression `task->cgroups->dfl_cgrp` involves two memory load. Since the field offset fits in imm12 or imm14, we use ldd or ldptrd instructions. But both instructions have the side effect that it will signed-extended the imm operand. Finally, we got the wrong addresses and panics is inevitable. Use a generic ldxd instruction to avoid this kind of issues. With this change, we have: # ./test_progs -t cgrp_local_storage Can't find bpf_testmod.ko kernel module: -2 WARNING! Selftests relying on bpf_testmod.ko will be skipped. test_cgrp_local_storage:PASS:join_cgroup /cgrp_local_storage 0 nsec torvalds#48/1 cgrp_local_storage/tp_btf:OK test_attach_cgroup:PASS:skel_open 0 nsec test_attach_cgroup:PASS:prog_attach 0 nsec test_attach_cgroup:PASS:prog_attach 0 nsec libbpf: prog 'update_cookie_tracing': failed to attach: ERROR: strerror_r(-524)=22 test_attach_cgroup:FAIL:prog_attach unexpected error: -524 torvalds#48/2 cgrp_local_storage/attach_cgroup:FAIL test_recursion:PASS:skel_open_and_load 0 nsec libbpf: prog 'on_lookup': failed to attach: ERROR: strerror_r(-524)=22 libbpf: prog 'on_lookup': failed to auto-attach: -524 test_recursion:FAIL:skel_attach unexpected error: -524 (errno 524) torvalds#48/3 cgrp_local_storage/recursion:FAIL torvalds#48/4 cgrp_local_storage/negative:OK torvalds#48/5 cgrp_local_storage/cgroup_iter_sleepable:OK test_yes_rcu_lock:PASS:skel_open 0 nsec test_yes_rcu_lock:PASS:skel_load 0 nsec libbpf: prog 'yes_rcu_lock': failed to attach: ERROR: strerror_r(-524)=22 libbpf: prog 'yes_rcu_lock': failed to auto-attach: -524 test_yes_rcu_lock:FAIL:skel_attach unexpected error: -524 (errno 524) torvalds#48/6 cgrp_local_storage/yes_rcu_lock:FAIL torvalds#48/7 cgrp_local_storage/no_rcu_lock:OK torvalds#48 cgrp_local_storage:FAIL All error logs: test_cgrp_local_storage:PASS:join_cgroup /cgrp_local_storage 0 nsec test_attach_cgroup:PASS:skel_open 0 nsec test_attach_cgroup:PASS:prog_attach 0 nsec test_attach_cgroup:PASS:prog_attach 0 nsec libbpf: prog 'update_cookie_tracing': failed to attach: ERROR: strerror_r(-524)=22 test_attach_cgroup:FAIL:prog_attach unexpected error: -524 torvalds#48/2 cgrp_local_storage/attach_cgroup:FAIL test_recursion:PASS:skel_open_and_load 0 nsec libbpf: prog 'on_lookup': failed to attach: ERROR: strerror_r(-524)=22 libbpf: prog 'on_lookup': failed to auto-attach: -524 test_recursion:FAIL:skel_attach unexpected error: -524 (errno 524) torvalds#48/3 cgrp_local_storage/recursion:FAIL test_yes_rcu_lock:PASS:skel_open 0 nsec test_yes_rcu_lock:PASS:skel_load 0 nsec libbpf: prog 'yes_rcu_lock': failed to attach: ERROR: strerror_r(-524)=22 libbpf: prog 'yes_rcu_lock': failed to auto-attach: -524 test_yes_rcu_lock:FAIL:skel_attach unexpected error: -524 (errno 524) torvalds#48/6 cgrp_local_storage/yes_rcu_lock:FAIL torvalds#48 cgrp_local_storage:FAIL Summary: 0/4 PASSED, 0 SKIPPED, 1 FAILED No panics any more (The test still failed because lack of BPF trampoline which I am actively working on). Fixes: 5dc6155 ("LoongArch: Add BPF JIT support") Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Sasha Levin <sashal@kernel.org>

Expanding the test coverage from cgroup2 to include cgroup1. The result as follows, Already existing test cases for cgroup2: torvalds#48/1 cgrp_local_storage/tp_btf:OK torvalds#48/2 cgrp_local_storage/attach_cgroup:OK torvalds#48/3 cgrp_local_storage/recursion:OK torvalds#48/4 cgrp_local_storage/negative:OK torvalds#48/5 cgrp_local_storage/cgroup_iter_sleepable:OK torvalds#48/6 cgrp_local_storage/yes_rcu_lock:OK torvalds#48/7 cgrp_local_storage/no_rcu_lock:OK Expanded test cases for cgroup1: torvalds#48/8 cgrp_local_storage/cgrp1_tp_btf:OK torvalds#48/9 cgrp_local_storage/cgrp1_recursion:OK torvalds#48/10 cgrp_local_storage/cgrp1_negative:OK torvalds#48/11 cgrp_local_storage/cgrp1_iter_sleepable:OK torvalds#48/12 cgrp_local_storage/cgrp1_yes_rcu_lock:OK torvalds#48/13 cgrp_local_storage/cgrp1_no_rcu_lock:OK Summary: torvalds#48 cgrp_local_storage:OK Summary: 1/13 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20231206115326.4295-4-laoar.shao@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

scx_rusty: keep .bpf.o files for debugging

This reverts commit 664d650, reversing changes made to ee9077a.

Fix incorrect merge of torvalds#48

KSAN calls into rcu code which then triggers a write that reenters into KSAN getting the system stuck doing infinite recursion. #0 kmsan_get_context () at mm/kmsan/kmsan.h:106 #1 __msan_get_context_state () at mm/kmsan/instrumentation.c:331 #2 0xffffffff81495671 in get_current () at ./arch/x86/include/asm/current.h:42 #3 rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379 #4 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402 #5 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748 torvalds#6 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016 torvalds#7 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82 torvalds#8 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75 torvalds#9 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143 torvalds#10 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97 torvalds#11 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36 torvalds#12 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91 torvalds#13 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379 torvalds#14 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402 torvalds#15 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748 torvalds#16 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016 torvalds#17 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82 torvalds#18 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75 torvalds#19 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143 torvalds#20 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97 torvalds#21 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36 torvalds#22 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91 torvalds#23 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379 torvalds#24 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402 torvalds#25 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748 torvalds#26 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016 torvalds#27 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82 torvalds#28 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75 torvalds#29 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143 torvalds#30 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97 torvalds#31 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36 torvalds#32 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91 torvalds#33 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379 torvalds#34 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402 torvalds#35 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748 torvalds#36 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016 torvalds#37 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82 torvalds#38 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75 torvalds#39 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143 torvalds#40 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97 torvalds#41 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36 torvalds#42 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91 torvalds#43 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379 torvalds#44 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402 torvalds#45 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748 torvalds#46 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016 torvalds#47 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82 torvalds#48 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75 torvalds#49 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143 torvalds#50 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97 torvalds#51 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36 #52 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91 #53 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379 torvalds#54 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402 torvalds#55 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748 torvalds#56 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016 torvalds#57 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82 #58 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75 torvalds#59 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143 torvalds#60 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97 torvalds#61 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36 torvalds#62 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91 torvalds#63 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379 torvalds#64 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402 torvalds#65 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748 torvalds#66 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016 torvalds#67 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82 torvalds#68 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75 torvalds#69 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143 #70 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97 torvalds#71 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36 torvalds#72 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91 torvalds#73 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379 torvalds#74 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402 torvalds#75 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748 torvalds#76 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016 torvalds#77 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff86203c90) at ./arch/x86/include/asm/kmsan.h:82 torvalds#78 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff86203c90) at mm/kmsan/shadow.c:75 torvalds#79 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff86203c90, is_origin=false) at mm/kmsan/shadow.c:143 torvalds#80 kmsan_get_shadow_origin_ptr (address=0xffffffff86203c90, size=8, store=false) at mm/kmsan/shadow.c:97 torvalds#81 0xffffffff81b1dc72 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=8, store=false) at mm/kmsan/instrumentation.c:36 torvalds#82 __msan_metadata_ptr_for_load_8 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:92 torvalds#83 0xffffffff814fdb9e in filter_irq_stacks (entries=<optimized out>, nr_entries=4) at kernel/stacktrace.c:397 torvalds#84 0xffffffff829520e8 in stack_depot_save_flags (entries=0xffffffff8620d974 <init_task+1012>, nr_entries=4, alloc_flags=0, depot_flags=0) at lib/stackdepot.c:500 torvalds#85 0xffffffff81b1e560 in __msan_poison_alloca (address=0xffffffff86203da0, size=24, descr=<optimized out>) at mm/kmsan/instrumentation.c:285 torvalds#86 0xffffffff8562821c in _printk (fmt=0xffffffff85f191a5 "\0016Attempting lock1") at kernel/printk/printk.c:2324 torvalds#87 0xffffffff81942aa2 in kmem_cache_create_usercopy (name=0xffffffff85f18903 "mm_struct", size=1296, align=0, flags=270336, useroffset=<optimized out>, usersize=<optimized out>, ctor=0x0 <fixed_percpu_data>) at mm/slab_common.c:296 torvalds#88 0xffffffff86f337a0 in mm_cache_init () at kernel/fork.c:3262 torvalds#89 0xffffffff86eacb8e in start_kernel () at init/main.c:932 torvalds#90 0xffffffff86ecdf94 in x86_64_start_reservations (real_mode_data=0x140e0 <exception_stacks+28896> <error: Cannot access memory at address 0x140e0>) at arch/x86/kernel/head64.c:555 torvalds#91 0xffffffff86ecde9b in x86_64_start_kernel (real_mode_data=0x140e0 <exception_stacks+28896> <error: Cannot access memory at address 0x140e0>) at arch/x86/kernel/head64.c:536 torvalds#92 0xffffffff810001d3 in secondary_startup_64 () at /pool/workspace/linux/arch/x86/kernel/head_64.S:461 torvalds#93 0x0000000000000000 in ??

As of 5ec8e8e(mm/sparsemem: fix race in accessing memory_section->usage) KMSAN now calls into RCU tree code during kmsan_get_metadata. This will trigger a write that will reenter into KMSAN getting the system stuck doing infinite recursion. #0 kmsan_get_context () at mm/kmsan/kmsan.h:106 #1 __msan_get_context_state () at mm/kmsan/instrumentation.c:331 #2 0xffffffff81495671 in get_current () at ./arch/x86/include/asm/current.h:42 #3 rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379 #4 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402 #5 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748 torvalds#6 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016 torvalds#7 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82 torvalds#8 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75 torvalds#9 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143 torvalds#10 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97 torvalds#11 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36 torvalds#12 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91 torvalds#13 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379 torvalds#14 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402 torvalds#15 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748 torvalds#16 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016 torvalds#17 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82 torvalds#18 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75 torvalds#19 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143 torvalds#20 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97 torvalds#21 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36 torvalds#22 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91 torvalds#23 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379 torvalds#24 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402 torvalds#25 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748 torvalds#26 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016 torvalds#27 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82 torvalds#28 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75 torvalds#29 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143 torvalds#30 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97 torvalds#31 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36 torvalds#32 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91 torvalds#33 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379 torvalds#34 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402 torvalds#35 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748 torvalds#36 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016 torvalds#37 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82 torvalds#38 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75 torvalds#39 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143 torvalds#40 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97 torvalds#41 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36 torvalds#42 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91 torvalds#43 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379 torvalds#44 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402 torvalds#45 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748 torvalds#46 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016 torvalds#47 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82 torvalds#48 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75 torvalds#49 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143 torvalds#50 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97 torvalds#51 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36 #52 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91 #53 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379 torvalds#54 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402 torvalds#55 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748 torvalds#56 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016 torvalds#57 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82 #58 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75 torvalds#59 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143 torvalds#60 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97 torvalds#61 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36 torvalds#62 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91 torvalds#63 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379 torvalds#64 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402 torvalds#65 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748 torvalds#66 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016 torvalds#67 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82 torvalds#68 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75 torvalds#69 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143 #70 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97 torvalds#71 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36 torvalds#72 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91 torvalds#73 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379 torvalds#74 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402 torvalds#75 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748 torvalds#76 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016 torvalds#77 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff86203c90) at ./arch/x86/include/asm/kmsan.h:82 torvalds#78 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff86203c90) at mm/kmsan/shadow.c:75 torvalds#79 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff86203c90, is_origin=false) at mm/kmsan/shadow.c:143 torvalds#80 kmsan_get_shadow_origin_ptr (address=0xffffffff86203c90, size=8, store=false) at mm/kmsan/shadow.c:97 torvalds#81 0xffffffff81b1dc72 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=8, store=false) at mm/kmsan/instrumentation.c:36 torvalds#82 __msan_metadata_ptr_for_load_8 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:92 torvalds#83 0xffffffff814fdb9e in filter_irq_stacks (entries=<optimized out>, nr_entries=4) at kernel/stacktrace.c:397 torvalds#84 0xffffffff829520e8 in stack_depot_save_flags (entries=0xffffffff8620d974 <init_task+1012>, nr_entries=4, alloc_flags=0, depot_flags=0) at lib/stackdepot.c:500 torvalds#85 0xffffffff81b1e560 in __msan_poison_alloca (address=0xffffffff86203da0, size=24, descr=<optimized out>) at mm/kmsan/instrumentation.c:285 torvalds#86 0xffffffff8562821c in _printk (fmt=0xffffffff85f191a5 "\0016Attempting lock1") at kernel/printk/printk.c:2324 torvalds#87 0xffffffff81942aa2 in kmem_cache_create_usercopy (name=0xffffffff85f18903 "mm_struct", size=1296, align=0, flags=270336, useroffset=<optimized out>, usersize=<optimized out>, ctor=0x0 <fixed_percpu_data>) at mm/slab_common.c:296 torvalds#88 0xffffffff86f337a0 in mm_cache_init () at kernel/fork.c:3262 torvalds#89 0xffffffff86eacb8e in start_kernel () at init/main.c:932 torvalds#90 0xffffffff86ecdf94 in x86_64_start_reservations (real_mode_data=0x140e0 <exception_stacks+28896> <error: Cannot access memory at address 0x140e0>) at arch/x86/kernel/head64.c:555 torvalds#91 0xffffffff86ecde9b in x86_64_start_kernel (real_mode_data=0x140e0 <exception_stacks+28896> <error: Cannot access memory at address 0x140e0>) at arch/x86/kernel/head64.c:536 torvalds#92 0xffffffff810001d3 in secondary_startup_64 () at /pool/workspace/linux/arch/x86/kernel/head_64.S:461 torvalds#93 0x0000000000000000 in ??

Currently our IO accessors all use register addressing without offsets, but we could safely use offset addressing (without writeback) to simplify and optimize the generated code. To function correctly under a hypervisor which emulates IO accesses, we must ensure that any faulting/trapped IO access results in an ESR_ELx value with ESR_ELX.ISS.ISV=1 and with the tranfer register described in ESR_ELx.ISS.SRT. This means that we can only use loads/stores of a single general purpose register (or the zero register), and must avoid writeback addressing modes. However, we can use immediate offset addressing modes, as these still provide ESR_ELX.ISS.ISV=1 and a valid ESR_ELx.ISS.SRT when those accesses fault at Stage-2. Currently we only use register addressing without offsets. We use the "r" constraint to place the address into a register, and manually generate the register addressing by surrounding the resulting register operand with square braces, e.g. | static __always_inline void __raw_writeq(u64 val, volatile void __iomem *addr) | { | asm volatile("str %x0, [%1]" : : "rZ" (val), "r" (addr)); | } Due to this, sequences of adjacent accesses need to generate addresses using separate instructions. For example, the following code: | void writeq_zero_8_times(void *ptr) | { | writeq_relaxed(0, ptr + 8 * 0); | writeq_relaxed(0, ptr + 8 * 1); | writeq_relaxed(0, ptr + 8 * 2); | writeq_relaxed(0, ptr + 8 * 3); | writeq_relaxed(0, ptr + 8 * 4); | writeq_relaxed(0, ptr + 8 * 5); | writeq_relaxed(0, ptr + 8 * 6); | writeq_relaxed(0, ptr + 8 * 7); | } ... is compiled to: | <writeq_zero_8_times>: | str xzr, [x0] | add x1, x0, #0x8 | str xzr, [x1] | add x1, x0, #0x10 | str xzr, [x1] | add x1, x0, #0x18 | str xzr, [x1] | add x1, x0, #0x20 | str xzr, [x1] | add x1, x0, #0x28 | str xzr, [x1] | add x1, x0, #0x30 | str xzr, [x1] | add x0, x0, #0x38 | str xzr, [x0] | ret As described above, we could safely use immediate offset addressing, which would allow the ADDs to be folded into the address generation for the STRs, resulting in simpler and smaller generated assembly. We can do this by using the "o" constraint to allow the compiler to generate offset addressing (without writeback) for a memory operand, e.g. | static __always_inline void __raw_writeq(u64 val, volatile void __iomem *addr) | { | volatile u64 __iomem *ptr = addr; | asm volatile("str %x0, %1" : : "rZ" (val), "o" (*ptr)); | } ... which results in the earlier code sequence being compiled to: | <writeq_zero_8_times>: | str xzr, [x0] | str xzr, [x0, torvalds#8] | str xzr, [x0, torvalds#16] | str xzr, [x0, torvalds#24] | str xzr, [x0, torvalds#32] | str xzr, [x0, torvalds#40] | str xzr, [x0, torvalds#48] | str xzr, [x0, torvalds#56] | ret As Will notes at: https://lore.kernel.org/linux-arm-kernel/20240117160528.GA3398@willie-the-truck/ ... some compilers struggle with a plain "o" constraint, so it's preferable to use "Qo", where the additional "Q" constraint permits using non-offset register addressing. This patch modifies our IO write accessors to use "Qo" constraints, resulting in the better code generation described above. The IO read accessors are left as-is because ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE requires that non-offset register addressing is used, as the LDAR instruction does not support offset addressing. When compiling v6.8-rc1 defconfig with GCC 13.2.0, this saves ~4KiB of text: | [mark@lakrids:~/src/linux]% ls -al vmlinux-* | -rwxr-xr-x 1 mark mark 153960576 Jan 23 12:01 vmlinux-after | -rwxr-xr-x 1 mark mark 153862192 Jan 23 11:57 vmlinux-before | | [mark@lakrids:~/src/linux]% size vmlinux-before vmlinux-after | text data bss dec hex filename | 26708921 16690350 622736 44022007 29fb8f7 vmlinux-before | 26704761 16690414 622736 44017911 29fa8f7 vmlinux-after ... though due to internal alignment of sections, this has no impact on the size of the resulting Image: | [mark@lakrids:~/src/linux]% ls -al Image-* | -rw-r--r-- 1 mark mark 43590144 Jan 23 12:01 Image-after | -rw-r--r-- 1 mark mark 43590144 Jan 23 11:57 Image-before Aside from the better code generation, there should be no functional change as a result of this patch. I have lightly tested this patch, including booting under KVM (where some devices such as PL011 are emulated). Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240124111259.874975-1-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

Currently normal HugeTLB fault ends up crashing the kernel, as p4dp derived from p4d_offset() is an invalid address when PGTABLE_LEVEL = 5. A p4d level entry needs to be allocated when not available while walking the page table during HugeTLB faults. Let's call p4d_alloc() to allocate such entries when required instead of current p4d_offset(). Unable to handle kernel paging request at virtual address ffffffff80000000 Mem abort info: ESR = 0x0000000096000005 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x05: level 1 translation fault Data abort info: ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000 CM = 0, WnR = 0, TnD = 0, TagAccess = 0 GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 swapper pgtable: 4k pages, 52-bit VAs, pgdp=0000000081da9000 [ffffffff80000000] pgd=1000000082cec003, p4d=0000000082c32003, pud=0000000000000000 Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP Modules linked in: CPU: 1 PID: 108 Comm: high_addr_hugep Not tainted 6.9.0-rc4 torvalds#48 Hardware name: Foundation-v8A (DT) pstate: 01402005 (nzcv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) pc : huge_pte_alloc+0xd4/0x334 lr : hugetlb_fault+0x1b8/0xc68 sp : ffff8000833bbc20 x29: ffff8000833bbc20 x28: fff000080080cb58 x27: ffff800082a7cc58 x26: 0000000000000000 x25: fff0000800378e40 x24: fff00008008d6c60 x23: 00000000de9dbf07 x22: fff0000800378e40 x21: 0004000000000000 x20: 0004000000000000 x19: ffffffff80000000 x18: 1ffe00010011d7a1 x17: 0000000000000001 x16: ffffffffffffffff x15: 0000000000000001 x14: 0000000000000000 x13: ffff8000816120d0 x12: ffffffffffffffff x11: 0000000000000000 x10: fff00008008ebd0c x9 : 0004000000000000 x8 : 0000000000001255 x7 : fff00008003e2000 x6 : 00000000061d54b0 x5 : 0000000000001000 x4 : ffffffff80000000 x3 : 0000000000200000 x2 : 0000000000000004 x1 : 0000000080000000 x0 : 0000000000000000 Call trace: huge_pte_alloc+0xd4/0x334 hugetlb_fault+0x1b8/0xc68 handle_mm_fault+0x260/0x29c do_page_fault+0xfc/0x47c do_translation_fault+0x68/0x74 do_mem_abort+0x44/0x94 el0_da+0x2c/0x9c el0t_64_sync_handler+0x70/0xc4 el0t_64_sync+0x190/0x194 Code: aa000084 cb010084 b24c2c84 8b130c93 (f9400260) ---[ end trace 0000000000000000 ]--- Cc: Will Deacon <will@kernel.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Fixes: a6bbf5d ("arm64: mm: Add definitions to support 5 levels of paging") Reported-by: Dev Jain <dev.jain@arm.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Link: https://lore.kernel.org/r/20240415094003.1812018-1-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

The size of mcom03_hsperiph_conf_items array must be the same as 'num_custom_params' field of struct pinctrl_desc. Reading from file pinconf-groups in mcom03-hsperiph-pinctrl driver leads to this segmentation fault error: [ 46.250876] Unable to handle kernel paging request at virtual address 0000000100000080 [ 46.259775] Mem abort info: [ 46.262895] ESR = 0x96000005 [ 46.266308] EC = 0x25: DABT (current EL), IL = 32 bits [ 46.272363] SET = 0, FnV = 0 [ 46.275776] EA = 0, S1PTW = 0 [ 46.279307] Data abort info: [ 46.282526] ISV = 0, ISS = 0x00000005 [ 46.286810] CM = 0, WnR = 0 [ 46.290139] user pgtable: 4k pages, 39-bit VAs, pgdp=0000000906c34000 [ 46.297337] [0000000100000080] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000 [ 46.307092] Internal error: Oops: 96000005 [#1] SMP [ 46.312541] Modules linked in: snd_soc_hdmi_codec motorcomm cfg80211 ahci mve_base(O) libahci mfbsp_can libata elcore50(O) can_dev mve_rsrc(O) elcore50_mmualloc(O) lontiume [ 46.360904] CPU: 3 PID: 261 Comm: cat Tainted: G O 5.10.179 torvalds#48 [ 46.368968] Hardware name: ELV-MC03-SMARC r2.6.1, ELV-SMARC-CB r2.10.3 (DT) [ 46.376745] pstate: 40000005 (nZcv daif -PAN -UAO -TCO BTYPE=--) [ 46.383465] pc : __pi_strlen+0x10/0x84 [ 46.387648] lr : seq_puts+0x2c/0x90 [ 46.391539] sp : ffffffc011c1bb20 [ 46.395237] x29: ffffffc011c1bb20 x28: ffffffc010d82c48 [ 46.401172] x27: 0000000000000000 x26: 0000000000000001 [ 46.407107] x25: ffffff8080a68000 x24: ffffffc011c1bc14 [ 46.413033] x23: 0000000000000002 x22: ffffff8082243078 [ 46.418966] x21: ffffffc010d83ca0 x20: 0000000100000080 [ 46.424900] x19: ffffff8082243078 x18: 0000000000000000 [ 46.430834] x17: 0000000000000000 x16: 0000000000000000 [ 46.436768] x15: ffffff8081e75830 x14: ffffffffffffffff [ 46.442702] x13: ffffff8086ee4000 x12: 0000000000000010 [ 46.448636] x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f [ 46.454571] x9 : ffffffc01053a7a8 x8 : 7f7f7f7f7f7f7f7f [ 46.460496] x7 : 0000000000000000 x6 : ffffff8086ee30cd [ 46.466430] x5 : 8080808080800000 x4 : 0000000000000000 [ 46.472364] x3 : 000000000000202c x2 : 00000000000000cd [ 46.478296] x1 : 0000000100000080 x0 : 0000000100000080 [ 46.484231] Call trace: [ 46.486965] __pi_strlen+0x10/0x84 [ 46.490766] pinconf_generic_dump_one+0x108/0x1e0 [ 46.496018] pinconf_generic_dump_pins+0xbc/0xe0 [ 46.501173] pinconf_groups_show+0xbc/0x110 [ 46.505843] seq_read_iter+0x1bc/0x460 [ 46.510027] seq_read+0xe4/0x140 [ 46.513631] full_proxy_read+0x68/0xc0 [ 46.517817] vfs_read+0xb4/0x1e0 [ 46.521418] ksys_read+0x74/0x110 [ 46.525117] __arm64_sys_read+0x24/0x30 [ 46.529403] do_el0_svc+0x8c/0x190 [ 46.533205] el0_svc+0x20/0x30 [ 46.536613] el0_sync_handler+0xb0/0xe0 [ 46.540895] el0_sync+0x180/0x1c0 [ 46.544600] Code: b200c3eb 927cec01 f2400c07 54000261 (a8c10c22) [ 46.551408] ---[ end trace ba437b61e1cf0ec6 ]--- Segmentation fault Issue: #MCOM03SW-2517 Change-Id: I669d3330e27a7f68fd71b030a32a47187508f263 Signed-off-by: Omar Al-Wadi <oalwadi@elvees.com>

This is to allow copying into the buffer from the application without the need to copy in ring context (and with that, the need that the ring task is active in kernel space). Also absolutely needed for now to avoid this teardown issue 1525.905504] KASAN: null-ptr-deref in range [0x00000000000001a0-0x00000000000001a7] [ 1525.910431] CPU: 15 PID: 183 Comm: kworker/15:1 Tainted: G O 6.10.0+ torvalds#48 [ 1525.916449] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 1525.922470] Workqueue: events io_fallback_req_func [ 1525.925840] RIP: 0010:__lock_acquire+0x74/0x7b80 [ 1525.929010] Code: 89 bc 24 80 00 00 00 0f 85 1c 5f 00 00 83 3d 6e 80 b0 02 00 0f 84 1d 12 00 00 83 3d 65 c7 67 02 00 74 27 48 89 f8 48 c1 e8 03 <42> 80 3c 30 00 74 0d e8 50 44 42 00 48 8b bc 24 80 00 00 00 48 c7 [ 1525.942211] RSP: 0018:ffff88810b2af490 EFLAGS: 00010002 [ 1525.945672] RAX: 0000000000000034 RBX: 0000000000000000 RCX: 0000000000000001 [ 1525.950421] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000000001a0 [ 1525.955200] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000 [ 1525.959979] R10: dffffc0000000000 R11: fffffbfff07b1cbe R12: 0000000000000000 [ 1525.964252] R13: 0000000000000001 R14: dffffc0000000000 R15: 0000000000000001 [ 1525.968225] FS: 0000000000000000(0000) GS:ffff88875b200000(0000) knlGS:0000000000000000 [ 1525.973932] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1525.976694] CR2: 00005555b6a381f0 CR3: 000000012f5f1000 CR4: 00000000000006f0 [ 1525.980030] Call Trace: [ 1525.981371] <TASK> [ 1525.982567] ? __die_body+0x66/0xb0 [ 1525.984376] ? die_addr+0xc1/0x100 [ 1525.986111] ? exc_general_protection+0x1c6/0x330 [ 1525.988401] ? asm_exc_general_protection+0x22/0x30 [ 1525.990864] ? __lock_acquire+0x74/0x7b80 [ 1525.992901] ? mark_lock+0x9f/0x360 [ 1525.994635] ? __lock_acquire+0x1420/0x7b80 [ 1525.996629] ? attach_entity_load_avg+0x47d/0x550 [ 1525.998765] ? hlock_conflict+0x5a/0x1f0 [ 1526.000515] ? __bfs+0x2dc/0x5a0 [ 1526.001993] lock_acquire+0x1fb/0x3d0 [ 1526.004727] ? gup_fast_fallback+0x13f/0x1d80 [ 1526.006586] ? gup_fast_fallback+0x13f/0x1d80 [ 1526.008412] gup_fast_fallback+0x158/0x1d80 [ 1526.010170] ? gup_fast_fallback+0x13f/0x1d80 [ 1526.011999] ? __lock_acquire+0x2b07/0x7b80 [ 1526.013793] __iov_iter_get_pages_alloc+0x36e/0x980 [ 1526.015876] ? do_raw_spin_unlock+0x5a/0x8a0 [ 1526.017734] iov_iter_get_pages2+0x56/0x70 [ 1526.019491] fuse_copy_fill+0x48e/0x980 [fuse] [ 1526.021400] fuse_copy_args+0x174/0x6a0 [fuse] [ 1526.023199] fuse_uring_prepare_send+0x319/0x6c0 [fuse] [ 1526.025178] fuse_uring_send_req_in_task+0x42/0x100 [fuse] [ 1526.027163] io_fallback_req_func+0xb4/0x170 [ 1526.028737] ? process_scheduled_works+0x75b/0x1160 [ 1526.030445] process_scheduled_works+0x85c/0x1160 [ 1526.032073] worker_thread+0x8ba/0xce0 [ 1526.033388] kthread+0x23e/0x2b0 [ 1526.035404] ? pr_cont_work_flush+0x290/0x290 [ 1526.036958] ? kthread_blkcg+0xa0/0xa0 [ 1526.038321] ret_from_fork+0x30/0x60 [ 1526.039600] ? kthread_blkcg+0xa0/0xa0 [ 1526.040942] ret_from_fork_asm+0x11/0x20 [ 1526.042353] </TASK>

This is to allow copying into the buffer from the application without the need to copy in ring context (and with that, the need that the ring task is active in kernel space). Also absolutely needed for now to avoid this teardown issue 1525.905504] KASAN: null-ptr-deref in range [0x00000000000001a0-0x00000000000001a7] [ 1525.910431] CPU: 15 PID: 183 Comm: kworker/15:1 Tainted: G O 6.10.0+ torvalds#48 [ 1525.916449] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 1525.922470] Workqueue: events io_fallback_req_func [ 1525.925840] RIP: 0010:__lock_acquire+0x74/0x7b80 [ 1525.929010] Code: 89 bc 24 80 00 00 00 0f 85 1c 5f 00 00 83 3d 6e 80 b0 02 00 0f 84 1d 12 00 00 83 3d 65 c7 67 02 00 74 27 48 89 f8 48 c1 e8 03 <42> 80 3c 30 00 74 0d e8 50 44 42 00 48 8b bc 24 80 00 00 00 48 c7 [ 1525.942211] RSP: 0018:ffff88810b2af490 EFLAGS: 00010002 [ 1525.945672] RAX: 0000000000000034 RBX: 0000000000000000 RCX: 0000000000000001 [ 1525.950421] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000000001a0 [ 1525.955200] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000 [ 1525.959979] R10: dffffc0000000000 R11: fffffbfff07b1cbe R12: 0000000000000000 [ 1525.964252] R13: 0000000000000001 R14: dffffc0000000000 R15: 0000000000000001 [ 1525.968225] FS: 0000000000000000(0000) GS:ffff88875b200000(0000) knlGS:0000000000000000 [ 1525.973932] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1525.976694] CR2: 00005555b6a381f0 CR3: 000000012f5f1000 CR4: 00000000000006f0 [ 1525.980030] Call Trace: [ 1525.981371] <TASK> [ 1525.982567] ? __die_body+0x66/0xb0 [ 1525.984376] ? die_addr+0xc1/0x100 [ 1525.986111] ? exc_general_protection+0x1c6/0x330 [ 1525.988401] ? asm_exc_general_protection+0x22/0x30 [ 1525.990864] ? __lock_acquire+0x74/0x7b80 [ 1525.992901] ? mark_lock+0x9f/0x360 [ 1525.994635] ? __lock_acquire+0x1420/0x7b80 [ 1525.996629] ? attach_entity_load_avg+0x47d/0x550 [ 1525.998765] ? hlock_conflict+0x5a/0x1f0 [ 1526.000515] ? __bfs+0x2dc/0x5a0 [ 1526.001993] lock_acquire+0x1fb/0x3d0 [ 1526.004727] ? gup_fast_fallback+0x13f/0x1d80 [ 1526.006586] ? gup_fast_fallback+0x13f/0x1d80 [ 1526.008412] gup_fast_fallback+0x158/0x1d80 [ 1526.010170] ? gup_fast_fallback+0x13f/0x1d80 [ 1526.011999] ? __lock_acquire+0x2b07/0x7b80 [ 1526.013793] __iov_iter_get_pages_alloc+0x36e/0x980 [ 1526.015876] ? do_raw_spin_unlock+0x5a/0x8a0 [ 1526.017734] iov_iter_get_pages2+0x56/0x70 [ 1526.019491] fuse_copy_fill+0x48e/0x980 [fuse] [ 1526.021400] fuse_copy_args+0x174/0x6a0 [fuse] [ 1526.023199] fuse_uring_prepare_send+0x319/0x6c0 [fuse] [ 1526.025178] fuse_uring_send_req_in_task+0x42/0x100 [fuse] [ 1526.027163] io_fallback_req_func+0xb4/0x170 [ 1526.028737] ? process_scheduled_works+0x75b/0x1160 [ 1526.030445] process_scheduled_works+0x85c/0x1160 [ 1526.032073] worker_thread+0x8ba/0xce0 [ 1526.033388] kthread+0x23e/0x2b0 [ 1526.035404] ? pr_cont_work_flush+0x290/0x290 [ 1526.036958] ? kthread_blkcg+0xa0/0xa0 [ 1526.038321] ret_from_fork+0x30/0x60 [ 1526.039600] ? kthread_blkcg+0xa0/0xa0 [ 1526.040942] ret_from_fork_asm+0x11/0x20 [ 1526.042353] </TASK> Signed-off-by: Bernd Schubert <bschubert@ddn.com>

noelbk added 3 commits September 10, 2013 17:30

added periodictimer to mrp to allow retries when packets get lost

e2fdc26

cleaned up debug prints before committing back to master

2bc08ce

clean up before pull request

5a704fa

noelbk closed this Feb 6, 2015

aejsmith pushed a commit to aejsmith/linux that referenced this pull request Jul 20, 2015

Merge pull request torvalds#48 from mpredfearn/ci20-v3.18-hdmi

c5c5b90

Ci20 v3.18 Add audio over hdmi + mono sound fixes

logic10492 pushed a commit to logic10492/linux-amd-zen2 that referenced this pull request Jan 18, 2024

Merge pull request torvalds#48 from sched-ext/rusty-keep-bpf-o

664d650

scx_rusty: keep .bpf.o files for debugging

logic10492 pushed a commit to logic10492/linux-amd-zen2 that referenced this pull request Jan 18, 2024

Revert "Merge pull request torvalds#48 from sched-ext/rusty-keep-bpf-o"

997c450

This reverts commit 664d650, reversing changes made to ee9077a.

logic10492 pushed a commit to logic10492/linux-amd-zen2 that referenced this pull request Jan 18, 2024

Merge pull request torvalds#49 from sched-ext/rusty-keep-bpf-o

2a3532d

Fix incorrect merge of torvalds#48

ziqiaozhou added a commit to ziqiaozhou/linux that referenced this pull request Apr 29, 2024

Support doorbell APIC EOI (torvalds#48)

6533af5

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add periodictimer to net/MRP to enable retries #48

Add periodictimer to net/MRP to enable retries #48

noelbk commented Sep 16, 2013

torvalds commented Sep 16, 2013

noelbk commented Sep 16, 2013

Add periodictimer to net/MRP to enable retries #48

Add periodictimer to net/MRP to enable retries #48

Conversation

noelbk commented Sep 16, 2013

torvalds commented Sep 16, 2013

noelbk commented Sep 16, 2013

Cheers,