New subflows send MP_JOIN to port 0 #63

matttbe · 2020-07-22T12:10:17Z

More information is coming soon, but we can see in different setups that MP_JOIN is sent to port 0:

12:02:25.985960 IP 10.0.1.2.8000 > 10.0.1.1.59574: Flags [.], ack 78, win 509, options [nop,nop,TS val 58365988 ecr 260327143,mptcp add-addr[bad opt]>
12:02:25.986311 IP 10.0.2.1.46531 > 10.0.2.2.0: Flags [S], seq 1204877313, win 64240, options [mss 1460,sackOK,TS val 2027756336 ecr 0,nop,wscale 7,mptcp join backup id 0 token 0xc777f628 nonce 0xacb9a776], length 0
12:02:25.986322 IP 10.0.2.2.0 > 10.0.2.1.46531: Flags [R.], seq 0, ack 1204877314, win 0, length 0

Or even with packetdrill:

root@(none):/opt/packetdrill/gtests/net/mptcp/mp_join# tcpdump -i any -n -c 20 tcp &
[1] 260
root@(none):/opt/packetdrill/gtests/net/mptcp/mp_join# tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes

root@(none):/opt/packetdrill/gtests/net/mptcp/mp_join# ../../packetdrill/packetdrill -vvv mp_join_client.pkt 
socket syscall: 1594826109.716840
setsockopt syscall: 1594826109.719996
fcntl syscall: 1594826109.723111
fcntl syscall: 1594826109.725905
connect syscall: 1594826109.729016
outbound sniffed packet:  0.089384 S 3824942547:3824942547(0) win 65535 <mss 1460,sackOK,TS val 978433635 ecr 0,nop,wscale 8,mp_capable v1 flags: |H| >
inbound injected packet:  0.102111 S. 0:0(0) ack 3824942548 win 65535 <mss 1460,sackOK,TS val 4074410674 ecr 978433635,nop,wscale 8,mp_capable v1 flags: |H| sender_key: 2>
outbound sniffed packet:  0.112000 . 3824942548:3824942548(0) ack 1 win 256 <nop,nop,TS val 978433658 ecr 4074410674,mp_capable v1 flags: |H| sender_key: 15724967926438798442 receiver_key: 2>
15:15:09.728921 IP 192.168.228.105.47794 > 192.0.2.1.8080: Flags [S], seq 3824942547, win 65535, options [mss 1460,sackOK,TS val 978433635 ecr 0,nop,wscale 8,mptcp capable[bad opt]>
15:15:09.751397 IP 192.0.2.1.8080 > 192.168.228.105.47794: Flags [S.], seq 0, ack 3824942548, win 65535, options [mss 1460,sackOK,TS val 4074410674 ecr 978433635,nop,wscale 8,mptcp capable Unknown Version (1)], length 0
15:15:09.751537 IP 192.168.228.105.47794 > 192.0.2.1.8080: Flags [.], ack 1, win 256, options [nop,nop,TS val 978433658 ecr 4074410674,mptcp capable Unknown Version (1)], length 0
getsockopt syscall: 1594826109.963115
fcntl syscall: 1594826109.966067
15:15:10.968919 IP 192.168.228.105.47794 > 192.0.2.1.8080: Flags [P.], seq 1:3, ack 1, win 256, options [nop,nop,TS val 978434875 ecr 4074410674,mptcp capable[bad opt]>
write syscall: 1594826110.979351
outbound sniffed packet:  1.329382 P. 3824942548:3824942550(2) ack 1 win 256 <nop,nop,TS val 978434875 ecr 4074410674,mp_capable v1 flags: |H| sender_key: 15724967926438798442 receiver_key: 2 mpcdatalen=2,nop,nop>
inbound injected packet:  1.356829 . 1:1(0) ack 3824942550 win 256 <nop,nop,TS val 4074418293 ecr 978434875,add_address address_id: 1 ipv4: 192.0.2.2 hmac: 18175360766677029581,dss dack8 9168515192191584501 flags: Aa>
15:15:11.009981 IP 192.0.2.1.8080 > 192.168.228.105.47794: Flags [.], ack 3, win 256, options [nop,nop,TS val 4074418293 ecr 978434875,mptcp add-addr[bad opt]>
15:15:11.019680 IP 192.168.228.105.35147 > 192.0.2.2.0: Flags [S], seq 2261540031, win 65535, options [mss 1460,sackOK,TS val 3362971286 ecr 0,nop,wscale 8,mptcp join backup id 0 token 0xd86e8112 nonce 0x91218654], length 0
15:15:12.059316 IP 192.168.228.105.35147 > 192.0.2.2.0: Flags [S], seq 2261540031, win 65535, options [mss 1460,sackOK,TS val 3362972326 ecr 0,nop,wscale 8,mptcp join backup id 0 token 0xd86e8112 nonce 0x91218654], length 0
15:15:14.107395 IP 192.168.228.105.35147 > 192.0.2.2.0: Flags [S], seq 2261540031, win 65535, options [mss 1460,sackOK,TS val 3362974374 ecr 0,nop,wscale 8,mptcp join backup id 0 token 0xd86e8112 nonce 0x91218654], length 0
15:15:18.139331 IP 192.168.228.105.35147 > 192.0.2.2.0: Flags [S], seq 2261540031, win 65535, options [mss 1460,sackOK,TS val 3362978406 ecr 0,nop,wscale 8,mptcp join backup id 0 token 0xd86e8112 nonce 0x91218654], length 0

(Note: the tcpdump version used here is old and does not support MPTCPv1.)
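To spot the same symptom in a longer capture, the text output of tcpdump can be filtered for SYN segments whose destination port is 0. This is only a minimal sketch, assuming the `tcpdump -n` line layout shown in the traces above; the function name `find_port0_syn` is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print tcpdump text lines that are plain SYNs sent to TCP port 0.
# Assumes the `tcpdump -n` layout seen above:
#   <time> IP <src>.<sport> > <dst>.<dport>: Flags [S], ...
find_port0_syn() {
    awk '$2 == "IP" && $4 == ">" && $5 ~ /\.0:$/ && $6 == "Flags" && $7 ~ /^\[S\],?$/ { print }'
}
```

It reads from standard input, so it can be placed at the end of a pipeline such as `tcpdump -n -i any tcp | find_port0_syn`. The RST replies coming back from port 0 are not printed, because only the destination field and the `[S]` flag are checked.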


nrybowski · 2020-07-22T14:17:32Z

Here is a minimal setup to reproduce this bug (mptcp-tools is expected to be in the same folder as this script, with the use_mptcp tool compiled):

#! /bin/bash

setup_iface() {
    ns="$4"
    ns_exec="ip netns exec $ns"

    ip l set "veth$1" netns "$ns"
    $ns_exec ip l set dev veth"$1" up  
    $ns_exec ip a add dev veth"$1" 10.0."$2"."$3"/24
}

cgroups=("client" "server")
use_mptcp="./mptcp-tools/use_mptcp/use_mptcp.sh"

ip l add veth1 type veth peer name veth2
ip l add veth3 type veth peer name veth4

i=1
for cgroup in ${cgroups[@]}
do
    ns_name=ns_$cgroup
    ns_exec="ip netns exec $ns_name"
    ip netns list | grep $ns_name > /dev/null
    if [ $? -eq 1 ]
    then
        ip netns add $ns_name 

        setup_iface "$i" "1" "$i" "$ns_name" 
        setup_iface "$((i+2))" "2" "$i" "$ns_name"
        $ns_exec ip mptcp endpoint flush
        $ns_exec ip mptcp limits set add_addr_accepted 2 subflows 2
    fi

    if [ "${cgroup}" = "server" ]
    then
            addrs=$($ns_exec ip a | grep inet | sed -e 's/inet[6]*//g' -e 's/fe80.*$//g' -e 's/\/24.*$//g')
            echo "${addrs[@]}"
            #for addr in ${addrs[@]}
            #do
            #   ${ns_exec} ip mptcp endpoint add ${addr} signal
            #done
            ${ns_exec} ip mptcp endpoint add ${addrs[0]} signal
            ${ns_exec} ${use_mptcp} python3 -m http.server &
            #${ns_exec} tc qdisc add dev veth$i root netem delay 1000ms 
    fi

    ((i++))

done

In shell 1, run: ip netns exec ns_server tcpdump -ni any tcp
In shell 2, run: ip netns exec ns_client mptcp-tools/use_mptcp/use_mptcp.sh curl 10.0.1.2:8000 -o /dev/null

From the tcpdump output:

[...]
14:08:32.077470 IP 10.0.1.2.8000 > 10.0.1.1.41760: Flags [.], ack 78, win 509, options [nop,nop,TS val 1413864222 ecr 3687490168,mptcp add-addr[bad opt]>
14:08:32.077841 IP 10.0.2.1.43719 > 10.0.2.2.0: Flags [S], seq 2048241466, win 64240, options [mss 1460,sackOK,TS val 2245203070 ecr 0,nop,wscale 7,mptcp join backup id 0 token 0xe131e65a nonce 0x763548b7], length 0
14:08:32.077850 IP 10.0.2.2.0 > 10.0.2.1.43719: Flags [R.], seq 0, ack 2048241467, win 0, length 0
[...]

I'm not sure whether "endpoint add ... signal" has to be called for both addresses of ns_server, but when I tried it (the commented-out for loop in the script above), I got the same bug, this time on the first interface:

[...]
14:11:47.055974 IP 10.0.1.2.8000 > 10.0.1.1.41762: Flags [.], ack 78, win 509, options [nop,nop,TS val 1414059201 ecr 3687685147,mptcp add-addr[bad opt]>
14:11:47.056372 IP 10.0.1.1.41485 > 10.0.1.2.0: Flags [S], seq 3336345692, win 64240, options [mss 1460,sackOK,TS val 3687685147 ecr 0,nop,wscale 7,mptcp join backup id 0 token 0x27143ab nonce 0xf3a22c5c], length 0
14:11:47.056381 IP 10.0.1.2.0 > 10.0.1.1.41485: Flags [R.], seq 0, ack 3336345693, win 0, length 0
[...]

Tested on commit eeb8340.
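A side note on the script above: `addrs=$(...)` stores one string rather than an array, so `${addrs[0]}` expands to every collected address at once instead of only the first one. The following is a sketch of one way to collect the addresses into a real bash array, assuming the one-line-per-address output of `ip -o -4 addr show`; the helper name `list_ipv4_addrs` is hypothetical and not part of mptcp-tools.

```shell
#!/bin/bash
# Hypothetical helper: read `ip -o -4 addr show` output on stdin and print
# one IPv4 address per line, with the prefix length stripped.
list_ipv4_addrs() {
    awk '$3 == "inet" { sub(/\/.*/, "", $4); print $4 }'
}

# Usage sketch (needs root and the ns_server namespace created by the script above):
#   mapfile -t addrs < <(ip netns exec ns_server ip -o -4 addr show scope global | list_ipv4_addrs)
#   for addr in "${addrs[@]}"; do
#       ip netns exec ns_server ip mptcp endpoint add "$addr" signal
#   done
```

With an array, `"${addrs[0]}"` is a single address and the loop signals each address separately. This only tidies up the reproducer; it is not expected to change the port 0 behavior reported here.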

matttbe · 2020-07-27T15:37:55Z

Arf, I forgot to add "Closes #63" in the commit message of my last patch.

This is fixed in the export branch. The fix has been sent to netdev for the -net branch.


matttbe added the bug label Jul 22, 2020

nrybowski mentioned this issue Jul 22, 2020

Kernel crash when using pm_nl_ctl and use_mptcp.sh #64

Closed

matttbe closed this as completed Jul 27, 2020

dcaratti pushed a commit to dcaratti/mptcp_net-next that referenced this issue Sep 2, 2021

Merge pull request multipath-tcp#63 from namjaejeon/cifsd-for-next

a8ab529

ksmbd-fixes
