Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sync up with Linus #28

Merged
merged 27 commits into from
Jan 29, 2015
Merged

Sync up with Linus #28

merged 27 commits into from
Jan 29, 2015

Conversation

dabrace
Copy link
Owner

@dabrace dabrace commented Jan 29, 2015

No description provided.

Aurelien BOUIN and others added 27 commits December 30, 2014 11:29
The xDC field should have 5 bit width according to Reference Manual.
Thus this patch fixes it.

Signed-off-by: Aurelien BOUIN <a_bouin@yahoo.fr>
Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
ts3a227e does not trigger interrupt to report jack status when system
boots from warm reset because ts3a227e's power remains on during warm reset.
Read jack status at probe to get current jack status.
Note that if system boots from EC reset, then this issue will not
happen.

Signed-off-by: Cheng-Yi Chiang <cychiang@chromium.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
…cate

The offset of block allocator, ba->offset, should not be changed during
fixed address allocating, for the caller may treat it as the offset of
allocated memory and use it. In the case that we allocate more than 1
blocks, we should make sure this offset is correct.

Here introduces a temp allocator for the later continuous allocating.

Signed-off-by: Jie Yang <yang.jie@intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
The probe routine was disabling the clock even
if the system was configured successfully.  Add
a return statement to leave clocks enabled.

Signed-off-by: Andrew Jackson <Andrew.Jackson@arm.com>
Acked-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
wm8960 codec can't support sample rate 11250, it must be 11025.

Signed-off-by: Zidan Wang <b50113@freescale.com>
Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
We should not send IPC stream commands to FW when the stream is
NULL, dereference the NULL pointer may also occur without precheck.
Here add NULL pointer checks for these stream APIs.

Signed-off-by: Jie Yang <yang.jie@intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
This patch applys rate symmetry for rockchip i2s DAI.

Signed-off-by: Jianqun Xu <jay.xu@rock-chips.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Commit 2ffa531 ("ASoC: fsl_ssi: Fix module unbound") changed the way to
retrieve the irq number from irq_of_parse_and_map() to platform_get_irq(), but
missed to updated the irq error check accordingly.

We should test for negative irq number and propagate it in the case of error.

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
The patch modified the behavior that updates the PLL parameter. It set the
update bit before the PLL power up.

Signed-off-by: Oder Chiou <oder_chiou@realtek.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
The DSP programs are listed out of order.

Signed-off-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
The following crash happens when trying to unload the snd_soc_imx_wm8962 module
while playback is active:

[  208.666868] Unable to handle kernel paging request at virtc
[  208.674110] pgd = 80004000
[  208.676867] [7f06541c] *pgd=4c334811, *pte=00000000, *ppte=00000000
[  208.683211] Internal error: Oops: 80000007 [#1] SMP ARM
[  208.688445] Modules linked in: snd_soc_wm8962 snd_soc_fsl_ssi snd_soc_imx_audmux imx_pcm_fiq evbug]
...

In order to avoid such problem, fill the card owner field as suggested by
Lars-Peter Clausen:

"But looking at the source it seems that this is a core feature of ALSA and at
least for the card module itself it will do the ref-counting when a stream is
started/stopped. And we even support setting the owner of a card in ASoC.
It's just that pretty much no ASoC card driver bothers to set the owner field
in the snd_soc_card struct. So this particular problem can be fixed by updating
the imx-wm8962 driver to set the owner field."

By doing as suggested, we no longer see the crash when attempting to unload the
snd_soc_imx_wm8962 module while playback is active:

$ modprobe -r snd_soc_imx_wm8962
modprobe: can't unload module snd_soc_imx_wm8962: Resource temporarily
unavailable

Reported-by: Jiada Wang <jiada_wang@mentor.com>
Suggested-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
If asoc_simple_card_probe() fails, asoc_simple_card_unref() may be
called before dev_set_drvdata(), causing a NULL pointer dereference in
asoc_simple_card_unref():

    Unable to handle kernel NULL pointer dereference at virtual address 000000d4
    ...
    PC is at asoc_simple_card_unref+0x14/0x48
    LR is at asoc_simple_card_probe+0x3d4/0x40c

This typically happens because asoc_simple_card_parse_of() returns
-EPROBE_DEFER, but other failure modes are possible.
devm_snd_soc_register_card()/snd_soc_register_card() may fail either
before or after dev_set_drvdata().

Pass a snd_soc_card pointer instead of a platform_device pointer to
asoc_simple_card_unref() to fix this.

Note that if CONFIG_OF_DYNAMIC=n, of_node_put() is a dummy, and gcc may
optimize away the loop over card->dai_link, never actually dereferencing
card, and thus avoiding the crash...

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Fixes: e512e00 ("ASoC: simple-card: Fix the reference count of device nodes")
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Correct the path name for mux to get rid of the following warning:

--->8---
wm8904 1-001a: Control not supported for path ADCL -> [Left] -> AIFOUTL
wm8904 1-001a: ASoC: no dapm match for ADCL --> Left --> AIFOUTL
wm8904 1-001a: ASoC: Failed to add route ADCL -> Left -> AIFOUTL
wm8904 1-001a: Control not supported for path ADCR -> [Right] -> AIFOUTL
wm8904 1-001a: ASoC: no dapm match for ADCR --> Right --> AIFOUTL
wm8904 1-001a: ASoC: Failed to add route ADCR -> Right -> AIFOUTL
wm8904 1-001a: Control not supported for path ADCL -> [Left] -> AIFOUTR
wm8904 1-001a: ASoC: no dapm match for ADCL --> Left --> AIFOUTR
wm8904 1-001a: ASoC: Failed to add route ADCL -> Left -> AIFOUTR
wm8904 1-001a: Control not supported for path ADCR -> [Right] -> AIFOUTR
wm8904 1-001a: ASoC: no dapm match for ADCR --> Right --> AIFOUTR
wm8904 1-001a: ASoC: Failed to add route ADCR -> Right -> AIFOUTR
wm8904 1-001a: Control not supported for path AIFINR -> [Right] -> DACL
wm8904 1-001a: ASoC: no dapm match for AIFINR --> Right --> DACL
wm8904 1-001a: ASoC: Failed to add route AIFINR -> Right -> DACL
wm8904 1-001a: Control not supported for path AIFINL -> [Left] -> DACL
wm8904 1-001a: ASoC: no dapm match for AIFINL --> Left --> DACL
wm8904 1-001a: ASoC: Failed to add route AIFINL -> Left -> DACL
wm8904 1-001a: Control not supported for path AIFINR -> [Right] -> DACR
wm8904 1-001a: ASoC: no dapm match for AIFINR --> Right --> DACR
wm8904 1-001a: ASoC: Failed to add route AIFINR -> Right -> DACR
wm8904 1-001a: Control not supported for path AIFINL -> [Left] -> DACR
wm8904 1-001a: ASoC: no dapm match for AIFINL --> Left --> DACR
wm8904 1-001a: ASoC: Failed to add route AIFINL -> Left -> DACR
---8<---

Signed-off-by: Bo Shen <voice.shen@atmel.com>
Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
There is only one I2S I/F, AD/DA path must operate to the same
format.

Signed-off-by: Bard Liao <bardliao@realtek.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
In soc_new_compress() when rtd->dai_link->dynamic is set, we create the pcm
substreams with this call:

   ret = snd_pcm_new_internal(rtd->card->snd_card, new_name, num,
                                   1, 0, &be_pcm);

which passes 0 as capture_count leading to

   be_pcm->streams[SNDRV_PCM_STREAM_CAPTURE].substream

being NULL, hence when trying to set rtd a few lines below we get an oops.

Fix by using rtd->dai_link->dpcm_playback and rtd->dai_link->dpcm_capture as
playback_count and capture_count to snd_pcm_new_internal().

Signed-off-by: Qais Yousef <qais.yousef@imgtec.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
We should select FSR also to be driven by McBSP, not only FSX.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Jarkko Nikula <jarkko.nikula@bitmer.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Today we expect that all the bank are enabled, and count the number of banks
used by the pinctrl based on it instead of using the last bank id enabled.

So switch to it, set the chained IRQ at runtime based on enabled banks
and wait only the number of enabled gpio controllers at probe time.

Cc: <stable@vger.kernel.org> # 3.18
Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
…asoc/fix/fsl-ssi', 'asoc/fix/imx', 'asoc/fix/intel', 'asoc/fix/omap', 'asoc/fix/rockchip' and 'asoc/fix/rt286' into asoc-linus
…'asoc/fix/ts3a227e', 'asoc/fix/wm8904' and 'asoc/fix/wm8960' into asoc-linus
…el/git/broonie/sound into for-linus

ASoC: Fixes for v3.19

There's a lot more fixes here than I'd like since I've been lax in
sending things throughout the release cycle but there's only two in
generic code and they've had quite some time to cook in -next so
hopefully on balance are OK.

The two generic fixes are a fix for crashes on capture DAIs in the
compress code and a fix for error handling on probe failures which would
have been harmless in the past but now oopses with the new dynamic probe
code.
When the last subscriber to a "Through" port has been removed, the
subscribed destination ports might still be active, so it would be
wrong to send "all sounds off" and "reset controller" events to them.
The proper place for such a shutdown would be the closing of the actual
MIDI port (and close_substream() in rawmidi.c already can do this).

This also fixes a deadlock when dummy_unuse() tries to send events to
its own port that is already locked because it is being freed.

Reported-by: Peter Billam <peter@www.pjb.com.au>
Cc: <stable@vger.kernel.org>
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The core VM already knows about VM_FAULT_SIGBUS, but cannot return a
"you should SIGSEGV" error, because the SIGSEGV case was generally
handled by the caller - usually the architecture fault handler.

That results in lots of duplication - all the architecture fault
handlers end up doing very similar "look up vma, check permissions, do
retries etc" - but it generally works.  However, there are cases where
the VM actually wants to SIGSEGV, and applications _expect_ SIGSEGV.

In particular, when accessing the stack guard page, libsigsegv expects a
SIGSEGV.  And it usually got one, because the stack growth is handled by
that duplicated architecture fault handler.

However, when the generic VM layer started propagating the error return
from the stack expansion in commit fee7e49 ("mm: propagate error
from stack expansion even for guard page"), that now exposed the
existing VM_FAULT_SIGBUS result to user space.  And user space really
expected SIGSEGV, not SIGBUS.

To fix that case, we need to add a VM_FAULT_SIGSEGV, and teach all those
duplicate architecture fault handlers about it.  They all already have
the code to handle SIGSEGV, so it's about just tying that new return
value to the existing code, but it's all a bit annoying.

This is the mindless minimal patch to do this.  A more extensive patch
would be to try to gather up the mostly shared fault handling logic into
one generic helper routine, and long-term we really should do that
cleanup.

Just from this patch, you can generally see that most architectures just
copied (directly or indirectly) the old x86 way of doing things, but in
the meantime that original x86 model has been improved to hold the VM
semaphore for shorter times etc and to handle VM_FAULT_RETRY and other
"newer" things, so it would be a good idea to bring all those
improvements to the generic case and teach other architectures about
them too.

Reported-and-tested-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Jan Engelhardt <jengelh@inai.de>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # "s390 still compiles and boots"
Cc: linux-arch@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
…SIGBUS

The stack guard page error case has long incorrectly caused a SIGBUS
rather than a SIGSEGV, but nobody actually noticed until commit
fee7e49 ("mm: propagate error from stack expansion even for guard
page") because that error case was never actually triggered in any
normal situations.

Now that we actually report the error, people noticed the wrong signal
that resulted.  So far, only the test suite of libsigsegv seems to have
actually cared, but there are real applications that use libsigsegv, so
let's not wait for any of those to break.

Reported-and-tested-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Jan Engelhardt <jengelh@inai.de>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # "s390 still compiles and boots"
Cc: linux-arch@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
…nel/git/linusw/linux-pinctrl

Pull final pin control fix from Linus Walleij:
 "A late pin control fix for the v3.19 series: The AT91 gpio controller
  would miss wakeup events, this single fix make it work properly"

[ "Final"? Yeah, I'll believe that once I've actually released 3.19 ;)   - Linus ]

* tag 'pinctrl-v3.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: at91: allow to have disabled gpio bank
…el/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "This batch ended up being larger than wished, but there is nothing to
  worry too much there.

  Most of commits are for ASoC, a compress NULL dereference fix, a fix
  for probe error handling, and the rest are device-specific fixes.  In
  addition, we have a fix for a long-standing but of seq-dummy driver,
  which just cuts off the buggy part in the end"

* tag 'sound-3.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: seq-dummy: remove deadlock-causing events on close
  ASoC: omap-mcbsp: Correct CBM_CFS dai format configuration
  ASoC: soc-compress.c: fix NULL dereference
  ASoC: rt286: set the same format for dac and adc
  ASoC: wm8904: fix runtime warning
  ASoC: simple-card: Fix crash in asoc_simple_card_unref()
  ASoC: fsl: imx-wm8962: Set the card owner field
  ASoC: pcm512x: Fix DSP program selection
  ASoC: rt5677: Modify the behavior that updates the PLL parameter.
  ASoC: fsl_ssi: Fix irq error check
  ASoC: rockchip: i2s: applys rate symmetry for CPU DAI
  ASoC: Intel: Add NULL checks for the stream pointer
  ASoC: wm8960: Fix capture sample rate from 11250 to 11025
  ASoC: adi: Add missing return statement.
  ASoC: Intel: Don't change offset of block allocator during fixed allocate
  ASoC: ts3a227e: Check and report jack status at probe
  ASoC: fsl_esai: Fix incorrect xDC field width of xCCR registers
dabrace added a commit that referenced this pull request Jan 29, 2015
@dabrace dabrace merged commit 510ab7d into dabrace:master Jan 29, 2015
dabrace pushed a commit that referenced this pull request Aug 5, 2015
Nikolay has reported a hang when a memcg reclaim got stuck with the
following backtrace:

PID: 18308  TASK: ffff883d7c9b0a30  CPU: 1   COMMAND: "rsync"
  #0 __schedule at ffffffff815ab152
  #1 schedule at ffffffff815ab76e
  #2 schedule_timeout at ffffffff815ae5e5
  #3 io_schedule_timeout at ffffffff815aad6a
  #4 bit_wait_io at ffffffff815abfc6
  #5 __wait_on_bit at ffffffff815abda5
  #6 wait_on_page_bit at ffffffff8111fd4f
  #7 shrink_page_list at ffffffff81135445
  #8 shrink_inactive_list at ffffffff81135845
  #9 shrink_lruvec at ffffffff81135ead
 #10 shrink_zone at ffffffff811360c3
 #11 shrink_zones at ffffffff81136eff
 #12 do_try_to_free_pages at ffffffff8113712f
 #13 try_to_free_mem_cgroup_pages at ffffffff811372be
 #14 try_charge at ffffffff81189423
 #15 mem_cgroup_try_charge at ffffffff8118c6f5
 #16 __add_to_page_cache_locked at ffffffff8112137d
 #17 add_to_page_cache_lru at ffffffff81121618
 #18 pagecache_get_page at ffffffff8112170b
 #19 grow_dev_page at ffffffff811c8297
 #20 __getblk_slow at ffffffff811c91d6
 #21 __getblk_gfp at ffffffff811c92c1
 #22 ext4_ext_grow_indepth at ffffffff8124565c
 #23 ext4_ext_create_new_leaf at ffffffff81246ca8
 #24 ext4_ext_insert_extent at ffffffff81246f09
 #25 ext4_ext_map_blocks at ffffffff8124a848
 #26 ext4_map_blocks at ffffffff8121a5b7
 #27 mpage_map_one_extent at ffffffff8121b1fa
 #28 mpage_map_and_submit_extent at ffffffff8121f07b
 #29 ext4_writepages at ffffffff8121f6d5
 #30 do_writepages at ffffffff8112c490
 #31 __filemap_fdatawrite_range at ffffffff81120199
 #32 filemap_flush at ffffffff8112041c
 #33 ext4_alloc_da_blocks at ffffffff81219da1
 #34 ext4_rename at ffffffff81229b91
 #35 ext4_rename2 at ffffffff81229e32
 #36 vfs_rename at ffffffff811a08a5
 #37 SYSC_renameat2 at ffffffff811a3ffc
 #38 sys_renameat2 at ffffffff811a408e
 #39 sys_rename at ffffffff8119e51e
 #40 system_call_fastpath at ffffffff815afa89

Dave Chinner has properly pointed out that this is a deadlock in the
reclaim code because ext4 doesn't submit pages which are marked by
PG_writeback right away.

The heuristic was introduced by commit e62e384 ("memcg: prevent OOM
with too many dirty pages") and it was applied only when may_enter_fs
was specified.  The code has been changed by c3b94f4 ("memcg:
further prevent OOM with too many dirty pages") which has removed the
__GFP_FS restriction with a reasoning that we do not get into the fs
code.  But this is not sufficient apparently because the fs doesn't
necessarily submit pages marked PG_writeback for IO right away.

ext4_bio_write_page calls io_submit_add_bh but that doesn't necessarily
submit the bio.  Instead it tries to map more pages into the bio and
mpage_map_one_extent might trigger memcg charge which might end up
waiting on a page which is marked PG_writeback but hasn't been submitted
yet so we would end up waiting for something that never finishes.

Fix this issue by replacing __GFP_IO by may_enter_fs check (for case 2)
before we go to wait on the writeback.  The page fault path, which is
the only path that triggers memcg oom killer since 3.12, shouldn't
require GFP_NOFS and so we shouldn't reintroduce the premature OOM
killer issue which was originally addressed by the heuristic.

As per David Chinner the xfs is doing similar thing since 2.6.15 already
so ext4 is not the only affected filesystem.  Moreover he notes:

: For example: IO completion might require unwritten extent conversion
: which executes filesystem transactions and GFP_NOFS allocations. The
: writeback flag on the pages can not be cleared until unwritten
: extent conversion completes. Hence memory reclaim cannot wait on
: page writeback to complete in GFP_NOFS context because it is not
: safe to do so, memcg reclaim or otherwise.

Cc: stable@vger.kernel.org # 3.9+
[tytso@mit.edu: corrected the control flow]
Fixes: c3b94f4 ("memcg: further prevent OOM with too many dirty pages")
Reported-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
dabrace pushed a commit that referenced this pull request Sep 18, 2015
We register wildcard mmio eventfd on two buses, once for KVM_MMIO_BUS
and once on KVM_FAST_MMIO_BUS but with a single iodev
instance. This will lead to an issue: kvm_io_bus_destroy() knows
nothing about the devices on two buses pointing to a single dev. Which
will lead to double free[1] during exit. Fix this by allocating two
instances of iodevs then registering one on KVM_MMIO_BUS and another
on KVM_FAST_MMIO_BUS.

CPU: 1 PID: 2894 Comm: qemu-system-x86 Not tainted 3.19.0-26-generic #28-Ubuntu
Hardware name: LENOVO 2356BG6/2356BG6, BIOS G7ET96WW (2.56 ) 09/12/2013
task: ffff88009ae0c4b0 ti: ffff88020e7f0000 task.ti: ffff88020e7f0000
RIP: 0010:[<ffffffffc07e25d8>]  [<ffffffffc07e25d8>] ioeventfd_release+0x28/0x60 [kvm]
RSP: 0018:ffff88020e7f3bc8  EFLAGS: 00010292
RAX: dead000000200200 RBX: ffff8801ec19c900 RCX: 000000018200016d
RDX: ffff8801ec19cf80 RSI: ffffea0008bf1d40 RDI: ffff8801ec19c900
RBP: ffff88020e7f3bd8 R08: 000000002fc75a01 R09: 000000018200016d
R10: ffffffffc07df6ae R11: ffff88022fc75a98 R12: ffff88021e7cc000
R13: ffff88021e7cca48 R14: ffff88021e7cca50 R15: ffff8801ec19c880
FS:  00007fc1ee3e6700(0000) GS:ffff88023e240000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8f389d8000 CR3: 000000023dc13000 CR4: 00000000001427e0
Stack:
ffff88021e7cc000 0000000000000000 ffff88020e7f3be8 ffffffffc07e2622
ffff88020e7f3c38 ffffffffc07df69a ffff880232524160 ffff88020e792d80
 0000000000000000 ffff880219b78c00 0000000000000008 ffff8802321686a8
Call Trace:
[<ffffffffc07e2622>] ioeventfd_destructor+0x12/0x20 [kvm]
[<ffffffffc07df69a>] kvm_put_kvm+0xca/0x210 [kvm]
[<ffffffffc07df818>] kvm_vcpu_release+0x18/0x20 [kvm]
[<ffffffff811f69f7>] __fput+0xe7/0x250
[<ffffffff811f6bae>] ____fput+0xe/0x10
[<ffffffff81093f04>] task_work_run+0xd4/0xf0
[<ffffffff81079358>] do_exit+0x368/0xa50
[<ffffffff81082c8f>] ? recalc_sigpending+0x1f/0x60
[<ffffffff81079ad5>] do_group_exit+0x45/0xb0
[<ffffffff81085c71>] get_signal+0x291/0x750
[<ffffffff810144d8>] do_signal+0x28/0xab0
[<ffffffff810f3a3b>] ? do_futex+0xdb/0x5d0
[<ffffffff810b7028>] ? __wake_up_locked_key+0x18/0x20
[<ffffffff810f3fa6>] ? SyS_futex+0x76/0x170
[<ffffffff81014fc9>] do_notify_resume+0x69/0xb0
[<ffffffff817cb9af>] int_signal+0x12/0x17
Code: 5d c3 90 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb 48 83 ec 08 48 8b 7f 20 e8 06 d6 a5 c0 48 8b 43 08 48 8b 13 48 89 df 48 89 42 08 <48> 89 10 48 b8 00 01 10 00 00
 RIP  [<ffffffffc07e25d8>] ioeventfd_release+0x28/0x60 [kvm]
 RSP <ffff88020e7f3bc8>

Cc: stable@vger.kernel.org
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
dabrace pushed a commit that referenced this pull request Oct 19, 2016
I observed false KSAN positives in the sctp code, when
sctp uses jprobe_return() in jsctp_sf_eat_sack().

The stray 0xf4 in shadow memory are stack redzones:

[     ] ==================================================================
[     ] BUG: KASAN: stack-out-of-bounds in memcmp+0xe9/0x150 at addr ffff88005e48f480
[     ] Read of size 1 by task syz-executor/18535
[     ] page:ffffea00017923c0 count:0 mapcount:0 mapping:          (null) index:0x0
[     ] flags: 0x1fffc0000000000()
[     ] page dumped because: kasan: bad access detected
[     ] CPU: 1 PID: 18535 Comm: syz-executor Not tainted 4.8.0+ #28
[     ] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[     ]  ffff88005e48f2d0 ffffffff82d2b849 ffffffff0bc91e90 fffffbfff10971e8
[     ]  ffffed000bc91e90 ffffed000bc91e90 0000000000000001 0000000000000000
[     ]  ffff88005e48f480 ffff88005e48f350 ffffffff817d3169 ffff88005e48f370
[     ] Call Trace:
[     ]  [<ffffffff82d2b849>] dump_stack+0x12e/0x185
[     ]  [<ffffffff817d3169>] kasan_report+0x489/0x4b0
[     ]  [<ffffffff817d31a9>] __asan_report_load1_noabort+0x19/0x20
[     ]  [<ffffffff82d49529>] memcmp+0xe9/0x150
[     ]  [<ffffffff82df7486>] depot_save_stack+0x176/0x5c0
[     ]  [<ffffffff817d2031>] save_stack+0xb1/0xd0
[     ]  [<ffffffff817d27f2>] kasan_slab_free+0x72/0xc0
[     ]  [<ffffffff817d05b8>] kfree+0xc8/0x2a0
[     ]  [<ffffffff85b03f19>] skb_free_head+0x79/0xb0
[     ]  [<ffffffff85b0900a>] skb_release_data+0x37a/0x420
[     ]  [<ffffffff85b090ff>] skb_release_all+0x4f/0x60
[     ]  [<ffffffff85b11348>] consume_skb+0x138/0x370
[     ]  [<ffffffff8676ad7b>] sctp_chunk_put+0xcb/0x180
[     ]  [<ffffffff8676ae88>] sctp_chunk_free+0x58/0x70
[     ]  [<ffffffff8677fa5f>] sctp_inq_pop+0x68f/0xef0
[     ]  [<ffffffff8675ee36>] sctp_assoc_bh_rcv+0xd6/0x4b0
[     ]  [<ffffffff8677f2c1>] sctp_inq_push+0x131/0x190
[     ]  [<ffffffff867bad69>] sctp_backlog_rcv+0xe9/0xa20
[ ... ]
[     ] Memory state around the buggy address:
[     ]  ffff88005e48f380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[     ]  ffff88005e48f400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[     ] >ffff88005e48f480: f4 f4 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[     ]                    ^
[     ]  ffff88005e48f500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[     ]  ffff88005e48f580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[     ] ==================================================================

KASAN stack instrumentation poisons stack redzones on function entry
and unpoisons them on function exit. If a function exits abnormally
(e.g. with a longjmp like jprobe_return()), stack redzones are left
poisoned. Later this leads to random KASAN false reports.

Unpoison stack redzones in the frames we are going to jump over
before doing actual longjmp in jprobe_return().

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: kasan-dev@googlegroups.com
Cc: surovegin@google.com
Cc: rostedt@goodmis.org
Link: http://lkml.kernel.org/r/1476454043-101898-1-git-send-email-dvyukov@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
dabrace pushed a commit that referenced this pull request Dec 12, 2016
I get the following UBSAN warning during boot on my laptop:

================================================================================
UBSAN: Undefined behaviour in drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_qmath.c:280:21
index 32 is out of range for type 's16 [32]'
CPU: 0 PID: 879 Comm: NetworkManager Not tainted 4.9.0-rc4 #28
Hardware name: LENOVO Lenovo IdeaPad N581/INVALID, BIOS 5ECN96WW(V9.01) 03/14/2013
ffff8800b74a6478 ffffffff828e59d2 0000000041b58ab3 ffffffff8398330c
ffffffff828e5920 ffff8800b74a64a0 ffff8800b74a6450 0000000000000020
1ffffffff845848c ffffed0016e94bf1 ffffffffc22c2460 000000006b9c0514
Call Trace:
[<ffffffff828e59d2>] dump_stack+0xb2/0x110
[<ffffffff828e5920>] ? _atomic_dec_and_lock+0x150/0x150
[<ffffffff82968c9d>] ubsan_epilogue+0xd/0x4e
[<ffffffff82969875>] __ubsan_handle_out_of_bounds+0xfa/0x13e
[<ffffffff8296977b>] ? __ubsan_handle_shift_out_of_bounds+0x241/0x241
[<ffffffffc0d48379>] ? bcma_host_pci_read16+0x59/0xa0 [bcma]
[<ffffffffc0d48388>] ? bcma_host_pci_read16+0x68/0xa0 [bcma]
[<ffffffffc212ad78>] ? read_phy_reg+0xe8/0x180 [brcmsmac]
[<ffffffffc2184714>] qm_log10+0x2e4/0x350 [brcmsmac]
[<ffffffffc2142eb8>] wlc_phy_init_lcnphy+0x538/0x1f20 [brcmsmac]
[<ffffffffc2142980>] ? wlc_lcnphy_periodic_cal+0x5c0/0x5c0 [brcmsmac]
[<ffffffffc1ba0c93>] ? ieee80211_open+0xb3/0x110 [mac80211]
[<ffffffff82f73a02>] ? sk_busy_loop+0x1e2/0x840
[<ffffffff82f7a6ce>] ? __dev_change_flags+0xae/0x220
...

The report is valid: doing the math in this function, with an input value
N=63 the variable s16tableIndex gets a value of 31. This value is used as
an index in the array log_table with 32 entries. But the next line is:

	s16errorApproximation = (s16) qm_mulu16(u16offset,
				(u16) (log_table[s16tableIndex + 1] -
				       log_table[s16tableIndex]));

With s16tableIndex + 1 we are trying an out-of-bounds access to the array.

The log_table array provides log2 values in q.15 format and the above
statement tries an error approximation with the next value. To fix this
issue add the next value to the array and update the comment accordingly.

Signed-off-by: Tobias Regnery <tobias.regnery@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
dabrace pushed a commit that referenced this pull request Mar 15, 2018
A tty is hung up by __tty_hangup() setting file->f_op to
hung_up_tty_fops, which is skipped on ttys whose write operation isn't
tty_write().  This means that, for example, /dev/console whose write
op is redirected_tty_write() is never actually marked hung up.

Because n_tty_read() uses the hung up status to decide whether to
abort the waiting readers, the lack of hung-up marking can lead to the
following scenario.

 1. A session contains two processes.  The leader and its child.  The
    child ignores SIGHUP.

 2. The leader exits and starts disassociating from the controlling
    terminal (/dev/console).

 3. __tty_hangup() skips setting f_op to hung_up_tty_fops.

 4. SIGHUP is delivered and ignored.

 5. tty_ldisc_hangup() is invoked.  It wakes up the waits which should
    clear the read lockers of tty->ldisc_sem.

 6. The reader wakes up but because tty_hung_up_p() is false, it
    doesn't abort and goes back to sleep while read-holding
    tty->ldisc_sem.

 7. The leader progresses to tty_ldisc_lock() in tty_ldisc_hangup()
    and is now stuck in D sleep indefinitely waiting for
    tty->ldisc_sem.

The following is Alan's explanation on why some ttys aren't hung up.

 http://lkml.kernel.org/r/20171101170908.6ad08580@alans-desktop

 1. It broke the serial consoles because they would hang up and close
    down the hardware. With tty_port that *should* be fixable properly
    for any cases remaining.

 2. The console layer was (and still is) completely broken and doens't
    refcount properly. So if you turn on console hangups it breaks (as
    indeed does freeing consoles and half a dozen other things).

As neither can be fixed quickly, this patch works around the problem
by introducing a new flag, TTY_HUPPING, which is used solely to tell
n_tty_read() that hang-up is in progress for the console and the
readers should be aborted regardless of the hung-up status of the
device.

The following is a sample hung task warning caused by this issue.

  INFO: task agetty:2662 blocked for more than 120 seconds.
        Not tainted 4.11.3-dbg-tty-lockup-02478-gfd6c7ee-dirty #28
  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      0  2662      1 0x00000086
  Call Trace:
   __schedule+0x267/0x890
   schedule+0x36/0x80
   schedule_timeout+0x23c/0x2e0
   ldsem_down_write+0xce/0x1f6
   tty_ldisc_lock+0x16/0x30
   tty_ldisc_hangup+0xb3/0x1b0
   __tty_hangup+0x300/0x410
   disassociate_ctty+0x6c/0x290
   do_exit+0x7ef/0xb00
   do_group_exit+0x3f/0xa0
   get_signal+0x1b3/0x5d0
   do_signal+0x28/0x660
   exit_to_usermode_loop+0x46/0x86
   do_syscall_64+0x9c/0xb0
   entry_SYSCALL64_slow_path+0x25/0x25

The following is the repro.  Run "$PROG /dev/console".  The parent
process hangs in D state.

  #include <sys/types.h>
  #include <sys/stat.h>
  #include <sys/wait.h>
  #include <sys/ioctl.h>
  #include <fcntl.h>
  #include <unistd.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <errno.h>
  #include <signal.h>
  #include <time.h>
  #include <termios.h>

  int main(int argc, char **argv)
  {
	  struct sigaction sact = { .sa_handler = SIG_IGN };
	  struct timespec ts1s = { .tv_sec = 1 };
	  pid_t pid;
	  int fd;

	  if (argc < 2) {
		  fprintf(stderr, "test-hung-tty /dev/$TTY\n");
		  return 1;
	  }

	  /* fork a child to ensure that it isn't already the session leader */
	  pid = fork();
	  if (pid < 0) {
		  perror("fork");
		  return 1;
	  }

	  if (pid > 0) {
		  /* top parent, wait for everyone */
		  while (waitpid(-1, NULL, 0) >= 0)
			  ;
		  if (errno != ECHILD)
			  perror("waitpid");
		  return 0;
	  }

	  /* new session, start a new session and set the controlling tty */
	  if (setsid() < 0) {
		  perror("setsid");
		  return 1;
	  }

	  fd = open(argv[1], O_RDWR);
	  if (fd < 0) {
		  perror("open");
		  return 1;
	  }

	  if (ioctl(fd, TIOCSCTTY, 1) < 0) {
		  perror("ioctl");
		  return 1;
	  }

	  /* fork a child, sleep a bit and exit */
	  pid = fork();
	  if (pid < 0) {
		  perror("fork");
		  return 1;
	  }

	  if (pid > 0) {
		  nanosleep(&ts1s, NULL);
		  printf("Session leader exiting\n");
		  exit(0);
	  }

	  /*
	   * The child ignores SIGHUP and keeps reading from the controlling
	   * tty.  Because SIGHUP is ignored, the child doesn't get killed on
	   * parent exit and the bug in n_tty makes the read(2) block the
	   * parent's control terminal hangup attempt.  The parent ends up in
	   * D sleep until the child is explicitly killed.
	   */
	  sigaction(SIGHUP, &sact, NULL);
	  printf("Child reading tty\n");
	  while (1) {
		  char buf[1024];

		  if (read(fd, buf, sizeof(buf)) < 0) {
			  perror("read");
			  return 1;
		  }
	  }

	  return 0;
  }

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Alan Cox <alan@llwyncelyn.cymru>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
dabrace pushed a commit that referenced this pull request May 21, 2018
syzbot caught an infinite recursion in nsh_gso_segment().

Problem here is that we need to make sure the NSH header is of
reasonable length.

BUG: MAX_LOCK_DEPTH too low!
turning off the locking correctness validator.
depth: 48  max: 48!
48 locks held by syz-executor0/10189:
 #0:         (ptrval) (rcu_read_lock_bh){....}, at: __dev_queue_xmit+0x30f/0x34c0 net/core/dev.c:3517
 #1:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #1:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #2:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #2:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #3:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #3:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #4:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #4:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #5:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #5:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #6:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #6:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #7:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #7:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #8:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #8:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #9:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #9:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #10:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #10:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #11:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #11:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #12:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #12:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #13:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #13:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #14:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #14:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #15:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #15:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #16:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #16:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #17:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #17:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #18:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #18:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #19:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #19:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #20:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #20:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #21:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #21:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #22:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #22:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #23:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #23:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #24:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #24:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #25:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #25:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #26:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #26:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #27:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #27:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #28:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #28:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #29:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #29:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #30:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #30:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #31:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #31:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
dccp_close: ABORT with 65423 bytes unread
 #32:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #32:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #33:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #33:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #34:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #34:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #35:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #35:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #36:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #36:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #37:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #37:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #38:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #38:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #39:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #39:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #40:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #40:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #41:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #41:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #42:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #42:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #43:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #43:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #44:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #44:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #45:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #45:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #46:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #46:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #47:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #47:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
INFO: lockdep is turned off.
CPU: 1 PID: 10189 Comm: syz-executor0 Not tainted 4.17.0-rc2+ #26
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1b9/0x294 lib/dump_stack.c:113
 __lock_acquire+0x1788/0x5140 kernel/locking/lockdep.c:3449
 lock_acquire+0x1dc/0x520 kernel/locking/lockdep.c:3920
 rcu_lock_acquire include/linux/rcupdate.h:246 [inline]
 rcu_read_lock include/linux/rcupdate.h:632 [inline]
 skb_mac_gso_segment+0x25b/0x720 net/core/dev.c:2789
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 __skb_gso_segment+0x3bb/0x870 net/core/dev.c:2865
 skb_gso_segment include/linux/netdevice.h:4025 [inline]
 validate_xmit_skb+0x54d/0xd90 net/core/dev.c:3118
 validate_xmit_skb_list+0xbf/0x120 net/core/dev.c:3168
 sch_direct_xmit+0x354/0x11e0 net/sched/sch_generic.c:312
 qdisc_restart net/sched/sch_generic.c:399 [inline]
 __qdisc_run+0x741/0x1af0 net/sched/sch_generic.c:410
 __dev_xmit_skb net/core/dev.c:3243 [inline]
 __dev_queue_xmit+0x28ea/0x34c0 net/core/dev.c:3551
 dev_queue_xmit+0x17/0x20 net/core/dev.c:3616
 packet_snd net/packet/af_packet.c:2951 [inline]
 packet_sendmsg+0x40f8/0x6070 net/packet/af_packet.c:2976
 sock_sendmsg_nosec net/socket.c:629 [inline]
 sock_sendmsg+0xd5/0x120 net/socket.c:639
 __sys_sendto+0x3d7/0x670 net/socket.c:1789
 __do_sys_sendto net/socket.c:1801 [inline]
 __se_sys_sendto net/socket.c:1797 [inline]
 __x64_sys_sendto+0xe1/0x1a0 net/socket.c:1797
 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: c411ed8 ("nsh: add GSO support")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jiri Benc <jbenc@redhat.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dabrace pushed a commit that referenced this pull request Jul 24, 2018
Crash dump shows following instructions

crash> bt
PID: 0      TASK: ffffffffbe412480  CPU: 0   COMMAND: "swapper/0"
 #0 [ffff891ee0003868] machine_kexec at ffffffffbd063ef1
 #1 [ffff891ee00038c8] __crash_kexec at ffffffffbd12b6f2
 #2 [ffff891ee0003998] crash_kexec at ffffffffbd12c84c
 #3 [ffff891ee00039b8] oops_end at ffffffffbd030f0a
 #4 [ffff891ee00039e0] no_context at ffffffffbd074643
 #5 [ffff891ee0003a40] __bad_area_nosemaphore at ffffffffbd07496e
 #6 [ffff891ee0003a90] bad_area_nosemaphore at ffffffffbd074a64
 #7 [ffff891ee0003aa0] __do_page_fault at ffffffffbd074b0a
 #8 [ffff891ee0003b18] do_page_fault at ffffffffbd074fc8
 #9 [ffff891ee0003b50] page_fault at ffffffffbda01925
    [exception RIP: qlt_schedule_sess_for_deletion+15]
    RIP: ffffffffc02e526f  RSP: ffff891ee0003c08  RFLAGS: 00010046
    RAX: 0000000000000000  RBX: 0000000000000000  RCX: ffffffffc0307847
    RDX: 00000000000020e6  RSI: ffff891edbc377c8  RDI: 0000000000000000
    RBP: ffff891ee0003c18   R8: ffffffffc02f0b20   R9: 0000000000000250
    R10: 0000000000000258  R11: 000000000000b780  R12: ffff891ed9b43000
    R13: 00000000000000f0  R14: 0000000000000006  R15: ffff891edbc377c8
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #10 [ffff891ee0003c20] qla2x00_fcport_event_handler at ffffffffc02853d3 [qla2xxx]
 #11 [ffff891ee0003cf0] __dta_qla24xx_async_gnl_sp_done_333 at ffffffffc0285a1d [qla2xxx]
 #12 [ffff891ee0003de8] qla24xx_process_response_queue at ffffffffc02a2eb5 [qla2xxx]
 #13 [ffff891ee0003e88] qla24xx_msix_rsp_q at ffffffffc02a5403 [qla2xxx]
 #14 [ffff891ee0003ec0] __handle_irq_event_percpu at ffffffffbd0f4c59
 #15 [ffff891ee0003f10] handle_irq_event_percpu at ffffffffbd0f4e02
 #16 [ffff891ee0003f40] handle_irq_event at ffffffffbd0f4e90
 #17 [ffff891ee0003f68] handle_edge_irq at ffffffffbd0f8984
 #18 [ffff891ee0003f88] handle_irq at ffffffffbd0305d5
 #19 [ffff891ee0003fb8] do_IRQ at ffffffffbda02a18
 --- <IRQ stack> ---
 #20 [ffffffffbe403d30] ret_from_intr at ffffffffbda0094e
    [exception RIP: unknown or invalid address]
    RIP: 000000000000001f  RSP: 0000000000000000  RFLAGS: fff3b8c2091ebb3f
    RAX: ffffbba5a0000200  RBX: 0000be8cdfa8f9fa  RCX: 0000000000000018
    RDX: 0000000000000101  RSI: 000000000000015d  RDI: 0000000000000193
    RBP: 0000000000000083   R8: ffffffffbe403e38   R9: 0000000000000002
    R10: 0000000000000000  R11: ffffffffbe56b820  R12: ffff891ee001cf00
    R13: ffffffffbd11c0a4  R14: ffffffffbe403d60  R15: 0000000000000001
    ORIG_RAX: ffff891ee0022ac0  CS: 0000  SS: ffffffffffffffb9
 bt: WARNING: possibly bogus exception frame
 #21 [ffffffffbe403dd8] cpuidle_enter_state at ffffffffbd67c6fd
 #22 [ffffffffbe403e40] cpuidle_enter at ffffffffbd67c907
 #23 [ffffffffbe403e50] call_cpuidle at ffffffffbd0d98f3
 #24 [ffffffffbe403e60] do_idle at ffffffffbd0d9b42
 #25 [ffffffffbe403e98] cpu_startup_entry at ffffffffbd0d9da3
 #26 [ffffffffbe403ec0] rest_init at ffffffffbd81d4aa
 #27 [ffffffffbe403ed0] start_kernel at ffffffffbe67d2ca
 #28 [ffffffffbe403f28] x86_64_start_reservations at ffffffffbe67c675
 #29 [ffffffffbe403f38] x86_64_start_kernel at ffffffffbe67c6eb
 #30 [ffffffffbe403f50] secondary_startup_64 at ffffffffbd0000d5

Fixes: 040036b ("scsi: qla2xxx: Delay loop id allocation at login")
Cc: <stable@vger.kernel.org> # v4.17+
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
dabrace pushed a commit that referenced this pull request Aug 17, 2018
If the method fails after calling rdma_explicit_destroy (eg if
copy_to_user faults) then it will trigger a kernel oops:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
PGD 800000000548d067 P4D 800000000548d067 PUD 54a0067 PMD 0
SMP PTI
CPU: 0 PID: 359 Comm: ibv_rc_pingpong Not tainted 4.18.0-rc1+ #28
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
RIP: 0010:          (null)
Code: Bad RIP value.
RSP: 0018:ffffc900001a3bf0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88000603bd00 RCX: 0000000000000003
RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff88000603bd00
RBP: 0000000000000001 R08: ffffc900001a3cf8 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffc900001a3cf0
R13: 0000000000000000 R14: ffffc900001a3cf0 R15: 0000000000000000
FS:  00007fb00dda8700(0000) GS:ffff880007c00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000000548e004 CR4: 00000000003606b0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 ? rdma_lookup_put_uobject+0x22/0x50 [ib_uverbs]
 ? uverbs_finalize_object+0x3b/0x60 [ib_uverbs]
 ? uverbs_finalize_attrs+0x128/0x140 [ib_uverbs]
 ? ib_uverbs_cmd_verbs+0x698/0x7c0 [ib_uverbs]
 ? find_held_lock+0x2d/0x90
 ? __might_fault+0x39/0x90
 ? ib_uverbs_ioctl+0x111/0x1f0 [ib_uverbs]
 ? do_vfs_ioctl+0xa0/0x6d0
 ? trace_hardirqs_on_caller+0xed/0x180
 ? _raw_spin_unlock_irq+0x24/0x40
 ? syscall_trace_enter+0x138/0x1d0
 ? ksys_ioctl+0x35/0x60
 ? __x64_sys_ioctl+0x11/0x20
 ? do_syscall_64+0x5b/0x1c0
 ? entry_SYSCALL_64_after_hwframe+0x49/0xbe

This is because the type was replaced with the null_type during explicit
destroy that cannot complete the destruction.

One of the side effects of replacing the type is to make the object
handle totally unreachable - so no other command could attempt to use
it, even though it remains on the uboject list.

We can get the same end result by just fully destroying the object inside
rdma_explicit_destroy and leaving the caller the residual kref for the
uobj with no attached HW object, and no presence in the ubojects list.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
dabrace pushed a commit that referenced this pull request Oct 2, 2018
Applying dynamic usbcore quirks in early booting when the slab is
not yet ready would cause kernel panic of null pointer dereference
because the quirk_count has been counted as 1 while the quirk_list
was failed to allocate.

i.e.,
[    1.044970] BUG: unable to handle kernel NULL pointer dereference at           (null)
[    1.044995] IP: [<ffffffffb0953ec7>] usb_detect_quirks+0x88/0xd1
[    1.045016] PGD 0
[    1.045026] Oops: 0000 [#1] PREEMPT SMP
[    1.046986] gsmi: Log Shutdown Reason 0x03
[    1.046995] Modules linked in:
[    1.047008] CPU: 0 PID: 81 Comm: kworker/0:3 Not tainted 4.4.154 #28
[    1.047016] Hardware name: Google Coral/Coral, BIOS Google_Coral.10068.27.0 12/04/2017
[    1.047028] Workqueue: usb_hub_wq hub_event
[    1.047037] task: ffff88017a321c80 task.stack: ffff88017a384000
[    1.047044] RIP: 0010:[<ffffffffb0953ec7>]  [<ffffffffb0953ec7>] usb_detect_quirks+0x88/0xd1

To tackle this odd, let's balance the quirk_count to 0 when the kcalloc
call fails, and defer the quirk setting into a lower level callback
which ensures that the kernel memory management has been initialized.

Fixes: 027bd6c ("usb: core: Add "quirks" parameter for usbcore")
Signed-off-by: Harry Pan <harry.pan@intel.com>
Acked-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
dabrace pushed a commit that referenced this pull request Nov 12, 2018
Increase kasan instrumented kernel stack size from 32k to 64k. Other
architectures seems to get away with just doubling kernel stack size under
kasan, but on s390 this appears to be not enough due to bigger frame size.
The particular pain point is kasan inlined checks (CONFIG_KASAN_INLINE
vs CONFIG_KASAN_OUTLINE). With inlined checks one particular case hitting
stack overflow is fs sync on xfs filesystem:

 #0 [9a0681e8]  704 bytes  check_usage at 34b1fc
 #1 [9a0684a8]  432 bytes  check_usage at 34c710
 #2 [9a068658]  1048 bytes  validate_chain at 35044a
 #3 [9a068a70]  312 bytes  __lock_acquire at 3559fe
 #4 [9a068ba8]  440 bytes  lock_acquire at 3576ee
 #5 [9a068d60]  104 bytes  _raw_spin_lock at 21b44e0
 #6 [9a068dc8]  1992 bytes  enqueue_entity at 2dbf72
 #7 [9a069590]  1496 bytes  enqueue_task_fair at 2df5f0
 #8 [9a069b68]  64 bytes  ttwu_do_activate at 28f438
 #9 [9a069ba8]  552 bytes  try_to_wake_up at 298c4c
 #10 [9a069dd0]  168 bytes  wake_up_worker at 23f97c
 #11 [9a069e78]  200 bytes  insert_work at 23fc2e
 #12 [9a069f40]  648 bytes  __queue_work at 2487c0
 #13 [9a06a1c8]  200 bytes  __queue_delayed_work at 24db28
 #14 [9a06a290]  248 bytes  mod_delayed_work_on at 24de84
 #15 [9a06a388]  24 bytes  kblockd_mod_delayed_work_on at 153e2a0
 #16 [9a06a3a0]  288 bytes  __blk_mq_delay_run_hw_queue at 158168c
 #17 [9a06a4c0]  192 bytes  blk_mq_run_hw_queue at 1581a3c
 #18 [9a06a580]  184 bytes  blk_mq_sched_insert_requests at 15a2192
 #19 [9a06a638]  1024 bytes  blk_mq_flush_plug_list at 1590f3a
 #20 [9a06aa38]  704 bytes  blk_flush_plug_list at 1555028
 #21 [9a06acf8]  320 bytes  schedule at 219e476
 #22 [9a06ae38]  760 bytes  schedule_timeout at 21b0aac
 #23 [9a06b130]  408 bytes  wait_for_common at 21a1706
 #24 [9a06b2c8]  360 bytes  xfs_buf_iowait at fa1540
 #25 [9a06b430]  256 bytes  __xfs_buf_submit at fadae6
 #26 [9a06b530]  264 bytes  xfs_buf_read_map at fae3f6
 #27 [9a06b638]  656 bytes  xfs_trans_read_buf_map at 10ac9a8
 #28 [9a06b8c8]  304 bytes  xfs_btree_kill_root at e72426
 #29 [9a06b9f8]  288 bytes  xfs_btree_lookup_get_block at e7bc5e
 #30 [9a06bb18]  624 bytes  xfs_btree_lookup at e7e1a6
 #31 [9a06bd88]  2664 bytes  xfs_alloc_ag_vextent_near at dfa070
 #32 [9a06c7f0]  144 bytes  xfs_alloc_ag_vextent at dff3ca
 #33 [9a06c880]  1128 bytes  xfs_alloc_vextent at e05fce
 #34 [9a06cce8]  584 bytes  xfs_bmap_btalloc at e58342
 #35 [9a06cf30]  1336 bytes  xfs_bmapi_write at e618de
 #36 [9a06d468]  776 bytes  xfs_iomap_write_allocate at ff678e
 #37 [9a06d770]  720 bytes  xfs_map_blocks at f82af8
 #38 [9a06da40]  928 bytes  xfs_writepage_map at f83cd6
 #39 [9a06dde0]  320 bytes  xfs_do_writepage at f85872
 #40 [9a06df20]  1320 bytes  write_cache_pages at 73dfe8
 #41 [9a06e448]  208 bytes  xfs_vm_writepages at f7f892
 #42 [9a06e518]  88 bytes  do_writepages at 73fe6a
 #43 [9a06e570]  872 bytes  __writeback_single_inode at a20cb6
 #44 [9a06e8d8]  664 bytes  writeback_sb_inodes at a23be2
 #45 [9a06eb70]  296 bytes  __writeback_inodes_wb at a242e0
 #46 [9a06ec98]  928 bytes  wb_writeback at a2500e
 #47 [9a06f038]  848 bytes  wb_do_writeback at a260ae
 #48 [9a06f388]  536 bytes  wb_workfn at a28228
 #49 [9a06f5a0]  1088 bytes  process_one_work at 24a234
 #50 [9a06f9e0]  1120 bytes  worker_thread at 24ba26
 #51 [9a06fe40]  104 bytes  kthread at 26545a
 #52 [9a06fea8]             kernel_thread_starter at 21b6b62

To be able to increase the stack size to 64k reuse LLILL instruction
in __switch_to function to load 64k - STACK_FRAME_OVERHEAD - __PT_SIZE
(65192) value as unsigned.

Reported-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
dabrace pushed a commit that referenced this pull request Jun 3, 2019
If alloc_workqueue fails in alua_init, it should return -ENOMEM, otherwise
it will trigger null-ptr-deref while unloading module which calls
destroy_workqueue dereference
wq->lock like this:

BUG: KASAN: null-ptr-deref in __lock_acquire+0x6b4/0x1ee0
Read of size 8 at addr 0000000000000080 by task syz-executor.0/7045

CPU: 0 PID: 7045 Comm: syz-executor.0 Tainted: G         C        5.1.0+ #28
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1
Call Trace:
 dump_stack+0xa9/0x10e
 __kasan_report+0x171/0x18d
 ? __lock_acquire+0x6b4/0x1ee0
 kasan_report+0xe/0x20
 __lock_acquire+0x6b4/0x1ee0
 lock_acquire+0xb4/0x1b0
 __mutex_lock+0xd8/0xb90
 drain_workqueue+0x25/0x290
 destroy_workqueue+0x1f/0x3f0
 __x64_sys_delete_module+0x244/0x330
 do_syscall_64+0x72/0x2a0
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 03197b6 ("scsi_dh_alua: Use workqueue for RTPG")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
dabrace pushed a commit that referenced this pull request Jun 18, 2019
If spi_register_master fails in spi_bitbang_start
because device_add failure, We should return the
error code other than 0, otherwise calling
spi_bitbang_stop may trigger NULL pointer dereference
like this:

BUG: KASAN: null-ptr-deref in __list_del_entry_valid+0x45/0xd0
Read of size 8 at addr 0000000000000000 by task syz-executor.0/3661

CPU: 0 PID: 3661 Comm: syz-executor.0 Not tainted 5.1.0+ #28
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
Call Trace:
 dump_stack+0xa9/0x10e
 ? __list_del_entry_valid+0x45/0xd0
 ? __list_del_entry_valid+0x45/0xd0
 __kasan_report+0x171/0x18d
 ? __list_del_entry_valid+0x45/0xd0
 kasan_report+0xe/0x20
 __list_del_entry_valid+0x45/0xd0
 spi_unregister_controller+0x99/0x1b0
 spi_lm70llp_attach+0x3ae/0x4b0 [spi_lm70llp]
 ? 0xffffffffc1128000
 ? klist_next+0x131/0x1e0
 ? driver_detach+0x40/0x40 [parport]
 port_check+0x3b/0x50 [parport]
 bus_for_each_dev+0x115/0x180
 ? subsys_dev_iter_exit+0x20/0x20
 __parport_register_driver+0x1f0/0x210 [parport]
 ? 0xffffffffc1150000
 do_one_initcall+0xb9/0x3b5
 ? perf_trace_initcall_level+0x270/0x270
 ? kasan_unpoison_shadow+0x30/0x40
 ? kasan_unpoison_shadow+0x30/0x40
 do_init_module+0xe0/0x330
 load_module+0x38eb/0x4270
 ? module_frob_arch_sections+0x20/0x20
 ? kernel_read_file+0x188/0x3f0
 ? find_held_lock+0x6d/0xd0
 ? fput_many+0x1a/0xe0
 ? __do_sys_finit_module+0x162/0x190
 __do_sys_finit_module+0x162/0x190
 ? __ia32_sys_init_module+0x40/0x40
 ? __mutex_unlock_slowpath+0xb4/0x3f0
 ? wait_for_completion+0x240/0x240
 ? vfs_write+0x160/0x2a0
 ? lockdep_hardirqs_off+0xb5/0x100
 ? mark_held_locks+0x1a/0x90
 ? do_syscall_64+0x14/0x2a0
 do_syscall_64+0x72/0x2a0
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 702a487 ("spi: bitbang: Let spi_bitbang_start() take a reference to master")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Axel Lin <axel.lin@ingics.com>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
dabrace pushed a commit that referenced this pull request Sep 30, 2019
The p9_tag_alloc() does not initialize the transport error t_err field.
The struct p9_req_t *req is allocated and stored in a struct p9_client
variable. The field t_err is never initialized before p9_conn_cancel()
checks its value.

KUMSAN(KernelUninitializedMemorySantizer, a new error detection tool)
reports this bug.

==================================================================
BUG: KUMSAN: use of uninitialized memory in p9_conn_cancel+0x2d9/0x3b0
Read of size 4 at addr ffff88805f9b600c by task kworker/1:2/1216

CPU: 1 PID: 1216 Comm: kworker/1:2 Not tainted 5.2.0-rc4+ #28
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
Workqueue: events p9_write_work
Call Trace:
 dump_stack+0x75/0xae
 __kumsan_report+0x17c/0x3e6
 kumsan_report+0xe/0x20
 p9_conn_cancel+0x2d9/0x3b0
 p9_write_work+0x183/0x4a0
 process_one_work+0x4d1/0x8c0
 worker_thread+0x6e/0x780
 kthread+0x1ca/0x1f0
 ret_from_fork+0x35/0x40

Allocated by task 1979:
 save_stack+0x19/0x80
 __kumsan_kmalloc.constprop.3+0xbc/0x120
 kmem_cache_alloc+0xa7/0x170
 p9_client_prepare_req.part.9+0x3b/0x380
 p9_client_rpc+0x15e/0x880
 p9_client_create+0x3d0/0xac0
 v9fs_session_init+0x192/0xc80
 v9fs_mount+0x67/0x470
 legacy_get_tree+0x70/0xd0
 vfs_get_tree+0x4a/0x1c0
 do_mount+0xba9/0xf90
 ksys_mount+0xa8/0x120
 __x64_sys_mount+0x62/0x70
 do_syscall_64+0x6d/0x1e0
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Freed by task 0:
(stack is not available)

The buggy address belongs to the object at ffff88805f9b6008
 which belongs to the cache p9_req_t of size 144
The buggy address is located 4 bytes inside of
 144-byte region [ffff88805f9b6008, ffff88805f9b6098)
The buggy address belongs to the page:
page:ffffea00017e6d80 refcount:1 mapcount:0 mapping:ffff888068b63740 index:0xffff88805f9b7d90 compound_mapcount: 0
flags: 0x100000000010200(slab|head)
raw: 0100000000010200 ffff888068b66450 ffff888068b66450 ffff888068b63740
raw: ffff88805f9b7d90 0000000000100001 00000001ffffffff 0000000000000000
page dumped because: kumsan: bad access detected
==================================================================

Link: http://lkml.kernel.org/r/20190613070854.10434-1-shuaibinglu@126.com
Signed-off-by: Lu Shuaibing <shuaibinglu@126.com>
[dominique.martinet@cea.fr: grouped the added init with the others]
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
dabrace pushed a commit that referenced this pull request Jul 11, 2020
We should not trigger a warning when a memory allocation fails. Remove
the WARN_ON().

The warning is constantly triggered by syzkaller when it is injecting
faults:

[ 2230.758664] FAULT_INJECTION: forcing a failure.
[ 2230.758664] name failslab, interval 1, probability 0, space 0, times 0
[ 2230.762329] CPU: 3 PID: 1407 Comm: syz-executor.0 Not tainted 5.8.0-rc2+ #28
...
[ 2230.898175] WARNING: CPU: 3 PID: 1407 at drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:6265 mlxsw_sp_router_fib_event+0xfad/0x13e0
[ 2230.898179] Kernel panic - not syncing: panic_on_warn set ...
[ 2230.898183] CPU: 3 PID: 1407 Comm: syz-executor.0 Not tainted 5.8.0-rc2+ #28
[ 2230.898190] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014

Fixes: 3057224 ("mlxsw: spectrum_router: Implement FIB offload in deferred work")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dabrace pushed a commit that referenced this pull request May 6, 2021
Ritesh reported a bug [1] against UML, noting that it crashed on
startup. The backtrace shows the following (heavily redacted):

(gdb) bt
...
 #26 0x0000000060015b5d in sem_init () at ipc/sem.c:268
 #27 0x00007f89906d92f7 in ?? () from /lib/x86_64-linux-gnu/libcom_err.so.2
 #28 0x00007f8990ab8fb2 in call_init (...) at dl-init.c:72
...
 #40 0x00007f89909bf3a6 in nss_load_library (...) at nsswitch.c:359
...
 #44 0x00007f8990895e35 in _nss_compat_getgrnam_r (...) at nss_compat/compat-grp.c:486
 #45 0x00007f8990968b85 in __getgrnam_r [...]
 #46 0x00007f89909d6b77 in grantpt [...]
 #47 0x00007f8990a9394e in __GI_openpty [...]
 #48 0x00000000604a1f65 in openpty_cb (...) at arch/um/os-Linux/sigio.c:407
 #49 0x00000000604a58d0 in start_idle_thread (...) at arch/um/os-Linux/skas/process.c:598
 #50 0x0000000060004a3d in start_uml () at arch/um/kernel/skas/process.c:45
 #51 0x00000000600047b2 in linux_main (...) at arch/um/kernel/um_arch.c:334
 #52 0x000000006000574f in main (...) at arch/um/os-Linux/main.c:144

indicating that the UML function openpty_cb() calls openpty(),
which internally calls __getgrnam_r(), which causes the nsswitch
machinery to get started.

This loads, through lots of indirection that I snipped, the
libcom_err.so.2 library, which (in an unknown function, "??")
calls sem_init().

Now, of course it wants to get libpthread's sem_init(), since
it's linked against libpthread. However, the dynamic linker
looks up that symbol against the binary first, and gets the
kernel's sem_init().

Hajime Tazaki noted that "objcopy -L" can localize a symbol,
so the dynamic linker wouldn't do the lookup this way. I tried,
but for some reason that didn't seem to work.

Doing the same thing in the linker script instead does seem to
work, though I cannot entirely explain - it *also* works if I
just add "VERSION { { global: *; }; }" instead, indicating that
something else is happening that I don't really understand. It
may be that explicitly doing that marks them with some kind of
empty version, and that's different from the default.

Explicitly marking them with a version breaks kallsyms, so that
doesn't seem to be possible.

Marking all the symbols as local seems correct, and does seem
to address the issue, so do that. Also do it for static link,
nsswitch libraries could still be loaded there.

[1] https://bugs.debian.org/983379

Reported-by: Ritesh Raj Sarraf <rrs@debian.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Tested-By: Ritesh Raj Sarraf <rrs@debian.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.