mm/zsmalloc.c: drop ZSMALLOC_PGTABLE_MAPPING
While I was doing zram testing, I found that decompression sometimes
failed because the compression buffer was corrupted.  On investigation,
I found that the commit named in the Fixes: tag below calls
cond_resched() unconditionally, which causes a problem in atomic
context if the task gets rescheduled.

[   55.109012] BUG: sleeping function called from invalid context at mm/vmalloc.c:108
[   55.110774] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 946, name: memhog
[   55.111973] 3 locks held by memhog/946:
[   55.112807]  #0: ffff9d01d4b193e8 (&mm->mmap_lock#2){++++}-{4:4}, at: __mm_populate+0x103/0x160
[   55.114151]  #1: ffffffffa3d53de0 (fs_reclaim){+.+.}-{0:0}, at: __alloc_pages_slowpath.constprop.0+0xa98/0x1160
[   55.115848]  #2: ffff9d01d56b8110 (&zspage->lock){.+.+}-{3:3}, at: zs_map_object+0x8e/0x1f0
[   55.118947] CPU: 0 PID: 946 Comm: memhog Not tainted 5.9.3-00011-gc5bfc0287345-dirty #316
[   55.121265] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
[   55.122540] Call Trace:
[   55.122974]  dump_stack+0x8b/0xb8
[   55.123588]  ___might_sleep.cold+0xb6/0xc6
[   55.124328]  unmap_kernel_range_noflush+0x2eb/0x350
[   55.125198]  unmap_kernel_range+0x14/0x30
[   55.125920]  zs_unmap_object+0xd5/0xe0
[   55.126604]  zram_bvec_rw.isra.0+0x38c/0x8e0
[   55.127462]  zram_rw_page+0x90/0x101
[   55.128199]  bdev_write_page+0x92/0xe0
[   55.128957]  ? swap_slot_free_notify+0xb0/0xb0
[   55.129841]  __swap_writepage+0x94/0x4a0
[   55.130636]  ? do_raw_spin_unlock+0x4b/0xa0
[   55.131462]  ? _raw_spin_unlock+0x1f/0x30
[   55.132261]  ? page_swapcount+0x6c/0x90
[   55.133038]  pageout+0xe3/0x3a0
[   55.133702]  shrink_page_list+0xb94/0xd60
[   55.134626]  shrink_inactive_list+0x158/0x460
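
For clarity, the offending sequence reduces to this pattern (a minimal
illustrative sketch, not the kernel code itself): zs_map_object() enters
atomic context by disabling preemption for the per-cpu mapping area and
taking zspage->lock, so nothing in the map/use/unmap window may sleep.

	preempt_disable();		/* zs_map_object(): atomic from here */
	/* ... caller reads/writes the mapped object ... */
	unmap_kernel_range(addr, PAGE_SIZE * 2);	/* zs_unmap_object() */
	/* -> vunmap_pmd_range() -> cond_resched(): might_sleep() fires */
	preempt_enable();		/* only here may the task sleep again */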

We can fix this by removing the ZSMALLOC_PGTABLE_MAPPING feature (which
contains the offending calling code) from zsmalloc.

Even though this option showed some performance improvement (e.g., 30%)
on some arm32 platforms, it has been a headache to maintain since it
abused APIs[1] (e.g., unmap_kernel_range() in atomic context).

Since we are moving toward deprecating 32-bit machines, the config
option has only been available for builtin builds since v5.8, and it
has never been the default in zsmalloc, it's time to drop the option
for better maintainability.

[1] http://lore.kernel.org/linux-mm/20201105170249.387069-1-minchan@kernel.org

Link: https://lkml.kernel.org/r/20201117202916.GA3856507@google.com
Fixes: e47110e ("mm/vunmap: add cond_resched() in vunmap_pmd_range")
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Tony Lindgren <tony@atomide.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Harish Sriram <harish@linux.ibm.com>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
minchank authored and sfrothwell committed Dec 3, 2020
1 parent e276a5b commit cd59e0a
Showing 4 changed files with 0 additions and 69 deletions.
1 change: 0 additions & 1 deletion arch/arm/configs/omap2plus_defconfig

@@ -81,7 +81,6 @@ CONFIG_PARTITION_ADVANCED=y
 CONFIG_BINFMT_MISC=y
 CONFIG_CMA=y
 CONFIG_ZSMALLOC=m
-CONFIG_ZSMALLOC_PGTABLE_MAPPING=y
 CONFIG_NET=y
 CONFIG_PACKET=y
 CONFIG_UNIX=y
1 change: 0 additions & 1 deletion include/linux/zsmalloc.h

@@ -20,7 +20,6 @@
  * zsmalloc mapping modes
  *
  * NOTE: These only make a difference when a mapped object spans pages.
- * They also have no effect when ZSMALLOC_PGTABLE_MAPPING is selected.
  */
 enum zs_mapmode {
 	ZS_MM_RW, /* normal read-write mapping */
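
As context for the comment change above: with page-table mapping gone,
the mapping modes only affect the copy-based per-cpu buffer path.  A
hedged usage sketch (simplified; a real caller such as zram does more
bookkeeping):

	/* ZS_MM_RO skips the copy-back at unmap; ZS_MM_WO skips the
	 * copy-in at map.  Error handling omitted. */
	void *src = zs_map_object(pool, handle, ZS_MM_RO);
	memcpy(buf, src, obj_size);
	zs_unmap_object(pool, handle);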
13 changes: 0 additions & 13 deletions mm/Kconfig

@@ -707,19 +707,6 @@ config ZSMALLOC
 	  returned by an alloc(). This handle must be mapped in order to
 	  access the allocated space.
 
-config ZSMALLOC_PGTABLE_MAPPING
-	bool "Use page table mapping to access object in zsmalloc"
-	depends on ZSMALLOC=y
-	help
-	  By default, zsmalloc uses a copy-based object mapping method to
-	  access allocations that span two pages. However, if a particular
-	  architecture (ex, ARM) performs VM mapping faster than copying,
-	  then you should select this. This causes zsmalloc to use page table
-	  mapping rather than copying for object mapping.
-
-	  You can check speed with zsmalloc benchmark:
-	  https://github.com/spartacus06/zsmapbench
-
 config ZSMALLOC_STAT
 	bool "Export zsmalloc statistics"
 	depends on ZSMALLOC
54 changes: 0 additions & 54 deletions mm/zsmalloc.c

@@ -293,11 +293,7 @@ struct zspage {
 };
 
 struct mapping_area {
-#ifdef CONFIG_ZSMALLOC_PGTABLE_MAPPING
-	struct vm_struct *vm; /* vm area for mapping object that span pages */
-#else
 	char *vm_buf; /* copy buffer for objects that span pages */
-#endif
 	char *vm_addr; /* address of kmap_atomic()'ed pages */
 	enum zs_mapmode vm_mm; /* mapping mode */
 };
@@ -1113,54 +1109,6 @@ static struct zspage *find_get_zspage(struct size_class *class)
 	return zspage;
 }
 
-#ifdef CONFIG_ZSMALLOC_PGTABLE_MAPPING
-static inline int __zs_cpu_up(struct mapping_area *area)
-{
-	/*
-	 * Make sure we don't leak memory if a cpu UP notification
-	 * and zs_init() race and both call zs_cpu_up() on the same cpu
-	 */
-	if (area->vm)
-		return 0;
-	area->vm = get_vm_area(PAGE_SIZE * 2, 0);
-	if (!area->vm)
-		return -ENOMEM;
-
-	/*
-	 * Populate ptes in advance to avoid pte allocation with GFP_KERNEL
-	 * in non-preemtible context of zs_map_object.
-	 */
-	return apply_to_page_range(&init_mm, (unsigned long)area->vm->addr,
-			PAGE_SIZE * 2, NULL, NULL);
-}
-
-static inline void __zs_cpu_down(struct mapping_area *area)
-{
-	if (area->vm)
-		free_vm_area(area->vm);
-	area->vm = NULL;
-}
-
-static inline void *__zs_map_object(struct mapping_area *area,
-				struct page *pages[2], int off, int size)
-{
-	unsigned long addr = (unsigned long)area->vm->addr;
-
-	BUG_ON(map_kernel_range(addr, PAGE_SIZE * 2, PAGE_KERNEL, pages) < 0);
-	area->vm_addr = area->vm->addr;
-	return area->vm_addr + off;
-}
-
-static inline void __zs_unmap_object(struct mapping_area *area,
-				struct page *pages[2], int off, int size)
-{
-	unsigned long addr = (unsigned long)area->vm_addr;
-
-	unmap_kernel_range(addr, PAGE_SIZE * 2);
-}
-
-#else /* CONFIG_ZSMALLOC_PGTABLE_MAPPING */
-
 static inline int __zs_cpu_up(struct mapping_area *area)
 {
 	/*
@@ -1241,8 +1189,6 @@ static void __zs_unmap_object(struct mapping_area *area,
 	pagefault_enable();
 }
 
-#endif /* CONFIG_ZSMALLOC_PGTABLE_MAPPING */
-
 static int zs_cpu_prepare(unsigned int cpu)
 {
 	struct mapping_area *area;
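
After this patch, the copy-based path is the only implementation left.
For reference, it works roughly as below (condensed from the surviving
__zs_map_object(); abridged, so details may differ from the exact
source):

	static void *__zs_map_object(struct mapping_area *area,
				struct page *pages[2], int off, int size)
	{
		int sizes[2];
		void *addr;
		char *buf = area->vm_buf;

		/* disable page faults to match kmap_atomic() semantics */
		pagefault_disable();

		/* write-only mappings need no copy-in */
		if (area->vm_mm == ZS_MM_WO)
			goto out;

		/* stitch the two page halves together in the per-cpu buffer */
		sizes[0] = PAGE_SIZE - off;
		sizes[1] = size - sizes[0];

		addr = kmap_atomic(pages[0]);
		memcpy(buf, addr + off, sizes[0]);
		kunmap_atomic(addr);
		addr = kmap_atomic(pages[1]);
		memcpy(buf + sizes[0], addr, sizes[1]);
		kunmap_atomic(addr);
	out:
		return area->vm_buf;
	}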
