[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CACT4Y+bvLj+qVYXf1fQuf_NKdCzkuDWs5+r-PomTdCU2MOkP5g@mail.gmail.com>
Date: Tue, 2 Aug 2022 08:59:01 +0200
From: Dmitry Vyukov <dvyukov@...gle.com>
To: Vlastimil Babka <vbabka@...e.cz>
Cc: Feng Tang <feng.tang@...el.com>,
"Sang, Oliver" <oliver.sang@...el.com>, lkp <lkp@...el.com>,
LKML <linux-kernel@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"lkp@...ts.01.org" <lkp@...ts.01.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Christoph Lameter <cl@...ux.com>,
Pekka Enberg <penberg@...nel.org>,
David Rientjes <rientjes@...gle.com>,
Joonsoo Kim <iamjoonsoo.kim@....com>,
Roman Gushchin <roman.gushchin@...ux.dev>,
Hyeonggon Yoo <42.hyeyoo@...il.com>,
"Hansen, Dave" <dave.hansen@...el.com>,
Robin Murphy <robin.murphy@....com>,
John Garry <john.garry@...wei.com>,
Kefeng Wang <wangkefeng.wang@...wei.com>,
Andrey Konovalov <andreyknvl@...il.com>,
Andrey Ryabinin <ryabinin.a.a@...il.com>,
Alexander Potapenko <glider@...gle.com>,
kasan-dev@...glegroups.com
Subject: Re: [mm/slub] 3616799128: BUG_kmalloc-#(Not_tainted):kmalloc_Redzone_overwritten
On Mon, 1 Aug 2022 at 16:23, Vlastimil Babka <vbabka@...e.cz> wrote:
>
> On 8/1/22 08:21, Feng Tang wrote:
> > On Sun, Jul 31, 2022 at 04:16:53PM +0800, Tang, Feng wrote:
> >> Hi Oliver,
> >>
> >> On Sun, Jul 31, 2022 at 02:53:17PM +0800, Sang, Oliver wrote:
> >> >
> >> >
> >> > Greeting,
> >> >
> >> > FYI, we noticed the following commit (built with gcc-11):
> >> >
> >> > commit: 3616799128612e04ed919579e2c7b0dccf6bcb00 ("[PATCH v3 3/3] mm/slub: extend redzone check to cover extra allocated kmalloc space than requested")
> >> > url: https://github.com/intel-lab-lkp/linux/commits/Feng-Tang/mm-slub-some-debug-enhancements/20220727-151318
> >> > base: git://git.kernel.org/cgit/linux/kernel/git/vbabka/slab.git for-next
> >> > patch link: https://lore.kernel.org/linux-mm/20220727071042.8796-4-feng.tang@intel.com
> >> >
> >> > in testcase: boot
> >> >
> >> > on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
> >> >
> >> > caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
> >> >
> >> >
> >> > If you fix the issue, kindly add following tag
> >> > Reported-by: kernel test robot <oliver.sang@...el.com>
> >> >
> >> >
> >> > [ 50.637839][ T154] =============================================================================
> >> > [ 50.639937][ T154] BUG kmalloc-16 (Not tainted): kmalloc Redzone overwritten
> >> > [ 50.641291][ T154] -----------------------------------------------------------------------------
> >> > [ 50.641291][ T154]
> >> > [ 50.643617][ T154] 0xffff88810018464c-0xffff88810018464f @offset=1612. First byte 0x7 instead of 0xcc
> >> > [ 50.645311][ T154] Allocated in __sdt_alloc+0x258/0x457 age=14287 cpu=0 pid=1
> >> > [ 50.646584][ T154] ___slab_alloc+0x52b/0x5b6
> >> > [ 50.647411][ T154] __slab_alloc+0x1a/0x22
> >> > [ 50.648374][ T154] __kmalloc_node+0x10c/0x1e1
> >> > [ 50.649237][ T154] __sdt_alloc+0x258/0x457
> >> > [ 50.650060][ T154] build_sched_domains+0xae/0x10e8
> >> > [ 50.650981][ T154] sched_init_smp+0x30/0xa5
> >> > [ 50.651805][ T154] kernel_init_freeable+0x1c6/0x23b
> >> > [ 50.652767][ T154] kernel_init+0x14/0x127
> >> > [ 50.653594][ T154] ret_from_fork+0x1f/0x30
> >> > [ 50.654414][ T154] Slab 0xffffea0004006100 objects=28 used=28 fp=0x0000000000000000 flags=0x1fffc0000000201(locked|slab|node=0|zone=1|lastcpupid=0x3fff)
> >> > [ 50.656866][ T154] Object 0xffff888100184640 @offset=1600 fp=0xffff888100184520
> >> > [ 50.656866][ T154]
> >> > [ 50.658410][ T154] Redzone ffff888100184630: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
> >> > [ 50.660047][ T154] Object ffff888100184640: 00 32 80 00 81 88 ff ff 01 00 00 00 07 00 80 8a .2..............
> >> > [ 50.661837][ T154] Redzone ffff888100184650: cc cc cc cc cc cc cc cc ........
> >> > [ 50.663454][ T154] Padding ffff8881001846b4: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZ
> >> > [ 50.665225][ T154] CPU: 0 PID: 154 Comm: systemd-udevd Not tainted 5.19.0-rc5-00010-g361679912861 #1
> >> > [ 50.666861][ T154] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-debian-1.16.0-4 04/01/2014
> >> > [ 50.668694][ T154] Call Trace:
> >> > [ 50.669331][ T154] <TASK>
> >> > [ 50.669832][ T154] dump_stack_lvl+0x57/0x7d
> >> > [ 50.670601][ T154] check_bytes_and_report+0xca/0xfe
> >> > [ 50.671436][ T154] check_object+0xdc/0x24d
> >> > [ 50.672163][ T154] free_debug_processing+0x98/0x210
> >> > [ 50.673904][ T154] __slab_free+0x46/0x198
> >> > [ 50.675746][ T154] qlist_free_all+0xae/0xde
> >> > [ 50.676552][ T154] kasan_quarantine_reduce+0x10d/0x145
> >> > [ 50.677507][ T154] __kasan_slab_alloc+0x1c/0x5a
> >> > [ 50.678327][ T154] slab_post_alloc_hook+0x5a/0xa2
> >> > [ 50.680069][ T154] kmem_cache_alloc+0x102/0x135
> >> > [ 50.680938][ T154] getname_flags+0x4b/0x314
> >> > [ 50.681781][ T154] do_sys_openat2+0x7a/0x15c
> >> > [ 50.706848][ T154] Disabling lock debugging due to kernel taint
> >> > [ 50.707913][ T154] FIX kmalloc-16: Restoring kmalloc Redzone 0xffff88810018464c-0xffff88810018464f=0xcc
> >>
> >> Thanks for the report!
> >>
> >> From the log it happened when kasan is enabled, and my first guess is
> >> the data processing from kmalloc redzone handling had some conflict
> >> with kasan's in allocation path (though I tested some kernel config
> >> with KASAN enabled)
> >>
> >> Will study more about kasan and reproduce/debug this. thanks
> >
> > Cc kansan mail list.
> >
> > This is really related with KASAN debug, that in free path, some
> > kmalloc redzone ([orig_size+1, object_size]) area is written by
> > kasan to save free meta info.
> >
> > The callstack is:
> >
> > kfree
> > slab_free
> > slab_free_freelist_hook
> > slab_free_hook
> > __kasan_slab_free
> > ____kasan_slab_free
> > kasan_set_free_info
> > kasan_set_track
> >
> > And this issue only happens with "kmalloc-16" slab. Kasan has 2
> > tracks: alloc_track and free_track, for x86_64 test platform, most
> > of the slabs will reserve space for alloc_track, and reuse the
> > 'object' area for free_track. The kasan free_track is 16 bytes
> > large, that it will occupy the whole 'kmalloc-16's object area,
> > so when kmalloc-redzone is enabled by this patch, the 'overwritten'
> > error is triggered.
> >
> > But it won't hurt other kmalloc slabs, as kasan's free meta won't
> > conflict with kmalloc-redzone which stay in the latter part of
> > kmalloc area.
> >
> > So the solution I can think of is:
> > * skip the kmalloc-redzone for kmalloc-16 only, or
> > * skip kmalloc-redzone if kasan is enabled, or
> > * let kasan reserve the free meta (16 bytes) outside of object
> > just like for alloc meta
>
> Maybe we could add some hack that if both kasan and SLAB_STORE_USER is
> enabled, we bump the stored orig_size from <16 to 16? Similar to what
> __ksize() does.
This looks like the simplest workaround. And with a proper comment I
think it's fine.
> > I don't have way to test kasan's SW/HW tag configuration, which
> > is only enabled on arm64 now. And I don't know if there will
> > also be some conflict.
> >
> > Thanks,
> > Feng
> >
>
Powered by blists - more mailing lists