Message-ID: <aO1FYlFwnVajiB8V@hyeyoo>
Date: Tue, 14 Oct 2025 03:30:58 +0900
From: Harry Yoo <harry.yoo@...cle.com>
To: Vlastimil Babka <vbabka@...e.cz>
Cc: kernel test robot <oliver.sang@...el.com>,
Alexei Starovoitov <ast@...nel.org>, oe-lkp@...ts.linux.dev,
lkp@...el.com, linux-kernel@...r.kernel.org,
kasan-dev@...glegroups.com, cgroups@...r.kernel.org,
linux-mm@...ck.org
Subject: Re: [linus:master] [slab] af92793e52: BUG_kmalloc-#(Not_tainted):Freepointer_corrupt

On Mon, Oct 13, 2025 at 04:23:09PM +0200, Vlastimil Babka wrote:
> On 10/13/25 11:44, Harry Yoo wrote:
> > On Fri, Oct 10, 2025 at 04:39:12PM +0800, kernel test robot wrote:
> >>
> >>
> >> Hello,
> >>
> >> kernel test robot noticed "BUG_kmalloc-#(Not_tainted):Freepointer_corrupt" on:
> >>
> >> commit: af92793e52c3a99b828ed4bdd277fd3e11c18d08 ("slab: Introduce kmalloc_nolock() and kfree_nolock().")
> >> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git
> >>
> >> [test failed on linus/master ec714e371f22f716a04e6ecb2a24988c92b26911]
> >> [test failed on linux-next/master 0b2f041c47acb45db82b4e847af6e17eb66cd32d]
> >> [test failed on fix commit 83d59d81b20c09c256099d1c15d7da21969581bd]
> >>
> >> in testcase: trinity
> >> version: trinity-i386-abe9de86-1_20230429
> >> with following parameters:
> >>
> >> runtime: 300s
> >> group: group-01
> >> nr_groups: 5
> >>
> >> config: i386-randconfig-012-20251004
> >> compiler: gcc-14
> >> test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
> >>
> >> (please refer to attached dmesg/kmsg for entire log/backtrace)
> >>
> >> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> >> the same patch/commit), kindly add following tags
> >> | Reported-by: kernel test robot <oliver.sang@...el.com>
> >> | Closes: https://lore.kernel.org/oe-lkp/202510101652.7921fdc6-lkp@intel.com
> >>
> >> [ 66.142496][ C0] =============================================================================
> >> [ 66.146355][ C0] BUG kmalloc-96 (Not tainted): Freepointer corrupt
> >> [ 66.147370][ C0] -----------------------------------------------------------------------------
> >> [ 66.147370][ C0]
> >> [ 66.149155][ C0] Allocated in alloc_slab_obj_exts+0x33c/0x460 age=7 cpu=0 pid=3651
> >> [ 66.150496][ C0] kmalloc_nolock_noprof (mm/slub.c:4798 mm/slub.c:5658)
> >> [ 66.151371][ C0] alloc_slab_obj_exts (mm/slub.c:2102 (discriminator 3))
> >> [ 66.152250][ C0] __alloc_tagging_slab_alloc_hook (mm/slub.c:2208 (discriminator 1) mm/slub.c:2224 (discriminator 1))
> >> [ 66.153248][ C0] __kmalloc_cache_noprof (mm/slub.c:5698)
> >> [ 66.154093][ C0] set_mm_walk (include/linux/slab.h:953 include/linux/slab.h:1090 mm/vmscan.c:3852)
> >> [ 66.154810][ C0] try_to_inc_max_seq (mm/vmscan.c:4077)
> >> [ 66.155627][ C0] try_to_shrink_lruvec (mm/vmscan.c:4860 mm/vmscan.c:4903)
> >> [ 66.156512][ C0] shrink_node (mm/vmscan.c:4952 mm/vmscan.c:5091 mm/vmscan.c:6078)
> >> [ 66.157363][ C0] do_try_to_free_pages (mm/vmscan.c:6336 mm/vmscan.c:6398)
> >> [ 66.158233][ C0] try_to_free_pages (mm/vmscan.c:6644)
> >> [ 66.159023][ C0] __alloc_pages_slowpath+0x28b/0x6e0
> >> [ 66.159977][ C0] __alloc_frozen_pages_noprof (mm/page_alloc.c:5161)
> >> [ 66.160941][ C0] __folio_alloc_noprof (mm/page_alloc.c:5183 mm/page_alloc.c:5192)
> >> [ 66.161739][ C0] shmem_alloc_and_add_folio+0x40/0x200
> >> [ 66.162752][ C0] shmem_get_folio_gfp+0x30b/0x880
> >> [ 66.163649][ C0] shmem_fallocate (mm/shmem.c:3813)
> >> [ 66.164498][ C0] Freed in kmem_cache_free_bulk+0x1b/0x50 age=89 cpu=1 pid=248
> >
> >> [ 66.169568][ C0] kmem_cache_free_bulk (mm/slub.c:4875 (discriminator 3) mm/slub.c:5197 (discriminator 3) mm/slub.c:5228 (discriminator 3))
> >> [ 66.170518][ C0] kmem_cache_free_bulk (mm/slub.c:7226)
> >> [ 66.171368][ C0] kvfree_rcu_bulk (include/linux/slab.h:827 mm/slab_common.c:1522)
> >> [ 66.172133][ C0] kfree_rcu_monitor (mm/slab_common.c:1728 (discriminator 3) mm/slab_common.c:1802 (discriminator 3))
> >> [ 66.173002][ C0] kfree_rcu_shrink_scan (mm/slab_common.c:2155)
> >> [ 66.173852][ C0] do_shrink_slab (mm/shrinker.c:438)
> >> [ 66.174640][ C0] shrink_slab (mm/shrinker.c:665)
> >> [ 66.175446][ C0] shrink_node (mm/vmscan.c:338 (discriminator 1) mm/vmscan.c:4960 (discriminator 1) mm/vmscan.c:5091 (discriminator 1) mm/vmscan.c:6078 (discriminator 1))
> >> [ 66.176205][ C0] do_try_to_free_pages (mm/vmscan.c:6336 mm/vmscan.c:6398)
> >> [ 66.177017][ C0] try_to_free_pages (mm/vmscan.c:6644)
> >> [ 66.177808][ C0] __alloc_pages_slowpath+0x28b/0x6e0
> >> [ 66.178851][ C0] __alloc_frozen_pages_noprof (mm/page_alloc.c:5161)
> >> [ 66.179753][ C0] __folio_alloc_noprof (mm/page_alloc.c:5183 mm/page_alloc.c:5192)
> >> [ 66.180583][ C0] folio_prealloc+0x36/0x160
> >> [ 66.181430][ C0] do_anonymous_page (mm/memory.c:4997 mm/memory.c:5054)
> >> [ 66.182288][ C0] do_pte_missing (mm/memory.c:4232)
> >
> > So here we are freeing an object that is allocated via kmalloc_nolock().
> > (And before being allocated via kmalloc_nolock(), it was freed via
> > kfree_rcu()).
> >
> >> [ 66.183062][ C0] Slab 0xe41bfb28 objects=21 used=17 fp=0xedf89320 flags=0x40000200(workingset|zone=1)
> >> [ 66.184609][ C0] Object 0xedf89b60 @offset=2912 fp=0xeac7a8b4
> >
> > fp=0xeac7a8b4
> >
> > the address of the object is: 0xedf89b60.
> >
> > 0xedf89b60 - 0xeac7a8b4 = 0x330f2ac
> >
> > If FP was not corrupted, the object pointed to by FP is
> > too far away for them to be in the same slab.
> >
> > That may suggest that some code built a list of free objects
> > across multiple slabs/caches. That's what deferred free does!
> >
> > But in free_deferred_objects(), we have:
> >> /*
> >> * In PREEMPT_RT irq_work runs in per-cpu kthread, so it's safe
> >> * to take sleeping spin_locks from __slab_free() and deactivate_slab().
> >> * In !PREEMPT_RT irq_work will run after local_unlock_irqrestore().
> >> */
> >> static void free_deferred_objects(struct irq_work *work)
> >> {
> >> struct defer_free *df = container_of(work, struct defer_free, work);
> >> struct llist_head *objs = &df->objects;
> >> struct llist_head *slabs = &df->slabs;
> >> struct llist_node *llnode, *pos, *t;
> >>
> >> if (llist_empty(objs) && llist_empty(slabs))
> >> return;
> >>
> >> llnode = llist_del_all(objs);
> >> llist_for_each_safe(pos, t, llnode) {
> >> struct kmem_cache *s;
> >> struct slab *slab;
> >> void *x = pos;
> >>
> >> slab = virt_to_slab(x);
> >> s = slab->slab_cache;
> >>
> >> /*
> >> * We used freepointer in 'x' to link 'x' into df->objects.
> >> * Clear it to NULL to avoid false positive detection
> >> * of "Freepointer corruption".
> >> */
> >> *(void **)x = NULL;
>
> Oh wait, isn't it just the case that this is not using set_freepointer() and
> with CONFIG_SLAB_FREELIST_HARDENED even the NULL is encoded as a non-NULL?
Oh, great observation! Obviously it should be fixed.
The fix posted in the other email looks great to me.
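
For anyone following along, here is a minimal userspace sketch of why a
raw NULL store can trip the check. It assumes the 32-bit pointers of the
i386 config in this report and mirrors the XOR scheme of
freelist_ptr_encode() under CONFIG_SLAB_FREELIST_HARDENED
(stored = ptr ^ s->random ^ swab(ptr_addr)). The names model_encode(),
model_decode(), model_swab32() and the value of model_random are made up
for illustration; this is a model, not the kernel code.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-cache s->random secret. */
static const uint32_t model_random = 0x12345678u;

/* Byte-swap a 32-bit value, as swab() would on a 32-bit kernel. */
static uint32_t model_swab32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Value that gets stored in the freepointer slot located at ptr_addr. */
static uint32_t model_encode(uint32_t ptr, uint32_t ptr_addr)
{
	return ptr ^ model_random ^ model_swab32(ptr_addr);
}

/* XOR is its own inverse, so decoding applies the same operation. */
static uint32_t model_decode(uint32_t stored, uint32_t ptr_addr)
{
	return stored ^ model_random ^ model_swab32(ptr_addr);
}
```

In this model, model_decode(0, addr) yields model_random ^ swab(addr),
which is non-NULL for almost any address, whereas
model_decode(model_encode(0, addr), addr) yields 0. That is consistent
with the point above that the NULL should be written via
set_freepointer() rather than stored directly.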
--
Cheers,
Harry / Hyeonggon
> >>
> >> /* Point 'x' back to the beginning of allocated object */
> >> x -= s->offset;
> >> __slab_free(s, slab, x, x, 1, _THIS_IP_);
> >> }
> >>
> >
> > This should have cleared the FP before freeing it.
> >
> > Oh wait, there are more in the dmesg:
> >> [ 67.073014][ C1] ------------[ cut here ]------------
> >> [ 67.074039][ C1] WARNING: CPU: 1 PID: 3894 at mm/slub.c:1209 object_err+0x4d/0x6d
> >> [ 67.075394][ C1] Modules linked in: evdev serio_raw tiny_power_button fuse drm drm_panel_orientation_quirks stm_p_basic
> >> [ 67.077222][ C1] CPU: 1 UID: 0 PID: 3894 Comm: sed Tainted: G B W 6.17.0-rc3-00014-gaf92793e52c3 #1 PREEMPTLAZY 2cffa6c1ad8b595a5f5738a3e143d70494d8da79
> >> [ 67.079495][ C1] Tainted: [B]=BAD_PAGE, [W]=WARN
> >> [ 67.080303][ C1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> >> [ 67.085915][ C1] EIP: object_err+0x4d/0x6d
> >> [ 67.086691][ C1] Code: 8b 45 fc e8 95 fe ff ff ba 01 00 00 00 b8 05 00 00 00 e8 46 1e 12 00 6a 01 31 c9 ba 01 00 00 00 b8 f8 84 76 db e8 b3 e1 2b 00 <0f> 0b 6a 01 31 c9 ba 01 00 00 00 b8 e0 84 76 db e8 9e e1 2b 00 83
> >> [ 67.089537][ C1] EAX: 00000000 EBX: c10012c0 ECX: 00000000 EDX: 00000000
> >> [ 67.090581][ C1] ESI: aacfa894 EDI: edf89320 EBP: ed7477b8 ESP: ed7477a0
> >> [ 67.091578][ C1] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010046
> >> [ 67.092767][ C1] CR0: 80050033 CR2: b7fa58c8 CR3: 01b5b000 CR4: 000406d0
> >> [ 67.093840][ C1] Call Trace:
> >> [ 67.094450][ C1] check_object.cold+0x11/0x17
> >> [ 67.095280][ C1] free_debug_processing+0x111/0x300
> >> [ 67.096076][ C1] free_to_partial_list+0x62/0x440
> >> [ 67.101664][ C1] ? free_deferred_objects+0x3e/0x110
> >> [ 67.104785][ C1] __slab_free+0x2b7/0x5d0
> >> [ 67.105539][ C1] ? free_deferred_objects+0x3e/0x110
> >> [ 67.106362][ C1] ? rcu_is_watching+0x3f/0x80
> >> [ 67.107090][ C1] free_deferred_objects+0x4d/0x110
> >
> > Hmm... did we somehow clear wrong FP or is the freepointer set again
> > after we cleared it?