Message-ID: <aTsvH422vqivZs8g@fedora>
Date: Thu, 11 Dec 2025 12:52:47 -0800
From: "Vishal Moola (Oracle)" <vishal.moola@...il.com>
To: Oliver Sang <oliver.sang@...el.com>
Cc: Uladzislau Rezki <urezki@...il.com>, oe-lkp@...ts.linux.dev,
lkp@...el.com, linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org
Subject: Re: [linus:master] [mm/vmalloc] a061578043:
BUG:spinlock_trylock_failure_on_UP_on_CPU
On Thu, Dec 11, 2025 at 09:40:07PM +0800, Oliver Sang wrote:
> hi, Uladzislau Rezki,
>
> On Thu, Dec 11, 2025 at 11:09:43AM +0100, Uladzislau Rezki wrote:
> > On Wed, Dec 10, 2025 at 03:30:51PM -0800, Vishal Moola (Oracle) wrote:
> > > On Wed, Dec 10, 2025 at 02:10:28PM +0800, kernel test robot wrote:
> > > >
> > > >
> > > > Hello,
> > > >
> > > > kernel test robot noticed "BUG:spinlock_trylock_failure_on_UP_on_CPU" on:
> > > >
> > > > commit: a0615780439938e8e61343f1f92a4c54a71dc6a5 ("mm/vmalloc: request large order pages from buddy allocator")
> > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > > >
> > > > [test failed on linus/master cb015814f8b6eebcbb8e46e111d108892c5e6821]
> > > > [test failed on linux-next/master c75caf76ed86bbc15a72808f48f8df1608a0886c]
> > > >
> > > > in testcase: trinity
> > > > version:
> > > > with following parameters:
> > > >
> > > > runtime: 300s
> > > > group: group-03
> > > > nr_groups: 5
> > > >
> > > >
> > > >
> > > > config: x86_64-randconfig-011-20251207
> > > > compiler: clang-20
> > > > test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 32G
> > > >
> > > > (please refer to attached dmesg/kmsg for entire log/backtrace)
> > > >
> > > >
> > > > the issue shows up randomly (~50%) in tests.
> > > >
> > > > 645a3c4243473d5c a0615780439938e8e61343f1f92
> > > > ---------------- ---------------------------
> > > >        fail:runs  %reproduction    fail:runs
> > > >            |             |             |
> > > >           :60           50%         29:60    dmesg.BUG:spinlock_trylock_failure_on_UP_on_CPU
> > > >           :60           50%         29:60    dmesg.RIP:_raw_spin_unlock_irqrestore
> > > >
> > > >
> > > >
> > > > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > > > the same patch/commit), kindly add the following tags
> > > > | Reported-by: kernel test robot <oliver.sang@...el.com>
> > > > | Closes: https://lore.kernel.org/oe-lkp/202512101320.e2f2dd6f-lkp@intel.com
> > > >
> > > >
> > > > [ 1046.632156][ C0] BUG: spinlock trylock failure on UP on CPU#0, kcompactd0/28
> > > > [ 1046.633368][ C0] lock: 0xffff888807e35ef0, .magic: dead4ead, .owner: kcompactd0/28, .owner_cpu: 0
> > > > [ 1046.634872][ C0] CPU: 0 UID: 0 PID: 28 Comm: kcompactd0 Not tainted 6.18.0-rc5-00127-ga06157804399 #1 PREEMPT 8cc09ef94dcec767faa911515ce9e609c45db470
> > > > [ 1046.637019][ C0] Call Trace:
> > > > [ 1046.637563][ C0] <IRQ>
> > > > [ 1046.638038][ C0] __dump_stack (lib/dump_stack.c:95)
> > > > [ 1046.638781][ C0] dump_stack_lvl (lib/dump_stack.c:123)
> > > > [ 1046.639512][ C0] dump_stack (lib/dump_stack.c:130)
> > > > [ 1046.640168][ C0] spin_dump (kernel/locking/spinlock_debug.c:71)
> > > > [ 1046.640853][ C0] do_raw_spin_trylock (kernel/locking/spinlock_debug.c:?)
> > > > [ 1046.641678][ C0] _raw_spin_trylock (include/linux/spinlock_api_smp.h:89 kernel/locking/spinlock.c:138)
> > > > [ 1046.642473][ C0] __free_frozen_pages (mm/page_alloc.c:2973)
> > > > [ 1046.643279][ C0] ___free_pages (mm/page_alloc.c:5295)
> > > > [ 1046.643956][ C0] __free_pages (mm/page_alloc.c:5334)
> > > > [ 1046.644624][ C0] tlb_remove_table_rcu (include/linux/mm.h:? include/linux/mm.h:3122 include/asm-generic/tlb.h:220 mm/mmu_gather.c:227 mm/mmu_gather.c:290)
> > > > [ 1046.645520][ C0] ? __cfi_tlb_remove_table_rcu (mm/mmu_gather.c:289)
> > > > [ 1046.646384][ C0] ? rcu_core (kernel/rcu/tree.c:?)
> > > > [ 1046.647092][ C0] rcu_core (include/linux/rcupdate.h:341 kernel/rcu/tree.c:2607 kernel/rcu/tree.c:2861)
> > > > [ 1046.647774][ C0] rcu_core_si (kernel/rcu/tree.c:2879)
> > > > [ 1046.648439][ C0] handle_softirqs (arch/x86/include/asm/jump_label.h:36 include/trace/events/irq.h:142 kernel/softirq.c:623)
> > > > [ 1046.649202][ C0] __irq_exit_rcu (arch/x86/include/asm/jump_label.h:36 kernel/softirq.c:725)
> > > > [ 1046.649919][ C0] irq_exit_rcu (kernel/softirq.c:741)
> > > > [ 1046.650593][ C0] sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1052)
> > > > [ 1046.651520][ C0] </IRQ>
> > > > [ 1046.651984][ C0] <TASK>
> > > > [ 1046.652466][ C0] asm_sysvec_apic_timer_interrupt (arch/x86/include/asm/idtentry.h:697)
> > > > [ 1046.653389][ C0] RIP: 0010:_raw_spin_unlock_irqrestore (arch/x86/include/asm/preempt.h:95 include/linux/spinlock_api_smp.h:152 kernel/locking/spinlock.c:194)
> > > > [ 1046.654391][ C0] Code: 00 44 89 f6 c1 ee 09 48 c7 c7 e0 f2 7e 86 31 d2 31 c9 e8 e8 dd 80 fd 4d 85 f6 74 05 e8 de e5 fd ff 0f ba e3 09 73 01 fb 31 f6 <ff> 0d 2f dc 6f 01 0f 95 c3 40 0f 94 c6 48 c7 c7 10 f3 7e 86 31 d2
> > > > All code
> > > > ========
> > > > 0: 00 44 89 f6 add %al,-0xa(%rcx,%rcx,4)
> > > > 4: c1 ee 09 shr $0x9,%esi
> > > > 7: 48 c7 c7 e0 f2 7e 86 mov $0xffffffff867ef2e0,%rdi
> > > > e: 31 d2 xor %edx,%edx
> > > > 10: 31 c9 xor %ecx,%ecx
> > > > 12: e8 e8 dd 80 fd call 0xfffffffffd80ddff
> > > > 17: 4d 85 f6 test %r14,%r14
> > > > 1a: 74 05 je 0x21
> > > > 1c: e8 de e5 fd ff call 0xfffffffffffde5ff
> > > > 21: 0f ba e3 09 bt $0x9,%ebx
> > > > 25: 73 01 jae 0x28
> > > > 27: fb sti
> > > > 28: 31 f6 xor %esi,%esi
> > > > 2a:* ff 0d 2f dc 6f 01 decl 0x16fdc2f(%rip) # 0x16fdc5f <-- trapping instruction
> > > > 30: 0f 95 c3 setne %bl
> > > > 33: 40 0f 94 c6 sete %sil
> > > > 37: 48 c7 c7 10 f3 7e 86 mov $0xffffffff867ef310,%rdi
> > > > 3e: 31 d2 xor %edx,%edx
> > > >
> > > > Code starting with the faulting instruction
> > > > ===========================================
> > > > 0: ff 0d 2f dc 6f 01 decl 0x16fdc2f(%rip) # 0x16fdc35
> > > > 6: 0f 95 c3 setne %bl
> > > > 9: 40 0f 94 c6 sete %sil
> > > > d: 48 c7 c7 10 f3 7e 86 mov $0xffffffff867ef310,%rdi
> > > > 14: 31 d2 xor %edx,%edx
> > > > [ 1046.657511][ C0] RSP: 0000:ffffc900001cfb50 EFLAGS: 00000246
> > > > [ 1046.658482][ C0] RAX: 0000000000000000 RBX: 0000000000000206 RCX: 0000000000000000
> > > > [ 1046.659740][ C0] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> > > > [ 1046.660979][ C0] RBP: ffffc900001cfb68 R08: 0000000000000000 R09: 0000000000000000
> > > > [ 1046.662239][ C0] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888807e35f50
> > > > [ 1046.663505][ C0] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> > > > [ 1046.664741][ C0] free_pcppages_bulk (mm/page_alloc.c:1494)
> > > > [ 1046.665618][ C0] drain_pages_zone (include/linux/spinlock.h:391 mm/page_alloc.c:2632)
> > > > [ 1046.666374][ C0] __drain_all_pages (mm/page_alloc.c:2731)
> > > > [ 1046.667171][ C0] drain_all_pages (mm/page_alloc.c:2747)
> > > > [ 1046.667908][ C0] kcompactd (mm/compaction.c:3115)
> > > > [ 1046.668625][ C0] kthread (kernel/kthread.c:465)
> > > > [ 1046.669299][ C0] ? __cfi_kcompactd (mm/compaction.c:3166)
> > > > [ 1046.670046][ C0] ? __cfi_kthread (kernel/kthread.c:412)
> > > > [ 1046.670764][ C0] ret_from_fork (arch/x86/kernel/process.c:164)
> > > > [ 1046.671483][ C0] ? __cfi_kthread (kernel/kthread.c:412)
> > > > [ 1046.672174][ C0] ret_from_fork_asm (arch/x86/entry/entry_64.S:255)
> > > > [ 1046.672936][ C0] </TASK>
> > > >
> > > >
> > > >
> > > > The kernel config and materials to reproduce are available at:
> > > > https://download.01.org/0day-ci/archive/20251210/202512101320.e2f2dd6f-lkp@intel.com
> > > >
> > >
> > > Hmmm. This looks like a race condition tied to reclaim. I'm assuming
> > > we fail to allocate a page and kick off kswapd. Then, when we fall back
> > > to the bulk allocator, it tries to remove a pcp page at the same time as
> > > kswapd tries to reclaim it. Maybe?
> > >
> > > Does something like this fix it?
>
> the below patch failed to apply on top of a061578043.
>
> since, before reporting, we had also tested the latest mainline tip at that time:
> [test failed on linus/master cb015814f8b6eebcbb8e46e111d108892c5e6821]
> which still reproduced the issue, we applied the below patch on top of that commit.
>
>
> we found the issue still exists, though the rate seems to drop: we only reproduced
> the issue 4 times out of 100 runs.
>
> one dmesg attached FYI.
>
>
> [ 448.229620][ C0] BUG: spinlock trylock failure on UP on CPU#0, kcompactd0/28
> [ 448.230037][ C0] lock: 0xffff888807e75f08, .magic: dead4ead, .owner: kcompactd0/28, .owner_cpu: 0
> [ 448.230474][ C0] CPU: 0 UID: 0 PID: 28 Comm: kcompactd0 Not tainted 6.18.0-12693-g5c827f68ce27 #1 PREEMPT b3721ff206516c93c37a896ae4d3df039c2d04c2
> [ 448.231106][ C0] Call Trace:
> [ 448.231264][ C0] <IRQ>
> [ 448.231403][ C0] __dump_stack+0x19/0x40
> [ 448.231627][ C0] dump_stack_lvl+0x36/0xb8
> [ 448.231842][ C0] dump_stack+0x10/0x38
> [ 448.232101][ C0] spin_dump+0x11c/0x1c0
> [ 448.232404][ C0] do_raw_spin_trylock+0xe4/0x138
> [ 448.232758][ C0] _raw_spin_trylock+0x1a/0xb8
> [ 448.233104][ C0] __free_frozen_pages+0x750/0xa80
> [ 448.233456][ C0] ___free_pages+0x67/0x140
> [ 448.233765][ C0] __free_pages+0x10/0x38
> [ 448.234066][ C0] __tlb_remove_table+0x16a/0x200
> [ 448.234412][ C0] tlb_remove_table_rcu+0x77/0xf8
> [ 448.234771][ C0] ? __cfi_tlb_remove_table_rcu+0x8/0x8
> [ 448.235153][ C0] ? rcu_core+0x8af/0x1740
> [ 448.235456][ C0] rcu_core+0xa25/0x1740
> [ 448.235751][ C0] rcu_core_si+0xe/0x38
> [ 448.236038][ C0] handle_softirqs+0x1db/0x700
> [ 448.236371][ C0] __irq_exit_rcu+0x44/0xc0
> [ 448.236678][ C0] irq_exit_rcu+0xe/0x38
> [ 448.236968][ C0] sysvec_apic_timer_interrupt+0x78/0xb8
> [ 448.237382][ C0] </IRQ>
> [ 448.237588][ C0] <TASK>
> [ 448.237798][ C0] asm_sysvec_apic_timer_interrupt+0x1b/0x40
> [ 448.238214][ C0] RIP: 0010:_raw_spin_unlock_irqrestore+0x6b/0xb8
> [ 448.238575][ C0] Code: 00 44 89 f6 c1 ee 09 48 c7 c7 60 91 80 86 31 d2 31 c9 e8 e8 b4 86 fd 4d 85 f6 74 05 e8 9e e9 fd ff 0f ba e3 09 73 01 fb 31 f6 <ff> 0d ef b8 75 01 0f 95 c3 40 0f 94 c6 48 c7 c7 90 91 80 86 31 d2
> [ 448.239787][ C0] RSP: 0000:ffffc900001cfb48 EFLAGS: 00000246
> [ 448.240187][ C0] RAX: 0000000000000000 RBX: 0000000000000206 RCX: 0000000000000000
> [ 448.240700][ C0] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> [ 448.241230][ C0] RBP: ffffc900001cfb60 R08: 0000000000000000 R09: 0000000000000000
> [ 448.241758][ C0] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888807e75fc8
> [ 448.242289][ C0] R13: 0000000000000006 R14: 0000000000000000 R15: ffffea000cee6410
> [ 448.242840][ C0] ? _raw_spin_unlock_irqrestore+0x58/0xb8
> [ 448.243241][ C0] free_pcppages_bulk+0x4a2/0x500
> [ 448.243591][ C0] drain_pages_zone+0xa5/0x140
> [ 448.243936][ C0] __drain_all_pages+0x1ab/0x240
> [ 448.244277][ C0] drain_all_pages+0x10/0x38
> [ 448.244587][ C0] kcompactd+0x61f/0xaf8
> [ 448.244900][ C0] kthread+0x586/0x678
> [ 448.245183][ C0] ? __cfi_kcompactd+0x8/0x8
> [ 448.245504][ C0] ? __cfi_kthread+0x8/0x8
> [ 448.245815][ C0] ret_from_fork+0x24a/0x478
> [ 448.246138][ C0] ? __cfi_kthread+0x8/0x8
> [ 448.246453][ C0] ret_from_fork_asm+0x11/0x40
> [ 448.246795][ C0] </TASK>
> [ 693.283069][ T328] hwclock: can't open '/dev/misc/rtc': No such file or directory
> LKP: ttyS0: 289: LKP: tbox cant kexec and rebooting forcely
> [ 701.346273][ T289] sysrq: Emergency Sync
> [ 701.346778][ T10] Emergency Sync complete
> [ 701.347270][ T289] sysrq: Resetting
>
>
> > >
> > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > > index ecbac900c35f..0d1480723ddc 100644
> > > --- a/mm/vmalloc.c
> > > +++ b/mm/vmalloc.c
> > > @@ -3634,7 +3634,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
> > > struct page *page;
> > > int i;
> > > unsigned int large_order = ilog2(nr_remaining);
> > > - gfp_t large_gfp = vmalloc_gfp_adjust(gfp, large_order) & ~__GFP_DIRECT_RECLAIM;
> > > + gfp_t large_gfp = vmalloc_gfp_adjust(gfp, large_order) & ~__GFP_RECLAIM;
> > >
> > > large_order = min(max_attempt_order, large_order);
> > >
> > Some thoughts.
> >
> > That trace is not easy to follow. As for this change, it is probably
> > worth going with anyway: under low-memory conditions, which are quite
> > common on Android devices, we do not want to kick kswapd frequently
> > without good reason. We have a fallback to the single-page allocator.
Yeah. AFAICT, this trace doesn't seem to point at this patch:

__free_frozen_pages+0x750/0xa80 is:

	pcp = pcp_spin_trylock(zone->per_cpu_pageset, UP_flags);
	if (pcp) {
		if (!free_frozen_page_commit(zone, pcp, page, migratetype,
				order, fpi_flags, &UP_flags))
			return;
------------>	pcp_spin_unlock(pcp, UP_flags);

free_pcppages_bulk+0x4a2/0x500 is:

	spin_unlock_irqrestore(&zone->lock, flags);
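
For context, the "trylock failure on UP" splat comes from the spinlock
debug code: on a !CONFIG_SMP kernel a trylock can only fail if this CPU
already holds the lock, so the debug check treats any failure as a bug.
A rough sketch of the check (paraphrased from
kernel/locking/spinlock_debug.c; the exact shape may differ):

	int do_raw_spin_trylock(raw_spinlock_t *lock)
	{
		int ret = arch_spin_trylock(&lock->raw_lock);

		if (ret)
			debug_spin_lock_after(lock);
	#ifndef CONFIG_SMP
		/*
		 * Must not happen on UP: a failed trylock means the
		 * lock is already held on this (the only) CPU.
		 */
		SPIN_BUG_ON(!ret, lock, "trylock failure on UP");
	#endif
		return ret;
	}

If I'm reading the trace right, that matches: .owner is kcompactd0
itself, so the RCU softirq interrupted kcompactd in the drain path
while it still held the pcp lock, the pcp_spin_trylock() above failed
against it, and the debug check fired even though the free path can
fall back to taking zone->lock directly.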
Seeing how masking off reclaim reduced the occurrence from 50% -> 5%,
maybe it's some underlying race condition that is encouraged by the
patch? I'm not sure how often people are kicking off kswapd and then
immediately calling the bulk allocator...
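
As an aside on the flag change itself: __GFP_RECLAIM covers both
reclaim bits, so going from ~__GFP_DIRECT_RECLAIM to ~__GFP_RECLAIM
also stops the large-order attempt from waking kswapd. From
include/linux/gfp_types.h (modulo exact formatting):

	#define __GFP_RECLAIM \
		((__force gfp_t)(___GFP_DIRECT_RECLAIM | ___GFP_KSWAPD_RECLAIM))

With that masked off, the large-order allocation becomes purely
opportunistic, and only the single-page fallback can still enter
direct reclaim or kick kswapd.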