Message-Id: <555DF274-355C-4D96-A71F-8E74436D5587@linux.dev>
Date: Sat, 19 Apr 2025 22:20:33 +0800
From: Muchun Song <muchun.song@...ux.dev>
To: Johannes Weiner <hannes@...xchg.org>
Cc: Muchun Song <songmuchun@...edance.com>,
mhocko@...nel.org,
roman.gushchin@...ux.dev,
shakeel.butt@...ux.dev,
akpm@...ux-foundation.org,
david@...morbit.com,
zhengqi.arch@...edance.com,
yosry.ahmed@...ux.dev,
nphamcs@...il.com,
chengming.zhou@...ux.dev,
linux-kernel@...r.kernel.org,
cgroups@...r.kernel.org,
linux-mm@...ck.org,
hamzamahfooz@...ux.microsoft.com,
apais@...ux.microsoft.com
Subject: Re: [PATCH RFC 06/28] mm: thp: introduce folio_split_queue_lock and
its variants
> On Apr 19, 2025, at 03:50, Johannes Weiner <hannes@...xchg.org> wrote:
>
> On Tue, Apr 15, 2025 at 10:45:10AM +0800, Muchun Song wrote:
>> @@ -4202,7 +4248,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
>> if (!--sc->nr_to_scan)
>> break;
>> }
>> - spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
>> + split_queue_unlock_irqrestore(ds_queue, flags);
>>
>> list_for_each_entry_safe(folio, next, &list, _deferred_list) {
>> bool did_split = false;
>> @@ -4251,7 +4297,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
>> spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
>> list_splice_tail(&list, &ds_queue->split_queue);
>> ds_queue->split_queue_len -= removed;
>> - spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
>> + split_queue_unlock_irqrestore(ds_queue, flags);
>
> These just tripped up in my testing. You use the new helpers for
> unlock, but not for the lock path. That's fine in this patch, but when
> "mm: thp: prepare for reparenting LRU pages for split queue lock" adds
> the rcu locking to the helpers, it results in missing rcu read locks:
Good catch! Thanks for pointing it out. You are right, I shouldn't use the
new unlock helpers here without the corresponding new lock helpers. I'll
revert this change in deferred_split_scan().
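
For illustration only, here is a rough user-space model of the imbalance
described above. It is not kernel code: the model_* names are hypothetical,
and plain counters stand in for the spinlock and for RCU read-side nesting.
It assumes the later patch makes the lock helper enter an RCU read-side
section and the unlock helper leave it, which is what the lockdep report
suggests.

```c
#include <assert.h>

/* Hypothetical stand-in for the deferred split queue; not the kernel struct. */
struct model_queue {
	int locked;	/* models split_queue_lock being held */
	int rcu_depth;	/* models RCU read-side nesting depth */
	int imbalance;	/* set when an RCU unlock has no matching lock */
};

/* Models the open-coded spin_lock_irqsave() on split_queue_lock. */
static void model_raw_lock(struct model_queue *q)
{
	q->locked = 1;
}

/* Models the lock helper once RCU locking is added (assumption). */
static void model_helper_lock(struct model_queue *q)
{
	q->rcu_depth++;
	q->locked = 1;
}

/* Models the unlock helper once RCU locking is added (assumption). */
static void model_helper_unlock(struct model_queue *q)
{
	q->locked = 0;
	if (q->rcu_depth == 0) {
		/* Nothing to release: the analogue of "bad unlock balance". */
		q->imbalance = 1;
		return;
	}
	q->rcu_depth--;
}

/* Open-coded lock paired with the helper unlock, as in the quoted hunk. */
static int model_scan_mixed(void)
{
	struct model_queue q = { 0, 0, 0 };

	model_raw_lock(&q);
	model_helper_unlock(&q);
	return q.imbalance;
}

/* Helper lock paired with helper unlock: balanced. */
static int model_scan_paired(void)
{
	struct model_queue q = { 0, 0, 0 };

	model_helper_lock(&q);
	model_helper_unlock(&q);
	return q.imbalance || q.rcu_depth != 0;
}
```

In this model, model_scan_mixed() reports an imbalance while
model_scan_paired() does not, which is why the lock and unlock sides need to
be converted (or left open-coded) together.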
Muchun,
Thanks.
>
> [ 108.814880]
> [ 108.816378] =====================================
> [ 108.821069] WARNING: bad unlock balance detected!
> [ 108.825762] 6.15.0-rc2-00028-g570c8034f057 #192 Not tainted
> [ 108.831323] -------------------------------------
> [ 108.836016] cc1/2031 is trying to release lock (rcu_read_lock) at:
> [ 108.842181] [<ffffffff815f9d05>] deferred_split_scan+0x235/0x4b0
> [ 108.848179] but there are no more locks to release!
> [ 108.853046]
> [ 108.853046] other info that might help us debug this:
> [ 108.859553] 2 locks held by cc1/2031:
> [ 108.863211] #0: ffff88801ddbbd88 (vm_lock){....}-{0:0}, at: do_user_addr_fault+0x19c/0x6b0
> [ 108.871544] #1: ffffffff83042400 (fs_reclaim){....}-{0:0}, at: __alloc_pages_slowpath.constprop.0+0x337/0xf20
> [ 108.881511]
> [ 108.881511] stack backtrace:
> [ 108.885862] CPU: 4 UID: 0 PID: 2031 Comm: cc1 Not tainted 6.15.0-rc2-00028-g570c8034f057 #192 PREEMPT(voluntary)
> [ 108.885865] Hardware name: Micro-Star International Co., Ltd. MS-7B98/Z390-A PRO (MS-7B98), BIOS 1.80 12/25/2019
> [ 108.885866] Call Trace:
> [ 108.885867] <TASK>
> [ 108.885868] dump_stack_lvl+0x57/0x80
> [ 108.885871] ? deferred_split_scan+0x235/0x4b0
> [ 108.885874] print_unlock_imbalance_bug.part.0+0xfb/0x110
> [ 108.885877] ? deferred_split_scan+0x235/0x4b0
> [ 108.885878] lock_release+0x258/0x3e0
> [ 108.885880] ? deferred_split_scan+0x85/0x4b0
> [ 108.885881] deferred_split_scan+0x23a/0x4b0
> [ 108.885885] ? find_held_lock+0x32/0x80
> [ 108.885886] ? local_clock_noinstr+0x9/0xd0
> [ 108.885887] ? lock_release+0x17e/0x3e0
> [ 108.885889] do_shrink_slab+0x155/0x480
> [ 108.885891] shrink_slab+0x33c/0x480
> [ 108.885892] ? shrink_slab+0x1c1/0x480
> [ 108.885893] shrink_node+0x324/0x840
> [ 108.885895] do_try_to_free_pages+0xdf/0x550
> [ 108.885897] try_to_free_pages+0xeb/0x260
> [ 108.885899] __alloc_pages_slowpath.constprop.0+0x35c/0xf20
> [ 108.885901] __alloc_frozen_pages_noprof+0x339/0x360
> [ 108.885903] __folio_alloc_noprof+0x10/0x90
> [ 108.885904] __handle_mm_fault+0xca5/0x1930
> [ 108.885906] handle_mm_fault+0xb6/0x310
> [ 108.885908] do_user_addr_fault+0x21e/0x6b0
> [ 108.885910] exc_page_fault+0x62/0x1d0
> [ 108.885911] asm_exc_page_fault+0x22/0x30
> [ 108.885912] RIP: 0033:0xf64890
> [ 108.885914] Code: 4e 64 31 d2 b9 01 00 00 00 31 f6 4c 89 45 98 e8 66 b3 88 ff 4c 8b 45 98 bf 28 00 00 00 b9 08 00 00 00 49 8b 70 18 48 8b 56 58 <48> 89 10 48 8b 13 48 89 46 58 c7 46 60 00 00 00 00 e9 62 01 00 00
> [ 108.885915] RSP: 002b:00007ffcf3c7d920 EFLAGS: 00010206
> [ 108.885916] RAX: 00007f7bf07c5000 RBX: 00007ffcf3c7d9a0 RCX: 0000000000000008
> [ 108.885917] RDX: 00007f7bf06aa000 RSI: 00007f7bf09dd400 RDI: 0000000000000028
> [ 108.885917] RBP: 00007ffcf3c7d990 R08: 00007f7bf080c540 R09: 0000000000000007
> [ 108.885918] R10: 000000000000009a R11: 000000003e969900 R12: 00007f7bf07bbe70
> [ 108.885918] R13: 0000000000000000 R14: 00007f7bf07bbec0 R15: 00007ffcf3c7d930
> [ 108.885920] </TASK>