Message-ID: <20250418195013.GA877644@cmpxchg.org>
Date: Fri, 18 Apr 2025 15:50:13 -0400
From: Johannes Weiner <hannes@...xchg.org>
To: Muchun Song <songmuchun@...edance.com>
Cc: mhocko@...nel.org, roman.gushchin@...ux.dev, shakeel.butt@...ux.dev,
	muchun.song@...ux.dev, akpm@...ux-foundation.org,
	david@...morbit.com, zhengqi.arch@...edance.com,
	yosry.ahmed@...ux.dev, nphamcs@...il.com, chengming.zhou@...ux.dev,
	linux-kernel@...r.kernel.org, cgroups@...r.kernel.org,
	linux-mm@...ck.org, hamzamahfooz@...ux.microsoft.com,
	apais@...ux.microsoft.com
Subject: Re: [PATCH RFC 06/28] mm: thp: introduce folio_split_queue_lock and
 its variants

On Tue, Apr 15, 2025 at 10:45:10AM +0800, Muchun Song wrote:
> @@ -4202,7 +4248,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
>  		if (!--sc->nr_to_scan)
>  			break;
>  	}
> -	spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
> +	split_queue_unlock_irqrestore(ds_queue, flags);
>  
>  	list_for_each_entry_safe(folio, next, &list, _deferred_list) {
>  		bool did_split = false;
> @@ -4251,7 +4297,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
>  	spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
>  	list_splice_tail(&list, &ds_queue->split_queue);
>  	ds_queue->split_queue_len -= removed;
> -	spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
> +	split_queue_unlock_irqrestore(ds_queue, flags);

These just tripped up my testing. You use the new helpers for the
unlock path, but not for the lock path. That's fine in this patch, but
when "mm: thp: prepare for reparenting LRU pages for split queue lock"
adds the RCU locking to the helpers, the unlock side drops an RCU read
lock that the open-coded lock side never took:

[  108.814880]
[  108.816378] =====================================
[  108.821069] WARNING: bad unlock balance detected!
[  108.825762] 6.15.0-rc2-00028-g570c8034f057 #192 Not tainted
[  108.831323] -------------------------------------
[  108.836016] cc1/2031 is trying to release lock (rcu_read_lock) at:
[  108.842181] [<ffffffff815f9d05>] deferred_split_scan+0x235/0x4b0
[  108.848179] but there are no more locks to release!
[  108.853046]
[  108.853046] other info that might help us debug this:
[  108.859553] 2 locks held by cc1/2031:
[  108.863211]  #0: ffff88801ddbbd88 (vm_lock){....}-{0:0}, at: do_user_addr_fault+0x19c/0x6b0
[  108.871544]  #1: ffffffff83042400 (fs_reclaim){....}-{0:0}, at: __alloc_pages_slowpath.constprop.0+0x337/0xf20
[  108.881511]
[  108.881511] stack backtrace:
[  108.885862] CPU: 4 UID: 0 PID: 2031 Comm: cc1 Not tainted 6.15.0-rc2-00028-g570c8034f057 #192 PREEMPT(voluntary)
[  108.885865] Hardware name: Micro-Star International Co., Ltd. MS-7B98/Z390-A PRO (MS-7B98), BIOS 1.80 12/25/2019
[  108.885866] Call Trace:
[  108.885867]  <TASK>
[  108.885868]  dump_stack_lvl+0x57/0x80
[  108.885871]  ? deferred_split_scan+0x235/0x4b0
[  108.885874]  print_unlock_imbalance_bug.part.0+0xfb/0x110
[  108.885877]  ? deferred_split_scan+0x235/0x4b0
[  108.885878]  lock_release+0x258/0x3e0
[  108.885880]  ? deferred_split_scan+0x85/0x4b0
[  108.885881]  deferred_split_scan+0x23a/0x4b0
[  108.885885]  ? find_held_lock+0x32/0x80
[  108.885886]  ? local_clock_noinstr+0x9/0xd0
[  108.885887]  ? lock_release+0x17e/0x3e0
[  108.885889]  do_shrink_slab+0x155/0x480
[  108.885891]  shrink_slab+0x33c/0x480
[  108.885892]  ? shrink_slab+0x1c1/0x480
[  108.885893]  shrink_node+0x324/0x840
[  108.885895]  do_try_to_free_pages+0xdf/0x550
[  108.885897]  try_to_free_pages+0xeb/0x260
[  108.885899]  __alloc_pages_slowpath.constprop.0+0x35c/0xf20
[  108.885901]  __alloc_frozen_pages_noprof+0x339/0x360
[  108.885903]  __folio_alloc_noprof+0x10/0x90
[  108.885904]  __handle_mm_fault+0xca5/0x1930
[  108.885906]  handle_mm_fault+0xb6/0x310
[  108.885908]  do_user_addr_fault+0x21e/0x6b0
[  108.885910]  exc_page_fault+0x62/0x1d0
[  108.885911]  asm_exc_page_fault+0x22/0x30
[  108.885912] RIP: 0033:0xf64890
[  108.885914] Code: 4e 64 31 d2 b9 01 00 00 00 31 f6 4c 89 45 98 e8 66 b3 88 ff 4c 8b 45 98 bf 28 00 00 00 b9 08 00 00 00 49 8b 70 18 48 8b 56 58 <48> 89 10 48 8b 13 48 89 46 58 c7 46 60 00 00 00 00 e9 62 01 00 00
[  108.885915] RSP: 002b:00007ffcf3c7d920 EFLAGS: 00010206
[  108.885916] RAX: 00007f7bf07c5000 RBX: 00007ffcf3c7d9a0 RCX: 0000000000000008
[  108.885917] RDX: 00007f7bf06aa000 RSI: 00007f7bf09dd400 RDI: 0000000000000028
[  108.885917] RBP: 00007ffcf3c7d990 R08: 00007f7bf080c540 R09: 0000000000000007
[  108.885918] R10: 000000000000009a R11: 000000003e969900 R12: 00007f7bf07bbe70
[  108.885918] R13: 0000000000000000 R14: 00007f7bf07bbec0 R15: 00007ffcf3c7d930
[  108.885920]  </TASK>
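
To make the imbalance concrete, here is a minimal userspace sketch of
the pairing problem. It is not kernel code: the model_* names, the
struct and the counters are all hypothetical, and it only assumes that
the later patch makes the lock helper enter an RCU read-side section
and the unlock helper leave it. An open-coded spin_lock_irqsave()
paired with the helper unlock then leaves RCU one level too many.

```c
#include <stdbool.h>

/* Userspace model only; every name here is hypothetical. */
struct model_queue {
	bool locked;      /* stands in for ds_queue->split_queue_lock */
	int rcu_depth;    /* stands in for the RCU read-side nesting count */
	int bad_unlocks;  /* counts what lockdep reports as a bad unlock balance */
};

static void model_rcu_read_lock(struct model_queue *q)
{
	q->rcu_depth++;
}

static void model_rcu_read_unlock(struct model_queue *q)
{
	if (q->rcu_depth == 0) {
		/* Nothing to release: this is the case from the splat. */
		q->bad_unlocks++;
		return;
	}
	q->rcu_depth--;
}

/* Open-coded lock path: takes the lock, never enters an RCU section. */
static void model_open_coded_lock(struct model_queue *q)
{
	q->locked = true;
}

/* Assumed lock helper after the later patch: RCU first, then the lock. */
static void model_helper_lock(struct model_queue *q)
{
	model_rcu_read_lock(q);
	q->locked = true;
}

/* Assumed unlock helper after the later patch: drop lock, then RCU. */
static void model_helper_unlock(struct model_queue *q)
{
	q->locked = false;
	model_rcu_read_unlock(q);
}

/* Mixed pairing, as in the quoted hunks. Returns the bad unlock count. */
int scan_mixed(void)
{
	struct model_queue q = { 0 };

	model_open_coded_lock(&q);
	model_helper_unlock(&q);
	return q.bad_unlocks;
}

/* Helper on both sides. Returns the bad unlock count. */
int scan_paired(void)
{
	struct model_queue q = { 0 };

	model_helper_lock(&q);
	model_helper_unlock(&q);
	return q.bad_unlocks;
}
```

In this model scan_mixed() records one bad unlock and scan_paired()
records none, so converting the lock sites in deferred_split_scan() to
the matching helper in this patch would keep the pairs balanced once
the RCU locking is added.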
