lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <C4ECCA8C-FBF6-41AA-9877-4376EC94E021@nvidia.com>
Date: Sun, 27 Oct 2024 16:06:15 -0400
From: Zi Yan <ziy@...dia.com>
To: Hugh Dickins <hughd@...gle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
 Usama Arif <usamaarif642@...il.com>, Yang Shi <shy828301@...il.com>,
 Wei Yang <richard.weiyang@...il.com>,
 "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
 Matthew Wilcox <willy@...radead.org>, David Hildenbrand <david@...hat.com>,
 Johannes Weiner <hannes@...xchg.org>,
 Baolin Wang <baolin.wang@...ux.alibaba.com>, Barry Song <baohua@...nel.org>,
 Kefeng Wang <wangkefeng.wang@...wei.com>,
 Ryan Roberts <ryan.roberts@....com>, Nhat Pham <nphamcs@...il.com>,
 Chris Li <chrisl@...nel.org>, Shakeel Butt <shakeel.butt@...ux.dev>,
 linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH hotfix v2 1/2] mm/thp: fix deferred split queue not
 partially_mapped

On 27 Oct 2024, at 15:59, Hugh Dickins wrote:

> Recent changes are putting more pressure on THP deferred split queues:
> under load revealing long-standing races, causing list_del corruptions,
> "Bad page state"s and worse (I keep BUGs in both of those, so usually
> don't get to see how badly they end up without).  The relevant recent
> changes being 6.8's mTHP, 6.10's mTHP swapout, and 6.12's mTHP swapin,
> improved swap allocation, and underused THP splitting.
>
> The new unlocked list_del_init() in deferred_split_scan() is buggy.
> I gave bad advice, it looks plausible since that's a local on-stack
> list, but the fact is that it can race with a third party freeing or
> migrating the preceding folio (properly unqueueing it with refcount 0
> while holding split_queue_lock), thereby corrupting the list linkage.
>
> The obvious answer would be to take split_queue_lock there: but it has
> a long history of contention, so I'm reluctant to add to that. Instead,
> make sure that there is always one safe (raised refcount) folio before,
> by delaying its folio_put().  (And of course I was wrong to suggest
> updating split_queue_len without the lock: leave that until the splice.)
>
> And remove two over-eager partially_mapped checks, restoring those tests
> to how they were before: if uncharge_folio() or free_tail_page_prepare()
> finds _deferred_list non-empty, it's in trouble whether or not that folio
> is partially_mapped (and the flag was already cleared in the latter case).
>
> Fixes: dafff3f4c850 ("mm: split underused THPs")
> Signed-off-by: Hugh Dickins <hughd@...gle.com>
> Acked-by: Usama Arif <usamaarif642@...il.com>
> Reviewed-by: David Hildenbrand <david@...hat.com>
> Reviewed-by: Baolin Wang <baolin.wang@...ux.alibaba.com>
> ---
> Based on 6.12-rc4
> v2: added ack and reviewed-bys

Acked-by: Zi Yan <ziy@...dia.com>

--
Best Regards,
Yan, Zi

Download attachment "signature.asc" of type "application/pgp-signature" (855 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ