linux-kernel - Re: [PATCH] mm/huge_memory: fix swap entry values of tail pages of THP

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <2024021309-predict-scrubber-d6f3@gregkh>
Date: Tue, 13 Feb 2024 10:41:35 +0100
From: Greg KH <gregkh@...uxfoundation.org>
To: Charan Teja Kalla <quic_charante@...cinc.com>
Cc: akpm@...ux-foundation.org, willy@...radead.org, vbabka@...e.cz,
	dhowells@...hat.com, david@...hat.com, surenb@...gle.com,
	linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	# see patch description <stable@...r.kernel.org>
Subject: Re: [PATCH] mm/huge_memory: fix swap entry values of tail pages of
 THP

On Tue, Feb 13, 2024 at 02:18:10PM +0530, Charan Teja Kalla wrote:
> An anon THP page is first added to swap cache before reclaiming it.
> Initially, each tail page contains the proper swap entry value(stored in
> ->private field) which is filled from add_to_swap_cache(). After
> migrating the THP page sitting on the swap cache, only the swap entry of
> the head page is filled(see folio_migrate_mapping()).
> 
> Now when this page is tried to split(one case is when this page is again
> migrated, see migrate_pages()->try_split_thp()), the tail pages
> ->private is not stored with proper swap entry values.  When this tail
> page is now try to be freed, as part of it delete_from_swap_cache() is
> called which operates on the wrong swap cache index and eventually
> replaces the wrong swap cache index with shadow/NULL value, frees the
> page.
> 
> This leads to the state with a swap cache containing the freed page.
> This issue can manifest in many forms and the most common thing observed
> is the rcu stall during the swapin (see mapping_get_entry()).
> 
> On the recent kernels, this issues is indirectly getting fixed with the
> series[1], to be specific[2].

Then why can we not take that series?  Taking one-off patches almost
ALWAYS causes future problems, what are you going to do to prevent that
here (merge and logic problems).

> When tried to back port this series, it is observed many merge
> conflicts and also seems dependent on many other changes. As backporting
> to LTS branches is not a trivial one, the similar change from [2] is
> picked as a fix.
> 
> [1] https://lore.kernel.org/all/20230821160849.531668-1-david@redhat.com/
> [2] https://lore.kernel.org/all/20230821160849.531668-5-david@redhat.com/

Again, please try to take the original series, ESPECIALLY for stuff in
-mm which is tricky and likely to blow up in odd ways in the future.

So I will not take this unless the -mm maintainers agree it really is
the only way forward.

thanks,

greg k-h