[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a683e199-ce8a-4534-a21e-65f2528415a6@redhat.com>
Date: Tue, 13 Feb 2024 10:25:22 +0100
From: David Hildenbrand <david@...hat.com>
To: Charan Teja Kalla <quic_charante@...cinc.com>,
gregkh@...uxfoundation.org, akpm@...ux-foundation.org, willy@...radead.org,
vbabka@...e.cz, dhowells@...hat.com, surenb@...gle.com
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
# see patch description <stable@...r.kernel.org>
Subject: Re: [PATCH] mm/huge_memory: fix swap entry values of tail pages of
THP
On 13.02.24 09:48, Charan Teja Kalla wrote:
> An anon THP page is first added to swap cache before reclaiming it.
> Initially, each tail page contains the proper swap entry value(stored in
> ->private field) which is filled from add_to_swap_cache(). After
This is a stable-only fix? In that case, it make sense to indicate that
in the patch subject [PATCH STABLE vX.Y].
But it's always odd to have stable-only fixes, the docs [1] don't cover
that (maybe they should? or I missed it).
[1] https://www.kernel.org/doc/Documentation/process/stable-kernel-rules.rst
So we are migrating a THP that was added to the swapcache. Do you have a
reproducer?
> migrating the THP page sitting on the swap cache, only the swap entry of
> the head page is filled(see folio_migrate_mapping()).
>
> Now when this page is tried to split(one case is when this page is again
> migrated, see migrate_pages()->try_split_thp()), the tail pages
> ->private is not stored with proper swap entry values. When this tail
> page is now try to be freed, as part of it delete_from_swap_cache() is
> called which operates on the wrong swap cache index and eventually
> replaces the wrong swap cache index with shadow/NULL value, frees the
> page.
But what if we don't split the THP after migration. Is there anything
else that can go wrong?
It's sufficient to look where upstream calls page_swap_entry():
For example, do_swap_page() will never be able to make progress, because
the swap entry of the page does not correspond to the swap entry in the
PTE? It can easily fault on a swap PTE that refers a THP subpage in the
swapcache.
So unless I am missing something, only fixing this up during the split
is insufficient. You have to fix it up during migration.
>
> This leads to the state with a swap cache containing the freed page.
> This issue can manifest in many forms and the most common thing observed
> is the rcu stall during the swapin (see mapping_get_entry()).
>
> On the recent kernels, this issues is indirectly getting fixed with the
> series[1], to be specific[2].
>
> When tried to back port this series, it is observed many merge
> conflicts and also seems dependent on many other changes. As backporting
> to LTS branches is not a trivial one, the similar change from [2] is
> picked as a fix.
The fix is in
commit cfeed8ffe55b37fa10286aaaa1369da00cb88440
Author: David Hildenbrand <david@...hat.com>
Date: Mon Aug 21 18:08:46 2023 +0200
mm/swap: stop using page->private on tail pages for THP_SWAP
>
> [1] https://lore.kernel.org/all/20230821160849.531668-1-david@redhat.com/
> [2] https://lore.kernel.org/all/20230821160849.531668-5-david@redhat.com/
>
> Closes: https://lore.kernel.org/linux-mm/69cb784f-578d-ded1-cd9f-c6db04696336@quicinc.com/
> Fixes: 3417013e0d18 ("mm/migrate: Add folio_migrate_mapping()")
> Cc: <stable@...r.kernel.org> # see patch description, applicable to <=6.1
> Signed-off-by: Charan Teja Kalla <quic_charante@...cinc.com>
3417013e0d18 went into 5.16.
cfeed8ffe55b3 went into 6.6.
So only 6.1 is affected.
Isn't there a way to bite the bullet and backport that series to 6.1
instead?
--
Cheers,
David / dhildenb
Powered by blists - more mailing lists