Message-ID: <ZhMCvynFUDr-8DpX@localhost.localdomain>
Date: Sun, 7 Apr 2024 22:31:59 +0200
From: Oscar Salvador <osalvador@...e.de>
To: Peter Xu <peterx@...hat.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, Miaohe Lin <linmiaohe@...wei.com>,
David Hildenbrand <david@...hat.com>, stable@...r.kernel.org,
Tony Luck <tony.luck@...el.com>,
Naoya Horiguchi <naoya.horiguchi@....com>
Subject: Re: [PATCH] mm,swapops: Update check in is_pfn_swap_entry for
hwpoison entries
> Totally unexpected, as this commit even removed hwpoison_entry_to_pfn().
> Obviously even until now I assumed hwpoison is accounted as pfn swap entry
> but it's just missing..
>
> Since this commit didn't really change is_pfn_swap_entry() itself, I was
> thinking maybe an older fix tag would apply, but then I noticed the old
> code indeed should work well even if hwpoison entry is missing. For
> example, it's a grey area on whether a hwpoisoned page should be accounted
> in smaps. So I think the Fixes tag is correct, and thanks for fixing this.
>
> Reviewed-by: Peter Xu <peterx@...hat.com>
Thanks, Peter.
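For anyone following along, here is a rough userspace sketch of the idea behind the patch, as I understand it: a hwpoison entry stores a pfn in its offset, so the "is this a pfn swap entry" predicate should accept it, otherwise a debug assertion in the pfn accessor can fire. Every `model_*` name below is hypothetical and only illustrates the logic; this is not the kernel code, and the set of entry types is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; these names are not the kernel's. */
enum model_swp_type {
	MODEL_SWP_REGULAR,          /* ordinary swap slot: offset is not a pfn */
	MODEL_SWP_MIGRATION,
	MODEL_SWP_DEVICE_PRIVATE,
	MODEL_SWP_DEVICE_EXCLUSIVE,
	MODEL_SWP_HWPOISON,
};

struct model_swp_entry {
	enum model_swp_type type;
	unsigned long offset;       /* holds a pfn for pfn-carrying types */
};

/* True for every entry type whose offset encodes a pfn, hwpoison included. */
static bool model_is_pfn_swap_entry(struct model_swp_entry e)
{
	switch (e.type) {
	case MODEL_SWP_MIGRATION:
	case MODEL_SWP_DEVICE_PRIVATE:
	case MODEL_SWP_DEVICE_EXCLUSIVE:
	case MODEL_SWP_HWPOISON:
		return true;
	default:
		return false;
	}
}

/* Stand-in for a pfn accessor guarded by a debug-only assertion. */
static unsigned long model_swp_offset_pfn(struct model_swp_entry e)
{
	assert(model_is_pfn_swap_entry(e));
	return e.offset;
}
```

With the hwpoison case left out of the switch, calling `model_swp_offset_pfn()` on a hwpoison entry would trip the assertion, which is the kind of failure that only shows up on debug builds.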
> Fedora stopped having DEBUG_VM for some time, but not sure about when it's
> still in the 6.1 trees. It looks like cc stable is still reasonable from
> that regard.
Good to know, thanks for the info.
> A side note is that when I'm looking at this, I went back and see why in
> some cases we need the pfn maintained for the poisoned, then I saw the only
> user is check_hwpoisoned_entry() who wants to do fast kills in some
> contexts and that includes a double check on the pfns in a poisoned entry.
> Then afaict this path is just too rarely used and buggy.
Yes, unfortunately the memory-failure code does not get exercised that much,
so there might be subtle bugs that have been lurking in there for quite some time.
> A few things we may need fixing, maybe someone in the loop would have time
> to have a look:
>
> - check_hwpoisoned_entry()
>   - pte_none check is missing
>   - all the rest swap types are missing (e.g., we want to kill the proc too
>     if the page is during migration)
> - check_hwpoisoned_pmd_entry()
>   - need similar care like above (pmd_none is covered not others)
I will have a look and see what needs fixing, thanks for bringing it up.
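As a starting point for that, a rough userspace sketch of the two checks listed for check_hwpoisoned_entry(): bail out early on an empty entry, and compare the pfn for every pfn-carrying non-present entry rather than for hwpoison entries only. All `toy_*` names are hypothetical, the entry layout is heavily simplified, and this is only a model of the logic, not a proposed patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for a page table entry. */
enum toy_pte_kind { TOY_PTE_NONE, TOY_PTE_PRESENT, TOY_PTE_SWAP };
enum toy_swp_type { TOY_SWP_REGULAR, TOY_SWP_MIGRATION, TOY_SWP_HWPOISON };

struct toy_pte {
	enum toy_pte_kind kind;
	enum toy_swp_type swp_type;   /* only meaningful for TOY_PTE_SWAP */
	unsigned long pfn_or_offset;  /* pfn, or swap offset for regular swap */
};

/* Returns true if the entry refers to the poisoned pfn. */
static bool toy_entry_maps_poisoned_pfn(struct toy_pte pte,
					unsigned long poisoned_pfn)
{
	/* Empty entry: nothing is mapped, so nothing to kill. */
	if (pte.kind == TOY_PTE_NONE)
		return false;

	/* A regular swap offset is not a pfn and must not be compared. */
	if (pte.kind == TOY_PTE_SWAP && pte.swp_type == TOY_SWP_REGULAR)
		return false;

	/* Present, migration and hwpoison entries all carry a pfn. */
	return pte.pfn_or_offset == poisoned_pfn;
}

/* Small usage example for the cases raised in the list above. */
static void toy_selfcheck(void)
{
	struct toy_pte mig = { TOY_PTE_SWAP, TOY_SWP_MIGRATION, 0x42 };
	struct toy_pte none = { TOY_PTE_NONE, TOY_SWP_REGULAR, 0x42 };

	assert(toy_entry_maps_poisoned_pfn(mig, 0x42));
	assert(!toy_entry_maps_poisoned_pfn(none, 0x42));
}
```

The pmd variant would presumably need the same treatment, but I have not modelled it here.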
--
Oscar Salvador
SUSE Labs