Message-ID: <247259aa-9c78-d1ae-c829-aa72adc75922@huawei.com>
Date: Thu, 11 Apr 2024 10:26:44 +0800
From: Miaohe Lin <linmiaohe@...wei.com>
To: Oscar Salvador <osalvador@...e.de>
CC: <akpm@...ux-foundation.org>, <naoya.horiguchi@....com>,
<linux-mm@...ck.org>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] mm/memory-failure: fix deadlock when
hugetlb_optimize_vmemmap is enabled
On 2024/4/10 16:52, Oscar Salvador wrote:
> On Wed, Apr 10, 2024 at 03:52:14PM +0800, Miaohe Lin wrote:
>> AFAICS, iff the check_pages_enabled static key is enabled and we are in hard
>> offline mode, check_new_pages() will prevent those pages from ending up in a
>> PCP queue again when refilling the PCP list, because PageHWPoison pages will be
>> treated as 'bad' pages and skipped during the refill.
>
> Yes, but the check_pages_enabled static key is only enabled when
> either CONFIG_DEBUG_PAGEALLOC or CONFIG_DEBUG_VM is set, which means
> that on most systems that protection will not be in effect.
>
> Which brings me to a problem we had in the past, where we were happily
> handing out hwpoisoned pages from PCP lists.
> Now, for soft-offline mode, we worked hard to stop doing that
> because soft-offline is a non-disruptive operation and no one should get
> killed.
> Hard-offline is another story, but I still think that extending the
> comment to include the following would be a good idea:
>
> "Disabling pcp before dissolving the page was a deterministic approach
> because we made sure that those pages cannot end up in any PCP list.
> Draining PCP lists expels those pages to the buddy system, but nothing
> guarantees that those pages do not get back to a PCP queue if we need
> to refill those."
This really helps. Will add it in v2.
Thanks Oscar.
>
> Just to remind ourselves of the dangers of a non-deterministic
> approach.
>
>
> Thanks
>
>
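
To make the concern quoted above concrete, here is a minimal userspace sketch of
the drain-then-refill race. It is a toy model, not kernel code: every toy_* name,
the array-backed lists and the check_enabled flag are assumptions made up for
illustration. check_enabled only stands in for the effect of the
check_pages_enabled static key and check_new_pages() as described in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_PAGES 8

/* Toy stand-in for struct page; only the flag this sketch needs. */
struct toy_page {
	bool hwpoison;
};

/* Toy free list: a fixed-size stack of page indices. */
struct toy_list {
	int idx[NR_PAGES];
	size_t len;
};

static void toy_push(struct toy_list *l, int page)
{
	assert(l->len < NR_PAGES);
	l->idx[l->len++] = page;
}

/* Drain: move every page on the toy PCP list back to the toy buddy list. */
static void toy_drain_pcp(struct toy_list *pcp, struct toy_list *buddy)
{
	while (pcp->len > 0)
		toy_push(buddy, pcp->idx[--pcp->len]);
}

/*
 * Refill: move up to 'want' pages from the buddy list to the PCP list.
 * With check_enabled, poisoned pages are treated as bad and skipped;
 * without it, nothing stops them from landing on the PCP list again.
 * Returns how many poisoned pages ended up on the PCP list.
 */
static size_t toy_refill_pcp(const struct toy_page *pages,
			     struct toy_list *buddy, struct toy_list *pcp,
			     size_t want, bool check_enabled)
{
	size_t poisoned_on_pcp = 0;

	while (want > 0 && buddy->len > 0) {
		int page = buddy->idx[--buddy->len];

		if (pages[page].hwpoison) {
			if (check_enabled)
				continue;	/* bad page: skip it */
			poisoned_on_pcp++;
		}
		toy_push(pcp, page);
		want--;
	}
	return poisoned_on_pcp;
}

/* Scenario: one poisoned page sits on the PCP list, we drain, then refill. */
size_t toy_drain_then_refill(bool check_enabled)
{
	struct toy_page pages[NR_PAGES] = { 0 };
	struct toy_list pcp = { .len = 0 };
	struct toy_list buddy = { .len = 0 };

	pages[3].hwpoison = true;
	for (int i = 0; i < NR_PAGES; i++)
		toy_push(&pcp, i);

	toy_drain_pcp(&pcp, &buddy);
	return toy_refill_pcp(pages, &buddy, &pcp, NR_PAGES, check_enabled);
}
```

In this model, toy_drain_then_refill(false) reports that the poisoned page made
it back onto the PCP list, while toy_drain_then_refill(true) reports that it was
skipped. The sketch only mirrors the argument in the thread: draining alone
gives no guarantee unless a refill-time check happens to be enabled.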