Message-ID: <ZhZTZtzyMpMMowoD@localhost.localdomain>
Date: Wed, 10 Apr 2024 10:52:54 +0200
From: Oscar Salvador <osalvador@...e.de>
To: Miaohe Lin <linmiaohe@...wei.com>
Cc: akpm@...ux-foundation.org, naoya.horiguchi@....com, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm/memory-failure: fix deadlock when
hugetlb_optimize_vmemmap is enabled
On Wed, Apr 10, 2024 at 03:52:14PM +0800, Miaohe Lin wrote:
> AFAICS, iff check_pages_enabled static key is enabled and in hard offline mode,
> check_new_pages() will prevent those pages from ending up in a PCP queue again
> when refilling PCP list. Because PageHWPoison pages will be taken as 'bad' pages
> and skipped when refill PCP list.
Yes, but the check_pages_enabled static key is only enabled when
either CONFIG_DEBUG_PAGEALLOC or CONFIG_DEBUG_VM is set, which means
that on most systems that protection will not be in place.
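
To make the gating concrete, here is a toy Python model (not kernel code)
of a PCP refill that rejects hwpoisoned pages only when the checks are
enabled. `Page`, `refill_pcp` and `make_buddy` are hypothetical names used
for illustration only, and the real refill path is considerably more
involved.

```python
from dataclasses import dataclass


@dataclass
class Page:
    pfn: int
    hwpoison: bool = False


def refill_pcp(buddy, count, check_pages_enabled):
    """Move up to `count` pages from the buddy free list into a new PCP list.

    Poisoned pages are rejected only when the checks are enabled.
    """
    pcp = []
    while buddy and len(pcp) < count:
        page = buddy.pop(0)
        if check_pages_enabled and page.hwpoison:
            continue  # treated as a 'bad' page and skipped
        pcp.append(page)
    return pcp


def make_buddy():
    # Three free pages, the middle one hwpoisoned.
    return [Page(0), Page(1, hwpoison=True), Page(2)]


debug_pcp = refill_pcp(make_buddy(), 3, check_pages_enabled=True)
plain_pcp = refill_pcp(make_buddy(), 3, check_pages_enabled=False)
print([p.pfn for p in debug_pcp])  # [0, 2]
print([p.pfn for p in plain_pcp])  # [0, 1, 2]
```

With the checks off, which is the assumed default outside debug builds,
the poisoned page lands in the PCP list like any other page.
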
Which brings me to a problem we had in the past, where we were happily
handing out hwpoisoned pages from PCP lists.
Now, for soft-offline mode, we worked hard to stop doing that
because soft-offline is a non-disruptive operation and no one should get
killed.
Hard-offline is another story, but I still think that extending the
comment to include the following would be a good idea:
"Disabling pcp before dissolving the page was a deterministic approach
because we made sure that those pages cannot end up in any PCP list.
Draining PCP lists expels those pages to the buddy system, but nothing
guarantees that those pages do not get back to a PCP queue if we need
to refill those."
Just to remind ourselves of the dangers of a non-deterministic
approach.
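
The difference between the two approaches can be sketched with a toy
Python model (not kernel code). `Zone`, `drain_pcp`, `refill_pcp`,
`take_off_buddy` and `isolate` are hypothetical names. The model assumes
that a disabled PCP never holds pages, and it forces the racing refill to
happen every time, whereas on a real system it is only a possibility.

```python
class Zone:
    """Toy zone: one buddy free list and one PCP list of page frame numbers."""

    def __init__(self, pcp_pages):
        self.buddy = []
        self.pcp = list(pcp_pages)
        self.pcp_enabled = True

    def drain_pcp(self):
        # Expel every PCP page to the buddy free list.
        self.buddy.extend(self.pcp)
        self.pcp.clear()

    def refill_pcp(self, count):
        # Model assumption: a disabled PCP never holds pages.
        if not self.pcp_enabled:
            return
        while self.buddy and len(self.pcp) < count:
            self.pcp.append(self.buddy.pop(0))

    def take_off_buddy(self, pfn):
        # Isolation only succeeds if the page sits in the buddy free list.
        if pfn in self.buddy:
            self.buddy.remove(pfn)
            return True
        return False


def isolate(pfn, disable_pcp_first):
    zone = Zone(pcp_pages=[pfn])
    if disable_pcp_first:
        zone.pcp_enabled = False
    zone.drain_pcp()
    zone.refill_pcp(count=1)  # a racing allocation triggers a refill
    return zone.take_off_buddy(pfn)


print(isolate(42, disable_pcp_first=False))  # False
print(isolate(42, disable_pcp_first=True))   # True
```

With drain only, the refill pulls the page back into the PCP list before
it can be taken off the buddy system. With the PCP disabled first, the
page stays in the buddy system and the isolation succeeds.
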
Thanks
--
Oscar Salvador
SUSE Labs