Message-ID: <ZhZTZtzyMpMMowoD@localhost.localdomain>
Date: Wed, 10 Apr 2024 10:52:54 +0200
From: Oscar Salvador <osalvador@...e.de>
To: Miaohe Lin <linmiaohe@...wei.com>
Cc: akpm@...ux-foundation.org, naoya.horiguchi@....com, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm/memory-failure: fix deadlock when
 hugetlb_optimize_vmemmap is enabled

On Wed, Apr 10, 2024 at 03:52:14PM +0800, Miaohe Lin wrote:
> AFAICS, iff check_pages_enabled static key is enabled and in hard offline mode,
> check_new_pages() will prevent those pages from ending up in a PCP queue again
> when refilling PCP list. Because PageHWPoison pages will be taken as 'bad' pages
> and skipped when refill PCP list.

Yes, but the check_pages_enabled static key is only enabled when
either CONFIG_DEBUG_PAGEALLOC or CONFIG_DEBUG_VM is set, which means
that on most systems that protection will not be in place.

Which takes me to a problem we had in the past, where we were happily
handing out hwpoisoned pages from PCP lists.
Now, for soft-offline mode, we worked hard to stop doing that
because soft-offline is a non-disruptive operation and no one should get
killed.
Hard-offline is another story, but I still think that extending the
comment to include the following would be a good idea:

"Disabling pcp before dissolving the page was a deterministic approach
 because we made sure that those pages cannot end up in any PCP list.
 Draining PCP lists expels those pages to the buddy system, but nothing
 guarantees that those pages do not get back to a PCP queue if we need
 to refill those."

 Just to remind ourselves of the dangers of a non-deterministic
 approach.
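
To make the difference concrete, here is a toy model of the two
approaches. It is only a sketch, not kernel code: ToyZone, hard_offline()
and the list-based buddy/PCP structures are made up for illustration, the
refill step stands in for a racing allocation, and the names
check_pages_enabled and take_page_off_buddy are borrowed from the kernel
purely as labels.

```python
class ToyZone:
    """Toy model of one zone: a buddy free list plus a single PCP list."""

    def __init__(self, buddy_pages, pcp_pages, check_pages_enabled=False):
        self.buddy = list(buddy_pages)
        self.pcp = list(pcp_pages)
        self.poisoned = set()
        self.check_pages_enabled = check_pages_enabled
        self.pcp_disabled = False

    def drain_pcp(self):
        # Expel every PCP page to the buddy list (front, so a refill sees them first).
        self.buddy = self.pcp + self.buddy
        self.pcp = []

    def refill_pcp(self, count):
        # With PCP disabled, nothing can enter the PCP list.
        if self.pcp_disabled:
            return
        while self.buddy and len(self.pcp) < count:
            page = self.buddy.pop(0)
            # Only when the check is enabled are poisoned pages treated as
            # 'bad' and skipped (they are simply dropped in this model).
            if self.check_pages_enabled and page in self.poisoned:
                continue
            self.pcp.append(page)

    def take_page_off_buddy(self, page):
        # Succeeds only if the page is still sitting in the buddy list.
        if page in self.buddy:
            self.buddy.remove(page)
            return True
        return False


def hard_offline(zone, page, disable_pcp):
    """Return True if the poisoned page was isolated from the buddy list."""
    zone.poisoned.add(page)
    if disable_pcp:
        zone.pcp_disabled = True
    zone.drain_pcp()
    zone.refill_pcp(2)  # stand-in for a racing allocation refilling the PCP list
    isolated = zone.take_page_off_buddy(page)
    zone.pcp_disabled = False
    return isolated
```

In this model, draining alone lets the poisoned page slip back into the
PCP list unless the check is enabled, while disabling PCP first makes the
outcome independent of both the check and the timing of the refill.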


Thanks


-- 
Oscar Salvador
SUSE Labs