Message-ID: <aMMNVA9EXXHYvmKH@agluck-desk3>
Date: Thu, 11 Sep 2025 10:56:36 -0700
From: "Luck, Tony" <tony.luck@...el.com>
To: David Hildenbrand <david@...hat.com>
CC: Kyle Meyer <kyle.meyer@....com>, <akpm@...ux-foundation.org>,
	<corbet@....net>, <linmiaohe@...wei.com>, <shuah@...nel.org>,
	<Liam.Howlett@...cle.com>, <bp@...en8.de>, <hannes@...xchg.org>,
	<jack@...e.cz>, <jane.chu@...cle.com>, <jiaqiyan@...gle.com>,
	<joel.granados@...nel.org>, <laoar.shao@...il.com>,
	<lorenzo.stoakes@...cle.com>, <mclapinski@...gle.com>, <mhocko@...e.com>,
	<nao.horiguchi@...il.com>, <osalvador@...e.de>, <rafael.j.wysocki@...el.com>,
	<rppt@...nel.org>, <russ.anderson@....com>, <shawn.fan@...el.com>,
	<surenb@...gle.com>, <vbabka@...e.cz>, <linux-acpi@...r.kernel.org>,
	<linux-doc@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
	<linux-kselftest@...r.kernel.org>, <linux-mm@...ck.org>
Subject: Re: [PATCH] mm/memory-failure: Disable soft offline for HugeTLB
 pages by default

On Thu, Sep 11, 2025 at 10:46:10AM +0200, David Hildenbrand wrote:
> On 10.09.25 18:15, Kyle Meyer wrote:
> > Soft offlining a HugeTLB page reduces the available HugeTLB page pool.
> > Since HugeTLB pages are preallocated, reducing the available HugeTLB
> > page pool can cause allocation failures.
> > 
> > /proc/sys/vm/enable_soft_offline provides a sysctl interface to
> > disable/enable soft offline:
> > 
> > 0 - Soft offline is disabled.
> > 1 - Soft offline is enabled.
> > 
> > The current sysctl interface does not distinguish between HugeTLB pages
> > and other page types.
> > 
> > Disable soft offline for HugeTLB pages by default (1) and extend the
> > sysctl interface to preserve existing behavior (2):
> > 
> > 0 - Soft offline is disabled.
> > 1 - Soft offline is enabled (excluding HugeTLB pages).
> > 2 - Soft offline is enabled (including HugeTLB pages).
> > 
> > Update documentation for the sysctl interface, reference the sysctl
> > interface in the sysfs ABI documentation, and update HugeTLB soft
> > offline selftests.
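
As a reading aid, the three proposed sysctl values quoted above can be sketched as a small decision function. This is only an illustration of the described semantics, not code from the patch: the enum names and the helper soft_offline_allowed() are hypothetical, and treating out-of-range values as "disabled" is an assumption.

```c
#include <stdbool.h>

/* Hypothetical names for the proposed /proc/sys/vm/enable_soft_offline
 * values; they are not taken from the patch itself. */
enum soft_offline_mode {
	SOFT_OFFLINE_DISABLED = 0,             /* soft offline is disabled */
	SOFT_OFFLINE_ENABLED_SKIP_HUGETLB = 1, /* enabled, excluding HugeTLB pages */
	SOFT_OFFLINE_ENABLED_ALL = 2,          /* enabled, including HugeTLB pages */
};

/* Return true if a page may be soft offlined under the given mode. */
static bool soft_offline_allowed(int mode, bool is_hugetlb)
{
	switch (mode) {
	case SOFT_OFFLINE_DISABLED:
		return false;
	case SOFT_OFFLINE_ENABLED_SKIP_HUGETLB:
		return !is_hugetlb;
	case SOFT_OFFLINE_ENABLED_ALL:
		return true;
	default:
		/* Assumption: unknown values are treated as disabled. */
		return false;
	}
}
```

Under this sketch, the proposed default of 1 keeps soft offline working for ordinary pages while leaving the HugeTLB pool untouched, and 2 corresponds to what the current value 1 does today.
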
> 
> I'm sure you spotted that the documentation for
> "/sys/devices/system/memory/soft_offline_pag" resides under "testing".

But that is only one of several places in the kernel that
feed into the page offline code.

This patch was motivated by the GHES path where BIOS indicates
a corrected error threshold was exceeded. There's also the
drivers/ras/cec.c path where Linux does its own threshold
counting.

> 
> If you read about MADV_SOFT_OFFLINE in the man page it clearly says:
> 
> "This feature is intended for testing of memory error-handling code; it is
> available  only if the kernel was configured with CONFIG_MEMORY_FAILURE."

Agreed that this all depends on CONFIG_MEMORY_FAILURE ... so if any
part of the flow is compiled in when that is "=n" then some
changes are needed to fix that.

> 
> So I'm sorry to say: I miss why we should add all this complexity to make a
> feature used for testing soft-offlining work differently for hugetlb folios
> -- with a testing interface.
> 
> -- 
> Cheers
> 
> David / dhildenb

-Tony
