Message-ID: <49731a4e-d00f-4f84-aaac-87405d6eadbf@arm.com>
Date: Sun, 21 Sep 2025 16:55:14 +0530
From: Anshuman Khandual <anshuman.khandual@....com>
To: David Hildenbrand <david@...hat.com>, Kyle Meyer <kyle.meyer@....com>
Cc: akpm@...ux-foundation.org, corbet@....net, linmiaohe@...wei.com,
shuah@...nel.org, tony.luck@...el.com, jane.chu@...cle.com,
jiaqiyan@...gle.com, Liam.Howlett@...cle.com, bp@...en8.de,
hannes@...xchg.org, jack@...e.cz, joel.granados@...nel.org,
laoar.shao@...il.com, lorenzo.stoakes@...cle.com, mclapinski@...gle.com,
mhocko@...e.com, nao.horiguchi@...il.com, osalvador@...e.de,
rafael.j.wysocki@...el.com, rppt@...nel.org, russ.anderson@....com,
shawn.fan@...el.com, surenb@...gle.com, vbabka@...e.cz,
linux-acpi@...r.kernel.org, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-kselftest@...r.kernel.org,
linux-mm@...ck.org
Subject: Re: [PATCH v2] mm/memory-failure: Support disabling soft offline for
HugeTLB pages
On 18/09/25 12:35 AM, David Hildenbrand wrote:
> On 17.09.25 20:51, Kyle Meyer wrote:
>> On Wed, Sep 17, 2025 at 09:02:55AM +0200, David Hildenbrand wrote:
>>>
>>>>> +
>>>>> + 0 - Enable soft offline
>>>>> + 1 - Disable soft offline for HugeTLB pages
>>>>> +
>>>>> +Supported values::
>>>>> +
>>>>> + 0 - Soft offline is disabled
>>>>> + 1 - Soft offline is enabled
>>>>> + 3 - Soft offline is enabled (disabled for HugeTLB pages)
>>>>
>>>> This looks very adhoc even though existing behavior is preserved.
>>>>
>>>> - Are HugeTLB pages the only page type to be considered?
>>>> - How are the remaining bits going to be used later?
>>>>
>>>
>>> What I proposed (that could be better documented here) is that all other
>>> bits except the first one will be a disable mask when bit 0 is set.
>>>
>>> 2 - ... but yet disabled for hugetlb
>>> 4 - ... but yet disabled for $WHATEVER
>>> 8 - ... but yet disabled for $WHATEVERELSE
>>>
>>>> Also without a bit-wise usage roadmap, is not changing a procfs
>>>> interface (ABI) bit problematic ?
>>>
>>> For now, setting it to any value other than 0 or 1 fails, if I
>>> understand set_enable_soft_offline() correctly?
>>
>> Yes, -EINVAL will be returned.
>>
>>> So there should not be any problem, or which scenario do you have in mind?
>>
>> Here's an alternative approach.
>>
>> Do not modify the existing sysctl parameter:
>>
>> /proc/sys/vm/enable_soft_offline
>>
>> 0 - Soft offline is disabled
>> 1 - Soft offline is enabled
>>
>> Instead, introduce a new sysctl parameter:
>>
>> /proc/sys/vm/enable_soft_offline_hugetlb
>>
>> 0 - Soft offline is disabled for HugeTLB pages
>> 1 - Soft offline is enabled for HugeTLB pages
>>
>> and note in documentation that this setting only takes effect if
>> enable_soft_offline is enabled.
>>
>> Anshuman (and David), would you prefer this?
>
> Hmm, at least I don't particularly like that. For each new exception we would create a new file, and the file has weird semantics such that it has no meaning when enable_soft_offline=0.

I agree with David here. Adding a new procfs file just to disable soft
offline for one particular page type does not really make sense. It
would extend the ABI unnecessarily without adding much benefit.
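
For reference, the bit-mask semantics David described earlier in the thread
(bit 0 enables soft offline, higher bits act as a disable mask while bit 0 is
set) could be sketched roughly as follows. This is only an illustration, not
code from the patch: the macro and function names are hypothetical, and
rejecting a value such as 2 (a disable bit without bit 0) is an assumption
based on the supported values 0, 1 and 3 listed in the documentation hunk.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical names for illustration; none of these come from the patch. */
#define SOFT_OFFLINE_ENABLED         (1u << 0) /* bit 0: global enable */
#define SOFT_OFFLINE_SKIP_HUGETLB    (1u << 1) /* bit 1: disabled for HugeTLB */
#define SOFT_OFFLINE_KNOWN_SKIP_MASK (SOFT_OFFLINE_SKIP_HUGETLB)

/* Accept 0, or bit 0 plus any known disable bits (0, 1 and 3 today). */
static int soft_offline_validate(unsigned int val)
{
	if (val == 0)
		return 0;
	/* Assumption: disable bits are rejected unless bit 0 is also set. */
	if (!(val & SOFT_OFFLINE_ENABLED))
		return -EINVAL;
	/* Bits without an assigned meaning stay reserved. */
	if (val & ~(SOFT_OFFLINE_ENABLED | SOFT_OFFLINE_KNOWN_SKIP_MASK))
		return -EINVAL;
	return 0;
}

/* Decide whether a page may be soft offlined under the given setting. */
static bool soft_offline_allowed(unsigned int val, bool is_hugetlb)
{
	if (!(val & SOFT_OFFLINE_ENABLED))
		return false;
	if (is_hugetlb && (val & SOFT_OFFLINE_SKIP_HUGETLB))
		return false;
	return true;
}
```

With this layout, 0 and 1 keep their current meaning, so the existing
interface behaviour is preserved, and a later exception would only claim the
next free bit (4, 8, ...) instead of a new file.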