Message-ID: <37fc3553-0d5a-4bdc-b473-cd740d47598e@linux.alibaba.com>
Date: Wed, 25 Jun 2025 18:02:51 +0800
From: Baolin Wang <baolin.wang@...ux.alibaba.com>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Cc: David Hildenbrand <david@...hat.com>, Hugh Dickins <hughd@...gle.com>,
akpm@...ux-foundation.org, ziy@...dia.com, Liam.Howlett@...cle.com,
npache@...hat.com, ryan.roberts@....com, dev.jain@....com,
baohua@...nel.org, zokeefe@...gle.com, shy828301@...il.com,
usamaarif642@...il.com, linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 0/2] fix MADV_COLLAPSE issue if THP settings are disabled

On 2025/6/25 17:31, Lorenzo Stoakes wrote:
> On Wed, Jun 25, 2025 at 04:52:03PM +0800, Baolin Wang wrote:
>>
>>
>> On 2025/6/25 16:37, Lorenzo Stoakes wrote:
>>> Yeah maybe the best way is to just have another tunable for this?
>>>
>>> /sys/kernel/mm/transparent_hugepage/disable_collapse perhaps?
>>>
>>> What do you think Hugh, Baolin?
>>
>> I think it's not necessary to find a way to disable madvise_collapse.
>> Essentially, it's a conflict between the semantics of madvise_collapse and
>> the '/sys/kernel/mm/transparent_hugepage/enabled' interface. We should reach
>> a consensus on the semantics first:
>>
>> Semantic 1: madv_collapse() should ignore any THP system settings, meaning
>> we need to update the 'never' semantics in
>> '/sys/kernel/mm/transparent_hugepage/enabled', which would only disable page
>> fault and khugepaged, not including madvise_collapse. If we agree on this,
>> then the 'never' for per-sized mTHP would have the same semantics, i.e.,
>> when I set 64K mTHP to 'always' and 2M mTHP to 'never', madvise_collapse
>> would still allow the collapse of 2M THP. We should document this clearly in
>> case users still want 64K mTHP from madvise_collapse.
>
> Right yeah, I mean this is in effect how things are now. So the task is
> documentation.
>
>>
>>
>> Semantic 2: madv_collapse() needs to respect THP system settings, which is
>> what my patch does. Never means never, and we would need to update the
>> documentation of madv_collapse() to make it clearer.
>
> Yes, and indeed this is the choice.
>
> I think, as David said, it comes down to whether we have a legit use case that
> truly relies on this.
>
>>> (One side note on PMD-sized MADV_COLLAPSE - this is basically completely
>>> useless for 64 KB page size arm64 systems where PMD's are 512 MB :)
>>>
>>> Thoughts Baolin?
>>
>> We should not collapse a 512MB THP on a 64K page size kernel. So it seems
>> madv_collapse() cannot work on a 64K page size kernel.
>
> Well I don't think anything would prevent this now right? So MADV_COLLAPSE is
> pretty problematic on 64K pagesize kernels in general.
Yes, I didn't mean that anything prevents madvise_collapse() today. As you
said, it is just problematic (trying to collapse 512MB is horrible).
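
For reference, the 512MB figure follows from the page-table geometry. Here is a
rough arithmetic sketch, not kernel code. It assumes 8-byte page-table entries
and that each table level fills exactly one base page; the helper name
pmd_size_bytes is made up for illustration.

```python
# Illustrative arithmetic only. Assumptions: 8-byte page-table entries and
# one base page per page-table level. pmd_size_bytes is a hypothetical helper.
PTE_SIZE_BYTES = 8


def pmd_size_bytes(base_page_size: int) -> int:
    """Return the amount of memory mapped by a single PMD entry."""
    # Number of PTEs that fit in one page-sized table.
    entries_per_table = base_page_size // PTE_SIZE_BYTES
    # One PMD entry covers a full table of base pages.
    return entries_per_table * base_page_size


if __name__ == "__main__":
    for page_size in (4 << 10, 64 << 10):
        size_mb = pmd_size_bytes(page_size) >> 20
        print(f"{page_size >> 10}K base pages: PMD maps {size_mb} MB")
```

Under these assumptions a 4K kernel gets the familiar 2MB PMD, while a 64K
kernel gets 8192 entries of 64K each, i.e. 512MB.
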
> Anyway that's maybe a problem for another time :)
Yeah, we should consider an mTHP-compatible MADV_COLLAPSE.
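
To make the two candidate semantics above concrete, here is a small user-space
model. It is only a sketch, not the patch itself: the function names are
hypothetical, it looks only at the top-level 'enabled' value (no per-size mTHP
settings), and it assumes that under Semantic 2 both 'always' and 'madvise'
would still permit MADV_COLLAPSE.

```python
# Toy model of the two semantics discussed in this thread; not kernel code.
# parse_thp_setting and collapse_allowed are hypothetical names.

def parse_thp_setting(text: str) -> str:
    """Extract the active value from sysfs text such as 'always madvise [never]'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no bracketed value in {text!r}")


def collapse_allowed(enabled_text: str, semantic: int) -> bool:
    """Return whether MADV_COLLAPSE would be attempted under the given semantic."""
    setting = parse_thp_setting(enabled_text)
    if semantic == 1:
        # Semantic 1: MADV_COLLAPSE ignores the THP system settings.
        return True
    if semantic == 2:
        # Semantic 2: "never means never" (assumption: other values allow it).
        return setting != "never"
    raise ValueError("semantic must be 1 or 2")


if __name__ == "__main__":
    sample = "always madvise [never]"
    print(collapse_allowed(sample, 1), collapse_allowed(sample, 2))
```

The only case where the two semantics differ in this model is when the active
value is 'never', which is exactly the case that needs a decision.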