[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f781e51a-b193-4bd1-abf0-71f816aa0d5e@os.amperecomputing.com>
Date: Mon, 19 May 2025 12:31:04 -0700
From: Yang Shi <yang@...amperecomputing.com>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
Andrew Morton <akpm@...ux-foundation.org>
Cc: James Houghton <jthoughton@...gle.com>,
Christian Borntraeger <borntraeger@...ux.ibm.com>,
Ignacio Moreno Gonzalez <Ignacio.MorenoGonzalez@...a.com>,
David Hildenbrand <david@...hat.com>,
"Liam R . Howlett" <Liam.Howlett@...cle.com>,
Matthew Wilcox <willy@...radead.org>, Janosch Frank <frankja@...ux.ibm.com>,
Heiko Carstens <hca@...ux.ibm.com>, Vasily Gorbik <gor@...ux.ibm.com>,
Alexander Gordeev <agordeev@...ux.ibm.com>,
Sven Schnelle <svens@...ux.ibm.com>, pbonzini@...hat.com,
kvm@...r.kernel.org, linux-s390@...r.kernel.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/2] mm: madvise: make MADV_NOHUGEPAGE a no-op if !THP
On 5/19/25 7:43 AM, Lorenzo Stoakes wrote:
> Andrew -
>
> OK, I realise there's an issue here with patch 2/2. We're not accounting
> for the fact that madvise() will reject this _anyway_ because
> madvise_behavior_valid() will reject it.
Good catch. The purpose of this patch is to make MADV_NOHUGEPAGE a
no-op, so we can just simply bail out early? The point of madvise
behavior check is to avoid taking mmap_lock and walking vmas for invalid
behavior, but it doesn't consider no-op (I treat op-op as valid but do
nothing), so if we know this advise is a no-op, we just bail out by
returning 0.
Maybe MADV_UNMERGEABLE should be no-op too, it returns 0 for !KSM anyway.
Thanks,
Yang
>
> I've tried to be especially helpful here to aid Ignacio in his early
> contributions, but I think it's best now (if you don't mind Igancio) for me
> to figure out a better solution after the merge window.
>
> We're late in the cycle now so I will just resend the 1st patch (for s390)
> separately if you're happy to take that for 6.16? It's a simple rename of
> an entirely static identifier so should present no risk, and is approved by
> the arch maintainers who have also agreed for it to come through the mm
> tree.
>
> Apologies for the mess!
>
> Cheers, Lorenzo
>
> On Thu, May 15, 2025 at 09:15:44PM +0100, Lorenzo Stoakes wrote:
>> Andrew -
>>
>> I hope the explanation below resolves your query about the header include
>> (in [0]), let me know if doing this as a series like this works (we need to
>> enforce the ordering here).
>>
>> Thanks!
>>
>> [0]: 20250514153648.598bb031a2e498b1ac505b60@...ux-foundation.org
>>
>>
>>
>> Currently, when somebody attempts to set MADV_NOHUGEPAGE on a system that
>> does not enable CONFIG_TRANSPARENT_HUGEPAGE the confguration option, this
>> results in an -EINVAL error arising.
>>
>> This doesn't really make sense, as to do so is essentially a no-op.
>>
>> Additionally, the semantics of setting VM_[NO]HUGEPAGE in any case are such
>> that, should the attribute not apply, nothing will be done.
>>
>> It therefore makes sense to simply make this operation a noop.
>>
>> However, a fly in the ointment is that, in order to do so, we must check
>> against the MADV_NOHUGEPAGE constant. In doing so, we encounter two rather
>> annoying issues.
>>
>> The first is that the usual include we would import to get hold of
>> MADV_NOHUGEPAGE, linux/mman.h, results in a circular dependency:
>>
>> * If something includes linux/mman.h, we in turn include linux/mm.h prior
>> to declaring MADV_NOHUGEPAGE.
>> * This then, in turn, includes linux/huge_mm.h.
>> * linux/huge_mm.h declares hugepage_madvise(), which then tries to
>> reference MADV_NOHUGEPAGE, and the build fails.
>>
>> This can be reached in other ways too.
>>
>> So we work around this by including uapi/asm/mman.h instead, which allows
>> us to keep hugepage_madvise() inline.
>>
>> The second issue is that the s390 arch declares PROT_NONE as a value in the
>> enum prot_type enumeration.
>>
>> By updating the include in linux/huge_mm.h, we pull in the PROT_NONE
>> declaration (unavoidably, this is ultimately in
>> uapi/asm-generic/mman-common.h alongside MADV_NOHUGEPAGE), which collides
>> with the enumeration value.
>>
>> To resolve this, we rename PROT_NONE to PROT_TYPE_DUMMY.
>>
>> The ordering of these patches is critical, the s390 patch must be applied
>> prior to the MADV_NOHUGEPAGE patch, and therefore the two patches are sent
>> as a series.
>>
>> v1:
>> * Place patches in series.
>> * Correct typo in comment as per James.
>>
>> previous patches:
>> huge_mm.h patch - https://lore.kernel.org/all/20250508-madvise-nohugepage-noop-without-thp-v1-1-e7ceffb197f3@kuka.com/
>> s390 patch - https://lore.kernel.org/all/20250514163530.119582-1-lorenzo.stoakes@oracle.com/
>>
>> Ignacio Moreno Gonzalez (1):
>> mm: madvise: make MADV_NOHUGEPAGE a no-op if !THP
>>
>> Lorenzo Stoakes (1):
>> KVM: s390: rename PROT_NONE to PROT_TYPE_DUMMY
>>
>> arch/s390/kvm/gaccess.c | 8 ++++----
>> include/linux/huge_mm.h | 5 +++++
>> 2 files changed, 9 insertions(+), 4 deletions(-)
>>
>> --
>> 2.49.0
Powered by blists - more mailing lists