[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <14030818-52e7-41eb-8ad7-602f3476d448@redhat.com>
Date: Mon, 27 Oct 2025 17:24:57 +0100
From: David Hildenbrand <david@...hat.com>
To: Kevin Brodsky <kevin.brodsky@....com>, linux-mm@...ck.org
Cc: linux-kernel@...r.kernel.org, Alexander Gordeev <agordeev@...ux.ibm.com>,
Andreas Larsson <andreas@...sler.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Boris Ostrovsky <boris.ostrovsky@...cle.com>, Borislav Petkov
<bp@...en8.de>, Catalin Marinas <catalin.marinas@....com>,
Christophe Leroy <christophe.leroy@...roup.eu>,
Dave Hansen <dave.hansen@...ux.intel.com>,
"David S. Miller" <davem@...emloft.net>, "H. Peter Anvin" <hpa@...or.com>,
Ingo Molnar <mingo@...hat.com>, Jann Horn <jannh@...gle.com>,
Juergen Gross <jgross@...e.com>, "Liam R. Howlett"
<Liam.Howlett@...cle.com>, Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
Madhavan Srinivasan <maddy@...ux.ibm.com>,
Michael Ellerman <mpe@...erman.id.au>, Michal Hocko <mhocko@...e.com>,
Mike Rapoport <rppt@...nel.org>, Nicholas Piggin <npiggin@...il.com>,
Peter Zijlstra <peterz@...radead.org>, Ryan Roberts <ryan.roberts@....com>,
Suren Baghdasaryan <surenb@...gle.com>, Thomas Gleixner
<tglx@...utronix.de>, Vlastimil Babka <vbabka@...e.cz>,
Will Deacon <will@...nel.org>, Yeoreum Yun <yeoreum.yun@....com>,
linux-arm-kernel@...ts.infradead.org, linuxppc-dev@...ts.ozlabs.org,
sparclinux@...r.kernel.org, xen-devel@...ts.xenproject.org, x86@...nel.org
Subject: Re: [PATCH v3 06/13] mm: introduce generic lazy_mmu helpers
On 24.10.25 16:32, Kevin Brodsky wrote:
> On 24/10/2025 15:27, David Hildenbrand wrote:
>> On 24.10.25 14:13, Kevin Brodsky wrote:
>>> On 23/10/2025 21:52, David Hildenbrand wrote:
>>>> On 15.10.25 10:27, Kevin Brodsky wrote:
>>>>> [...]
>>>>>
>>>>> * madvise_*_pte_range() call arch_leave() in multiple paths, some
>>>>> followed by an immediate exit/rescheduling and some followed by a
>>>>> conditional exit. These functions assume that they are called
>>>>> with lazy MMU disabled and we cannot simply use pause()/resume()
>>>>> to address that. This patch leaves the situation unchanged by
>>>>> calling enable()/disable() in all cases.
>>>>
>>>> I'm confused, the function simply does
>>>>
>>>> (a) enables lazy mmu
>>>> (b) does something on the page table
>>>> (c) disables lazy mmu
>>>> (d) does something expensive (split folio -> take sleepable locks,
>>>> flushes tlb)
>>>> (e) go to (a)
>>>
>>> That step is conditional: we exit right away if pte_offset_map_lock()
>>> fails. The fundamental issue is that pause() must always be matched with
>>> resume(), but as those functions look today there is no situation where
>>> a pause() would always be matched with a resume().
>>
>> We have matches enable/disable, so my question is rather "why" you are
>> even thinking about using pause/resume?
>>
>> What would be the benefit of that? If there is no benefit then just
>> drop this from the patch description as it's more confusing than just
>> ... doing what the existing code does :)
>
> Ah sorry I misunderstood, I guess you originally meant: why would we use
> pause()/resume()?
>
> The issue is the one I mentioned in the commit message: using
> enable()/disable(), we assume that the functions are called with lazy
> MMU mode is disabled. Consider:
>
> lazy_mmu_mode_enable()
> madvise_cold_or_pageout_pte_range():
> lazy_mmu_mode_enable()
> ...
> if (need_resched()) {
> lazy_mmu_mode_disable()
> cond_resched() // lazy MMU still enabled
> }
>
> This will explode on architectures that do not allow sleeping while in
> lazy MMU mode. I'm not saying this is an actual problem - I don't see
> why those functions would be called with lazy MMU mode enabled. But it
> does go against the notion that nesting works everywhere.
I would tackle it from a different direction: if code calls with lazy
MMU enabled into random other code that might sleep, that caller would
be wrong.
It's not about changing functions like this to use pause/resume.
Maybe the rule is simple: if you enable the lazy MMU, don't call any
functions that might sleep.
Maybe we could support that later by handling it before/after sleeping,
if ever required?
Or am I missing something regarding your point on pause()/resume()?
--
Cheers
David / dhildenb
Powered by blists - more mailing lists