Message-ID: <836f2574-cb60-44c5-865c-7f13a90779ec@redhat.com>
Date: Mon, 12 May 2025 14:05:45 +0200
From: David Hildenbrand <david@...hat.com>
To: Ryan Roberts <ryan.roberts@....com>,
Catalin Marinas <catalin.marinas@....com>, Will Deacon <will@...nel.org>,
Pasha Tatashin <pasha.tatashin@...een.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Uladzislau Rezki <urezki@...il.com>, Christoph Hellwig <hch@...radead.org>,
"Matthew Wilcox (Oracle)" <willy@...radead.org>,
Mark Rutland <mark.rutland@....com>,
Anshuman Khandual <anshuman.khandual@....com>,
Alexandre Ghiti <alexghiti@...osinc.com>,
Kevin Brodsky <kevin.brodsky@....com>
Cc: linux-arm-kernel@...ts.infradead.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org,
syzbot+5c0d9392e042f41d45c5@...kaller.appspotmail.com
Subject: Re: [PATCH] arm64/mm: Disable barrier batching in interrupt contexts
>>>  static inline void arch_leave_lazy_mmu_mode(void)
>>>  {
>>> +	if (in_interrupt())
>>> +		return;
>>> +
>>>  	arch_flush_lazy_mmu_mode();
>>>  	clear_thread_flag(TIF_LAZY_MMU);
>>>  }
>>
>> I guess in all cases we could optimize out the in_interrupt() check on !debug
>> configs.
>
> I think that assumes we can easily and accurately identify all configs that
> cause this? We've identified 2 but I'm not confident that it's a full list.

Agreed. I was wondering if we could convert the problematic users to a
different set of pte helpers, whereby these helpers would not be
available without CONFIG_WHATEVER. Then, make these features select
CONFIG_WHATEVER.

VM_WARN_ON_* would be used to catch any violations / wrong use of pte
helpers.

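To make the idea above a bit more concrete, here is a rough userspace
model (not kernel code, and untested against the kernel). Everything
prefixed sim_ or SIM_ is made up for illustration, the split into a
"regular" and an "irqsafe" helper is an assumption about how this could
look, and CONFIG_WHATEVER is still just a placeholder:

```c
/* Userspace model only: every sim_ / SIM_ name is hypothetical. */
#include <assert.h>
#include <stdbool.h>

/* Placeholder symbol; features such as KFENCE would select it. */
#define CONFIG_WHATEVER 1

static bool sim_in_interrupt;	/* stand-in for in_interrupt() */
static bool sim_lazy_mmu;	/* stand-in for TIF_LAZY_MMU */
static int sim_pending;		/* pte updates whose barrier was deferred */
static int sim_barriers;	/* barriers actually emitted */
static int sim_warnings;	/* number of VM_WARN_ON-style hits */

/* Counts violations instead of printing a warning. */
#define SIM_VM_WARN_ON(cond) do { if (cond) sim_warnings++; } while (0)

/* Regular helper: may batch, so it must not run in interrupt context. */
static void sim_set_pte(void)
{
	SIM_VM_WARN_ON(sim_in_interrupt);

	if (sim_lazy_mmu)
		sim_pending++;
	else
		sim_barriers++;
}

#ifdef CONFIG_WHATEVER
/* Interrupt-safe helper: never batches, only built with CONFIG_WHATEVER. */
static void sim_set_pte_irqsafe(void)
{
	sim_barriers++;
}
#endif

static void sim_enter_lazy(void)
{
	sim_lazy_mmu = true;
}

/* Leaving lazy mode emits one barrier if any update was deferred. */
static void sim_leave_lazy(void)
{
	if (sim_pending)
		sim_barriers++;
	sim_pending = 0;
	sim_lazy_mmu = false;
}
```

With that split, the regular helper would not need a runtime check on
production configs; misuse from interrupt context would only be flagged
where the VM_WARN_ON_* checks are compiled in.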
> Also, KFENCE isn't really a debug config (despite me calling it that in the
> commit log) - it's supposed to be something that can be enabled in production
> builds.

Agreed. Even Fedora has it.

>
>>
>> Hm, maybe there is an elegant way to catch all of these "problematic" users?
>
> I'm all ears if you have any suggestions? :)
>
>
> It actually looks like x86/XEN tries to solve this problem in a similar way:

Heh, yes. Good old xen ...

>
> enum xen_lazy_mode xen_get_lazy_mode(void)
> {
> 	if (in_interrupt())
> 		return XEN_LAZY_NONE;
>
> 	return this_cpu_read(xen_lazy_mode);
> }
>
> Although I'm not convinced it's fully robust. It also has:
>
> static inline void enter_lazy(enum xen_lazy_mode mode)
> {
> 	BUG_ON(this_cpu_read(xen_lazy_mode) != XEN_LAZY_NONE);
>
> 	this_cpu_write(xen_lazy_mode, mode);
> }
>
> which is called as part of its arch_enter_lazy_mmu_mode() implementation. If a
> task was already in lazy mmu mode when an interrupt comes in and causes the
> nested arch_enter_lazy_mmu_mode() that we saw in this bug report, surely that
> BUG_ON() should trigger?

Hm, good point. But that code is old, so presumably something is
preventing that?

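For what it's worth, the scenario can be modelled in userspace with the
two quoted functions. This is only a sketch: the xsim_ names are
invented, the per-CPU variable is a plain global, and BUG_ON() is
replaced by a counter. It says nothing about whether real Xen code can
actually reach this path:

```c
/* Userspace model of the two quoted Xen helpers; xsim_ names are made up. */
#include <assert.h>
#include <stdbool.h>

enum xsim_lazy_mode { XSIM_LAZY_NONE, XSIM_LAZY_MMU };

static bool xsim_irq;					/* stand-in for in_interrupt() */
static enum xsim_lazy_mode xsim_mode = XSIM_LAZY_NONE;	/* stand-in for the per-CPU mode */
static int xsim_bug_hits;				/* times the BUG_ON() condition was true */

/* Mirrors xen_get_lazy_mode(): reports "not lazy" from interrupt context. */
static enum xsim_lazy_mode xsim_get_lazy_mode(void)
{
	if (xsim_irq)
		return XSIM_LAZY_NONE;

	return xsim_mode;
}

/* Mirrors enter_lazy(): checks the raw mode, not the interrupt-aware getter. */
static void xsim_enter_lazy(enum xsim_lazy_mode mode)
{
	if (xsim_mode != XSIM_LAZY_NONE)
		xsim_bug_hits++;

	xsim_mode = mode;
}
```

In this model, entering lazy mode in task context and then entering it
again with xsim_irq set bumps xsim_bug_hits, even though the getter
reports XSIM_LAZY_NONE at that point.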
In any case, that was just a thought on the in_interrupt() check. I
think this commit is good enough as is.

--
Cheers,
David / dhildenb