[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aS7e9CbQXS27sGcd@shell.armlinux.org.uk>
Date: Tue, 2 Dec 2025 12:43:32 +0000
From: "Russell King (Oracle)" <linux@...linux.org.uk>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Zizhi Wo <wozizhi@...weicloud.com>,
Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will@...nel.org>, jack@...e.com, brauner@...nel.org,
hch@....de, akpm@...ux-foundation.org,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, linux-arm-kernel@...ts.infradead.org,
yangerkun@...wei.com, wangkefeng.wang@...wei.com,
pangliyuan1@...wei.com, xieyuanbin1@...wei.com
Subject: Re: [Bug report] hash_name() may cross page boundary and trigger
sleep in RCU context
On Fri, Nov 28, 2025 at 09:06:50AM -0800, Linus Torvalds wrote:
> I don't think it's necessarily all that big of a deal. Yeah, this is
> old code, and yeah, it could probably be cleaned up a bit, but at the
> same time, "old and crusty" also means "fairly well tested". This
> whole fault on a kernel address is a fairly unusual case, and as
> mentioned, I *think* the above fix is sufficient.
We have another issue in the code - which has the branch predictor
hardening for spectre issues, which can be called with interrupts
enabled, causing a kernel warning - obviously not good.
There's another issue which PREEMPT_RT has picked up on - which is
that delivering signals via __do_user_fault() with interrupts disabled
causes spinlocks (which can sleep on PREEMPT_RT) to warn.
What I'm thinking is to address both of these by handling kernel space
page faults (which will be permission or PTE-not-present) separately
(not even build tested):
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index 2bc828a1940c..972bce697c6c 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -175,7 +175,8 @@ __do_kernel_fault(struct mm_struct *mm, unsigned long addr, unsigned int fsr,
/*
* Something tried to access memory that isn't in our memory map..
- * User mode accesses just cause a SIGSEGV
+ * User mode accesses just cause a SIGSEGV. Ensure interrupts are enabled
+ * here, which is safe as the fault being handled is from userspace.
*/
static void
__do_user_fault(unsigned long addr, unsigned int fsr, unsigned int sig,
@@ -183,8 +184,7 @@ __do_user_fault(unsigned long addr, unsigned int fsr, unsigned int sig,
{
struct task_struct *tsk = current;
- if (addr > TASK_SIZE)
- harden_branch_predictor();
+ local_irq_enable();
#ifdef CONFIG_DEBUG_USER
if (((user_debug & UDBG_SEGV) && (sig == SIGSEGV)) ||
@@ -259,6 +259,38 @@ static inline bool ttbr0_usermode_access_allowed(struct pt_regs *regs)
}
#endif
+static int __kprobes
+do_kernel_address_page_fault(unsigned long addr, unsigned int fsr,
+ struct pt_regs *regs)
+{
+ if (user_mode(regs)) {
+ /*
+ * Fault from user mode for a kernel space address. User mode
+ * should not be faulting in kernel space, which includes the
+ * vector/khelper page. Handle the Spectre issues while
+ * interrupts are still disabled, then send a SIGSEGV. Note
+ * that __do_user_fault() will enable interrupts.
+ */
+ harden_branch_predictor();
+ __do_user_fault(addr, fsr, SIGSEGV, SEGV_MAPERR, regs);
+ } else {
+ /*
+ * Fault from kernel mode. Enable interrupts if they were
+ * enabled in the parent context. Section (upper page table)
+ * translation faults are handled via do_translation_fault(),
+ * so we will only get here for a non-present kernel space
+ * PTE or kernel space permission fault. Both of these should
+ * not happen.
+ */
+ if (interrupts_enabled(regs))
+ local_irq_enable();
+
+ __do_kernel_fault(mm, addr, fsr, regs);
+ }
+
+ return 0;
+}
+
static int __kprobes
do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
{
@@ -272,6 +304,8 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
if (kprobe_page_fault(regs, fsr))
return 0;
+ if (addr >= TASK_SIZE)
+ return do_kernel_address_page_fault(addr, fsr, regs);
/* Enable interrupts if they were enabled in the parent context. */
if (interrupts_enabled(regs))
... and I think there was a bug in the branch predictor handling -
addr == TASK_SIZE should have been included.
Does this look sensible?
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
Powered by blists - more mailing lists