lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20251202130224.16376-1-xieyuanbin1@huawei.com>
Date: Tue, 2 Dec 2025 21:02:24 +0800
From: Xie Yuanbin <xieyuanbin1@...wei.com>
To: <linux@...linux.org.uk>
CC: <akpm@...ux-foundation.org>, <brauner@...nel.org>,
	<catalin.marinas@....com>, <hch@....de>, <jack@...e.com>,
	<linux-arm-kernel@...ts.infradead.org>, <linux-fsdevel@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>, <linux-mm@...ck.org>,
	<pangliyuan1@...wei.com>, <torvalds@...ux-foundation.org>,
	<wangkefeng.wang@...wei.com>, <will@...nel.org>, <wozizhi@...weicloud.com>,
	<xieyuanbin1@...wei.com>, <yangerkun@...wei.com>
Subject: Re: [Bug report] hash_name() may cross page boundary and trigger sleep in RCU context

On Tue, 2 Dec 2025 12:43:32 +0000, Russell King (Oracle) wrote:
> We have another issue in the code - which has the branch predictor
> hardening for spectre issues, which can be called with interrupts
> enabled, causing a kernel warning - obviously not good.
>
> There's another issue which PREEMPT_RT has picked up on - which is
> that delivering signals via __do_user_fault() with interrupts disabled
> causes spinlocks (which can sleep on PREEMPT_RT) to warn.
>
> What I'm thinking is to address both of these by handling kernel space
> page faults (which will be permission or PTE-not-present) separately
> (not even build tested):
>
> diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
> index 2bc828a1940c..972bce697c6c 100644
> --- a/arch/arm/mm/fault.c
> +++ b/arch/arm/mm/fault.c
> @@ -175,7 +175,8 @@ __do_kernel_fault(struct mm_struct *mm, unsigned long addr, unsigned int fsr,
>
>  /*
>   * Something tried to access memory that isn't in our memory map..
> - * User mode accesses just cause a SIGSEGV
> + * User mode accesses just cause a SIGSEGV. Ensure interrupts are enabled
> + * here, which is safe as the fault being handled is from userspace.
>   */
>  static void
>  __do_user_fault(unsigned long addr, unsigned int fsr, unsigned int sig,
> @@ -183,8 +184,7 @@ __do_user_fault(unsigned long addr, unsigned int fsr, unsigned int sig,
>  {
>  	struct task_struct *tsk = current;
> 
> -	if (addr > TASK_SIZE)
> -		harden_branch_predictor();
> +	local_irq_enable();
>
>  #ifdef CONFIG_DEBUG_USER
>  	if (((user_debug & UDBG_SEGV) && (sig == SIGSEGV)) ||
> @@ -259,6 +259,38 @@ static inline bool ttbr0_usermode_access_allowed(struct pt_regs *regs)
>  }
>  #endif
>
> +static int __kprobes
> +do_kernel_address_page_fault(unsigned long addr, unsigned int fsr,
> +			     struct pt_regs *regs)
> +{
> +	if (user_mode(regs)) {
> +		/*
> +		 * Fault from user mode for a kernel space address. User mode
> +		 * should not be faulting in kernel space, which includes the
> +		 * vector/khelper page. Handle the Spectre issues while
> +		 * interrupts are still disabled, then send a SIGSEGV. Note
> +		 * that __do_user_fault() will enable interrupts.
> +		 */
> +		harden_branch_predictor();
> +		__do_user_fault(addr, fsr, SIGSEGV, SEGV_MAPERR, regs);
> +	} else {
> +		/*
> +		 * Fault from kernel mode. Enable interrupts if they were
> +		 * enabled in the parent context. Section (upper page table)
> +		 * translation faults are handled via do_translation_fault(),
> +		 * so we will only get here for a non-present kernel space
> +		 * PTE or kernel space permission fault. Both of these should
> +		 * not happen.
> +		 */
> +		if (interrupts_enabled(regs))
> +			local_irq_enable();
> +
> +		__do_kernel_fault(mm, addr, fsr, regs);
> +	}
> +
> +	return 0;
> +}
> +
>  static int __kprobes
>  do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
>  {
> @@ -272,6 +304,8 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
>  	if (kprobe_page_fault(regs, fsr))
>  		return 0;
>
> +	if (addr >= TASK_SIZE)
> +		return do_kernel_address_page_fault(addr, fsr, regs);
>
>  	/* Enable interrupts if they were enabled in the parent context. */
>  	if (interrupts_enabled(regs))
>
> ... and I think there was a bug in the branch predictor handling -
> addr == TASK_SIZE should have been included.
>
> Does this look sensible?

Hi, Russell King!

This patch removes
```c
	if (addr > TASK_SIZE)
		harden_branch_predictor();
```
from do_user_fault(), and adds it to do_page_fault()->
do_kernel_address_page_fault().

However, do_user_fault() is not only called by do_page_fault(). It is
also called by do_bad_area(), do_sect_fault() and do_translation_fault().
I am not sure that if this will lead to some missing
harden_branch_predictor() mitigation.

What about something like this:
```patch
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index 2bc828a1940c..5c58072d8235 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -270,10 +270,15 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 	vm_flags_t vm_flags = VM_ACCESS_FLAGS;

 	if (kprobe_page_fault(regs, fsr))
 		return 0;

+	if (unlikely(addr >= TASK_SIZE)) {
+		fault = 0;
+		code = SEGV_MAPERR;
+		goto bad_area;
+	}

 	/* Enable interrupts if they were enabled in the parent context. */
 	if (interrupts_enabled(regs))
 		local_irq_enable();
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index 5c58072d8235..f8ee1854c854 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -184,10 +184,13 @@ __do_user_fault(unsigned long addr, unsigned int fsr, unsigned int sig,
 	struct task_struct *tsk = current;
 
 	if (addr > TASK_SIZE)
 		harden_branch_predictor();
 
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		local_irq_enable();
+
 #ifdef CONFIG_DEBUG_USER
 	if (((user_debug & UDBG_SEGV) && (sig == SIGSEGV)) ||
 	    ((user_debug & UDBG_BUS)  && (sig == SIGBUS))) {
 		pr_err("8<--- cut here ---\n");
 		pr_err("%s: unhandled page fault (%d) at 0x%08lx, code 0x%03x\n",
```

Thanks very much!

Xie Yuanbin

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ