Message-ID: <20251129090813.GK3538@ZenIV>
Date: Sat, 29 Nov 2025 09:08:13 +0000
From: Al Viro <viro@...iv.linux.org.uk>
To: Xie Yuanbin <xieyuanbin1@...wei.com>
Cc: torvalds@...ux-foundation.org, will@...nel.org, linux@...linux.org.uk,
	bigeasy@...utronix.de, rmk+kernel@...linux.org.uk,
	akpm@...ux-foundation.org, brauner@...nel.org,
	catalin.marinas@....com, hch@....de, jack@...e.com,
	linux-arm-kernel@...ts.infradead.org, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	pangliyuan1@...wei.com, wangkefeng.wang@...wei.com,
	wozizhi@...weicloud.com, yangerkun@...wei.com, lilinjie8@...wei.com,
	liaohua4@...wei.com
Subject: Re: [Bug report] hash_name() may cross page boundary and trigger

On Sat, Nov 29, 2025 at 12:08:17PM +0800, Xie Yuanbin wrote:

> I think the `user_mode(regs)` check is necessary because the label
> no_context actually jumps to __do_kernel_fault(), whereas page fault
> from user mode should jump to `__do_user_fault()`.
> 
> Alternatively, we would need to change `goto no_context` to
> `goto bad_area`. Or perhaps I misunderstood something, please point it out.

FWIW, goto bad_area has an obvious problem: the code there uses the value
of 'fault', which contains garbage at that point.

The cause of the problem is the heuristic in get_mmap_lock_carefully():
	if (regs && !user_mode(regs)) {
		unsigned long ip = exception_ip(regs);
		if (!search_exception_tables(ip))
			return false;
	}
The trylock has failed and we are trying to decide whether it's safe to block.
The assumption (inherited from old logic in assorted page fault handlers)
is "by that point we know that a fault in kernel mode is either an oops
or #PF on uaccess; in the latter case we should be OK with locking mm,
in the former we should just get to oopsing without risking deadlocks".

load_unaligned_zeropad() is where that assumption breaks - there is
an exception handler and it's not a uaccess attempt; the address is
not going to match any VMA and we really don't want to do anything
blocking.
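
To make the mismatch concrete, here is a toy userland model of that
decision.  It is only a sketch: the enum and all function names below are
made up for illustration and are not kernel interfaces; the only thing
taken from the above is the shape of the check (a kernel-mode fault with
no exception table entry must not block).

```c
#include <stdbool.h>

/* Toy model, not kernel code: every name here is hypothetical. */
enum fault_origin {
	ORIGIN_USER,           /* fault taken from user mode */
	ORIGIN_KERNEL_UACCESS, /* kernel uaccess; has an exception table entry */
	ORIGIN_KERNEL_ZEROPAD, /* load_unaligned_zeropad(); has an entry, not uaccess */
	ORIGIN_KERNEL_BUG,     /* stray kernel access; no exception table entry */
};

static bool is_user_mode(enum fault_origin o)
{
	return o == ORIGIN_USER;
}

static bool has_exception_entry(enum fault_origin o)
{
	return o == ORIGIN_KERNEL_UACCESS || o == ORIGIN_KERNEL_ZEROPAD;
}

/* Same shape as the quoted check: is blocking on the mm lock allowed? */
static bool heuristic_allows_blocking(enum fault_origin o)
{
	if (!is_user_mode(o) && !has_exception_entry(o))
		return false;
	return true;
}

/* What we actually want: block only for user faults and real uaccess. */
static bool blocking_is_wanted(enum fault_origin o)
{
	return o == ORIGIN_USER || o == ORIGIN_KERNEL_UACCESS;
}
```

In this model the two predicates agree everywhere except for
ORIGIN_KERNEL_ZEROPAD, where the heuristic says "fine to block" and we
would rather not.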

Note that VMA lookup will return NULL there anyway - there won't be a VMA
for that address.  What we get is exactly the same thing we'd get from
do_bad_area(), whether we get a kernel or userland insn faulting.

The minimal fix would be something like
	if (unlikely(addr >= TASK_SIZE) && !(flags & FAULT_FLAG_USER))
		goto no_context;

right before
	if (!(flags & FAULT_FLAG_USER))
		goto lock_mmap;

in do_page_fault().  Alternatively,
	if (unlikely(addr >= TASK_SIZE)) {
		do_bad_area(addr, fsr, regs);
		return 0;
	}
or
	if (unlikely(addr >= TASK_SIZE)) {
		fault = 0;
		code = SEGV_MAPERR;
		goto bad_area;
	}
at the same place.  Incidentally, making do_bad_area() return 0 would
seem to make all callers happier...
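
For what it's worth, the effect of the first (minimal) variant on the
dispatch can be sketched in userland as below.  This is an illustration
under assumptions, not a patch: TOY_TASK_SIZE and TOY_FAULT_FLAG_USER are
placeholder values rather than the real constants, and the enum merely
names which way the real function would go.

```c
#include <stdbool.h>

/* Placeholder values for illustration; not the kernel's definitions. */
#define TOY_TASK_SIZE       0xbf000000UL
#define TOY_FAULT_FLAG_USER 0x1u

enum toy_path {
	TOY_NO_CONTEXT, /* kernel fault above TASK_SIZE: no mm locking at all */
	TOY_LOCK_MMAP,  /* kernel fault on a user address: existing behaviour */
	TOY_CONTINUE,   /* user fault: falls through to whatever follows */
};

/* Models the proposed check followed by the existing one. */
static enum toy_path toy_dispatch(unsigned long addr, unsigned int flags)
{
	bool user = flags & TOY_FAULT_FLAG_USER;

	/* proposed: kernel-mode fault on a non-user address bails out early */
	if (addr >= TOY_TASK_SIZE && !user)
		return TOY_NO_CONTEXT;

	/* existing: remaining kernel-mode faults go to lock_mmap */
	if (!user)
		return TOY_LOCK_MMAP;

	return TOY_CONTINUE;
}
```

With these placeholder values, toy_dispatch(0xc0000000UL, 0) takes the
no_context route without ever reaching the locking heuristic, while
kernel-mode faults on user addresses and all user-mode faults go the same
way they did before.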
