Message-ID: <20251129090813.GK3538@ZenIV>
Date: Sat, 29 Nov 2025 09:08:13 +0000
From: Al Viro <viro@...iv.linux.org.uk>
To: Xie Yuanbin <xieyuanbin1@...wei.com>
Cc: torvalds@...ux-foundation.org, will@...nel.org, linux@...linux.org.uk,
bigeasy@...utronix.de, rmk+kernel@...linux.org.uk,
akpm@...ux-foundation.org, brauner@...nel.org,
catalin.marinas@....com, hch@....de, jack@...e.com,
linux-arm-kernel@...ts.infradead.org, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
pangliyuan1@...wei.com, wangkefeng.wang@...wei.com,
wozizhi@...weicloud.com, yangerkun@...wei.com, lilinjie8@...wei.com,
liaohua4@...wei.com
Subject: Re: [Bug report] hash_name() may cross page boundary and trigger
On Sat, Nov 29, 2025 at 12:08:17PM +0800, Xie Yuanbin wrote:
> I think the `user_mode(regs)` check is necessary because the label
> no_context actually jumps to __do_kernel_fault(), whereas page fault
> from user mode should jump to `__do_user_fault()`.
>
> Alternatively, we would need to change `goto no_context` to
> `goto bad_area`. Or perhaps I misunderstood something, please point it out.
FWIW, goto bad_area has an obvious problem: it uses the value of 'fault',
which contains garbage.
The cause of the problem is the heuristic in get_mmap_lock_carefully():
	if (regs && !user_mode(regs)) {
		unsigned long ip = exception_ip(regs);
		if (!search_exception_tables(ip))
			return false;
	}
At that point trylock has failed and we are trying to decide whether it's
safe to block.  The assumption (inherited from the old logic in assorted
page fault handlers) is "by that point we know that a fault in kernel mode
is either an oops or #PF on uaccess; in the latter case we should be OK
with locking mm, in the former we should just get to oopsing without
risking deadlocks".
load_unaligned_zeropad() is where that assumption breaks - there is
an exception handler, but it's not a uaccess attempt; the address is
not going to match any VMA and we really don't want to do anything
blocking.
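To make the failure mode concrete, here is a toy userspace model of that
check.  It is only a sketch: none of the names are kernel API, and the
exception table lookup is reduced to a single boolean, which is an
assumption made purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: every name here is hypothetical, nothing is kernel API. */
struct toy_fault {
	bool user_mode;	/* stands in for user_mode(regs) */
	bool has_fixup;	/* stands in for a successful search_exception_tables(ip) */
};

/* Models the quoted heuristic: trylock failed, may we block on the mm lock? */
static bool toy_may_block(const struct toy_fault *f)
{
	if (!f->user_mode && !f->has_fixup)
		return false;	/* looks like an oops: don't risk a deadlock */
	return true;		/* assumed to be userland or uaccess */
}

static void toy_demo(void)
{
	struct toy_fault oops    = { .user_mode = false, .has_fixup = false };
	struct toy_fault uaccess = { .user_mode = false, .has_fixup = true };
	/* load_unaligned_zeropad(): kernel mode, has a fixup, not uaccess */
	struct toy_fault zeropad = { .user_mode = false, .has_fixup = true };

	assert(!toy_may_block(&oops));
	assert(toy_may_block(&uaccess));
	/* The model cannot tell this apart from uaccess, so it allows blocking. */
	assert(toy_may_block(&zeropad));
}
```

The last case has exactly the same inputs as the uaccess one, which is the
whole problem: the check has nothing to distinguish them by.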
Note that VMA lookup will return NULL there anyway - there won't be a VMA
for that address. What we get is exactly the same thing we'd get from
do_bad_area(), whether we get a kernel or userland insn faulting.
The minimal fix would be something like
	if (unlikely(addr >= TASK_SIZE) && !(flags & FAULT_FLAG_USER))
		goto no_context;
right before
	if (!(flags & FAULT_FLAG_USER))
		goto lock_mmap;
in do_page_fault(). Alternatively,
	if (unlikely(addr >= TASK_SIZE)) {
		do_bad_area(addr, fsr, regs);
		return 0;
	}
or
	if (unlikely(addr >= TASK_SIZE)) {
		fault = 0;
		code = SEGV_MAPERR;
		goto bad_area;
	}
at the same place. Incidentally, making do_bad_area() return 0 would
seem to make all callers happier...
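For what it's worth, the ordering in the first (minimal) variant can be
modelled in userspace as below.  This is a sketch under stated assumptions,
not the actual do_page_fault(): the TOY_* names and the TOY_TASK_SIZE value
are invented for illustration, and TOY_USER_PATH simply stands for whatever
the handler does next for userland faults.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical value; the real TASK_SIZE depends on the configuration. */
#define TOY_TASK_SIZE	0xbf000000UL

/* Hypothetical outcomes standing in for the labels named in the text. */
enum toy_path { TOY_NO_CONTEXT, TOY_LOCK_MMAP, TOY_USER_PATH };

/* user_fault stands in for (flags & FAULT_FLAG_USER). */
static enum toy_path toy_dispatch(unsigned long addr, bool user_fault)
{
	/* Proposed check: kernel-mode fault above TASK_SIZE never takes the lock. */
	if (addr >= TOY_TASK_SIZE && !user_fault)
		return TOY_NO_CONTEXT;
	/* Existing check: remaining kernel-mode faults go to lock_mmap. */
	if (!user_fault)
		return TOY_LOCK_MMAP;
	return TOY_USER_PATH;
}

static void toy_dispatch_check(void)
{
	/* e.g. a zeropad load straying past the end of a kernel page */
	assert(toy_dispatch(0xc0000ffcUL, false) == TOY_NO_CONTEXT);
	/* kernel-mode fault on a userland address (uaccess) is unaffected */
	assert(toy_dispatch(0x00010000UL, false) == TOY_LOCK_MMAP);
	/* userland faults are not touched by this variant */
	assert(toy_dispatch(0xc0000000UL, true) == TOY_USER_PATH);
}
```

The point of placing the new test first is that the zeropad case is decided
before anything that might sleep is even considered.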