linux-kernel - Re: [Bug report] hash_name() may cross page boundary and trigger sleep in RCU context


Message-ID: <20251129033728.GH3538@ZenIV>
Date: Sat, 29 Nov 2025 03:37:28 +0000
From: Al Viro <viro@...iv.linux.org.uk>
To: Zizhi Wo <wozizhi@...weicloud.com>
Cc: torvalds@...ux-foundation.org, jack@...e.com, brauner@...nel.org,
	hch@....de, akpm@...ux-foundation.org, linux@...linux.org.uk,
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
	linux-mm@...ck.org, linux-arm-kernel@...ts.infradead.org,
	yangerkun@...wei.com, wangkefeng.wang@...wei.com,
	pangliyuan1@...wei.com, xieyuanbin1@...wei.com
Subject: Re: [Bug report] hash_name() may cross page boundary and trigger
 sleep in RCU context

On Thu, Nov 27, 2025 at 10:24:19AM +0800, Zizhi Wo wrote:

> Why does x86 have special handling in do_kern_addr_fault(), including
> logic for vmalloc faults? For example, on CONFIG_X86_32, it still takes
> the vmalloc_fault path. As noted in the x86 comments, "We can fault-in
> kernel-space virtual memory on-demand"...
> 
> But on arm64, I don’t see similar logic — is there a specific reason
> for this difference? Maybe x86's vmalloc area is mapped lazily, while
> ARM maps it fully during early boot?

The x86 MMU uses a single register for the top-level page table, covering
both kernel and userland; the arm64 MMU has separate page tables for those -
TTBR0 and TTBR1 each point to a table, and bit 55 of the virtual address
selects which of the two is used for translation.
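
As an illustration of the bit-55 rule above, here is a minimal userspace
sketch. It is a toy model, not kernel code: the enum, the function names and
the sample addresses are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: which translation table base register would be consulted. */
enum toy_ttbr { TOY_TTBR0 = 0, TOY_TTBR1 = 1 };

/* Bit 55 of the virtual address picks the table: 0 -> TTBR0, 1 -> TTBR1. */
static enum toy_ttbr toy_ttbr_for_va(uint64_t va)
{
	return ((va >> 55) & 1) ? TOY_TTBR1 : TOY_TTBR0;
}

/* Usage example with made-up addresses. */
static void toy_ttbr_demo(void)
{
	/* A low (userland-style) address has bit 55 clear. */
	assert(toy_ttbr_for_va(UINT64_C(0x0000aaaadeadb000)) == TOY_TTBR0);
	/* A high (kernel-style) address has bit 55 set. */
	assert(toy_ttbr_for_va(UINT64_C(0xffff800010000000)) == TOY_TTBR1);
}
```

Since the kernel half always goes through TTBR1, a change made to the table
TTBR1 points to is seen by every thread without any per-thread copying.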

vmalloc works with the page table of init_mm (see the uses of pgd_offset_k()
in there).  On arm64 that's it - TTBR1 is set to that and it stays that way,
so access to a vmalloc'ed area will do the right thing.

On 32bit x86 you need to propagate the change into the top-level page tables
of every thread.  That's what arch_sync_kernel_mappings() is for; look for
the calls in mm/vmalloc.c and see the discussion of the race in the comment
in front of x86 vmalloc_fault().  Nothing of that sort is needed on arm64,
since all threads are using the same page table for the kernel part of the
address space.
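
The need for propagation can be pictured with a toy model. This is a sketch
under stated assumptions, not the kernel implementation: every toy_* name and
both array sizes are hypothetical, and a "slot" merely stands in for a
top-level page table entry covering part of the kernel address range.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_SLOTS 4	/* toy number of top-level kernel slots */
#define TOY_NR_MMS   3	/* toy number of address spaces */

struct toy_pgd { void *slot[TOY_NR_SLOTS]; };

static struct toy_pgd toy_init_pgd;		/* stands in for init_mm's table */
static struct toy_pgd toy_mm_pgd[TOY_NR_MMS];	/* private top level per mm */

/* vmalloc-like step: the new lower-level table lands in the reference table only. */
static void toy_populate_init(int idx, void *table)
{
	toy_init_pgd.slot[idx] = table;
}

/* Eager propagation: copy the changed slot into every private top-level table. */
static void toy_sync_all(int idx)
{
	for (int i = 0; i < TOY_NR_MMS; i++)
		toy_mm_pgd[i].slot[idx] = toy_init_pgd.slot[idx];
}

/* Would an access through this mm's table miss at the top level? */
static int toy_would_fault(int mm, int idx)
{
	return toy_mm_pgd[mm].slot[idx] == NULL;
}

/* Usage example: the window between populate and sync is where a fault can hit. */
static void toy_sync_demo(void)
{
	static int lower_table;	/* placeholder for a freshly allocated table */

	toy_populate_init(1, &lower_table);
	assert(toy_would_fault(0, 1));	/* not propagated yet */
	toy_sync_all(1);
	assert(!toy_would_fault(0, 1));	/* visible through every table now */
}
```

With a single table shared by everyone, as on arm64, toy_populate_init()
alone would be enough and the window in toy_sync_demo() would not exist.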

The reason why 64bit x86 doesn't need to bother is different - there we
fill all relevant top-level page table slots in preallocate_vmalloc_pages()
before any additional threads could be created.  The pointers in those
slots are not going to change and they will be propagated to all subsequent
threads by pgd_alloc(), so the page tables actually modified by vmalloc()
are shared by all threads.
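
The preallocation argument can be sketched the same way. Again this is only
a toy model with hypothetical names (toy_preallocate(), toy_pgd_alloc() and
so on are illustrative stand-ins, not the kernel functions): the top-level
slots are filled once, later copies inherit the same pointers, and all
subsequent changes happen one level down, in tables that every copy shares.

```c
#include <assert.h>
#include <string.h>

#define TOY_NR_SLOTS   4	/* toy number of top-level slots */
#define TOY_NR_ENTRIES 8	/* toy number of entries per lower-level table */

struct toy_lower { int entry[TOY_NR_ENTRIES]; };
struct toy_pgd   { struct toy_lower *slot[TOY_NR_SLOTS]; };

static struct toy_lower toy_prealloc[TOY_NR_SLOTS];
static struct toy_pgd toy_init_pgd;

/* Fill every top-level slot up front, before any other table exists. */
static void toy_preallocate(void)
{
	for (int i = 0; i < TOY_NR_SLOTS; i++)
		toy_init_pgd.slot[i] = &toy_prealloc[i];
}

/* A new address space copies the (now fixed) top-level pointers. */
static void toy_pgd_alloc(struct toy_pgd *pgd)
{
	memcpy(pgd, &toy_init_pgd, sizeof(*pgd));
}

/* vmalloc-like step: only the shared lower-level table is modified. */
static void toy_vmalloc_map(int slot, int idx, int val)
{
	toy_init_pgd.slot[slot]->entry[idx] = val;
}

static int toy_lookup(const struct toy_pgd *pgd, int slot, int idx)
{
	return pgd->slot[slot]->entry[idx];
}

/* Usage example: a mapping made after the copy is still visible through it. */
static void toy_prealloc_demo(void)
{
	struct toy_pgd task_pgd;

	toy_preallocate();
	toy_pgd_alloc(&task_pgd);
	toy_vmalloc_map(2, 3, 42);
	assert(toy_lookup(&task_pgd, 2, 3) == 42);
}
```

No per-table propagation step appears anywhere in this model, which is the
point of filling the slots before additional threads can exist.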

AFAICS, 32bit arm is similar to 32bit x86 in that respect; propagation
is lazier, though - there arch_sync_kernel_mappings() bumps a counter
in init_mm and context switches use that to check if propagation needs
to be done.  No idea how well that works on the vfree() side of things -
haven't looked into that rabbit hole...
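
A rough sketch of the counter-based scheme described for 32bit arm, as a toy
model only: the structure layout, the toy_* names and the "copy everything
on mismatch" policy are assumptions for illustration and are not taken from
the arm code.

```c
#include <assert.h>
#include <string.h>

#define TOY_NR_SLOTS 4	/* toy number of top-level kernel slots */

/* Reference state, standing in for init_mm. */
static void *toy_init_slot[TOY_NR_SLOTS];
static unsigned long toy_init_seq;	/* bumped whenever kernel mappings change */

struct toy_mm {
	void *slot[TOY_NR_SLOTS];	/* private top-level kernel slots */
	unsigned long seq_seen;		/* counter value at the last propagation */
};

/* Lazy variant of the sync hook: only record that something changed. */
static void toy_sync_kernel_mappings(void)
{
	toy_init_seq++;
}

/* Context switch: propagate only if this mm is stale. Returns 1 if it copied. */
static int toy_switch_to(struct toy_mm *mm)
{
	if (mm->seq_seen == toy_init_seq)
		return 0;
	memcpy(mm->slot, toy_init_slot, sizeof(mm->slot));
	mm->seq_seen = toy_init_seq;
	return 1;
}

/* Usage example: the copy happens at the first switch after a change. */
static void toy_lazy_demo(void)
{
	static int lower_table;	/* placeholder for a new lower-level table */
	struct toy_mm mm;

	memset(&mm, 0, sizeof(mm));
	mm.seq_seen = toy_init_seq;	/* start out up to date */

	toy_init_slot[1] = &lower_table;
	toy_sync_kernel_mappings();

	assert(mm.slot[1] == NULL);		/* stale until switched in */
	assert(toy_switch_to(&mm) == 1);	/* propagation happens here */
	assert(mm.slot[1] == &lower_table);
	assert(toy_switch_to(&mm) == 0);	/* nothing to do the second time */
}
```

The model only covers additions; it says nothing about how removals on the
vfree() side would be handled.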