Message-ID: <20251126204201.GF3538@ZenIV>
Date: Wed, 26 Nov 2025 20:42:01 +0000
From: Al Viro <viro@...iv.linux.org.uk>
To: Xie Yuanbin <xieyuanbin1@...wei.com>
Cc: brauner@...nel.org, jack@...e.cz, linux@...linux.org.uk,
will@...nel.org, nico@...xnic.net, akpm@...ux-foundation.org,
hch@....de, jack@...e.com, wozizhi@...weicloud.com,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org, linux-mm@...ck.org,
lilinjie8@...wei.com, liaohua4@...wei.com,
wangkefeng.wang@...wei.com, pangliyuan1@...wei.com
Subject: Re: [RFC PATCH] vfs: Fix might sleep in load_unaligned_zeropad()
with rcu read lock held

On Wed, Nov 26, 2025 at 06:19:52PM +0800, Xie Yuanbin wrote:
> On latest linux-next source, using arm32's multi_v7_defconfig, and
> setting CONFIG_PREEMPT=y, CONFIG_DEBUG_ATOMIC_SLEEP=y, CONFIG_KFENCE=y,
> CONFIG_ARM_PAN=n, then run the following testcase:
> ```c
> #include <assert.h>
> #include <errno.h>
> #include <pthread.h>
> #include <stdio.h>
> #include <string.h>
> #include <sys/mman.h>
> #include <sys/stat.h>
>
> static void *thread(void *arg)
> {
>         while (1) {
>                 void *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_ANON | MAP_PRIVATE, -1, 0);
>
>                 assert(p != (void *)-1);
>                 __asm__ volatile ("":"+r"(p)::"memory");
>
>                 munmap(p, 4096);
>         }
> }
>
> int main(void)
> {
>         pthread_t th;
>         int ret;
>         char path[4096] = "/tmp";
>
>         for (size_t i = 0; i < 2044; ++i) {
>                 strcat(path, "/x");
>                 ret = mkdir(path, 0755);
>                 assert(ret == 0 || errno == EEXIST);
>         }
>         strcat(path, "/xx");
>
>         assert(strlen(path) == 4095);
>
>         assert(pthread_create(&th, NULL, thread, NULL) == 0);
>
>         while (1) {
>                 FILE *fp = fopen(path, "wb+");
>
>                 assert(fp);
>                 fclose(fp);
>         }
>         return 0;
> }
> ```
> The might-sleep warning is triggered immediately.

"Immediately" part is interesting - presumably KFENCE is playing silly
buggers with PTEs in there.

Anyway, the underlying bug is that a fault in this scenario should not
even look at VMAs - it should get to fixup_exception() and be done
with that, with minimal overhead for all other causes of faults.

We have an unaligned 32-bit fetch from a kernel address, spanning a
page boundary, with the second page unmapped or unreadable. The access
comes from kernel mode. All we want is to fail the fault without
an oops, blocking, etc.

AFAICS, on arm32 a lookup for a VMA at an address > TASK_SIZE won't find
a damn thing anyway, so skipping those attempts and going straight to
bad_area looks safe enough, if we do that after all the early cases...