Message-ID: <CAEf4Bzboqf+1KUZCb2fBnLZUkzi5X4zOk+wy72eTu3VLB+z7RQ@mail.gmail.com>
Date: Mon, 17 Nov 2025 10:42:15 -0800
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Shaurya Rane <ssrane_b23@...vjti.ac.in>
Cc: Matthew Wilcox <willy@...radead.org>, akpm@...ux-foundation.org, shakeel.butt@...ux.dev,
eddyz87@...il.com, andrii@...nel.org, ast@...nel.org,
linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, linux-kernel-mentees@...ts.linux.dev,
skhan@...uxfoundation.org, david.hunter.linux@...il.com, khalid@...nel.org,
syzbot+09b7d050e4806540153d@...kaller.appspotmail.com,
bpf <bpf@...r.kernel.org>
Subject: Re: [PATCH] mm/filemap: fix NULL pointer dereference in do_read_cache_folio()
+ bpf@
On Mon, Nov 17, 2025 at 6:10 AM Shaurya Rane <ssrane_b23@...vjti.ac.in> wrote:
>
> On Sun, Nov 16, 2025 at 10:32:12PM +0000, Matthew Wilcox wrote:
> > First, some process things ;-)
> >
> > 1. Thank you for working on this. Andrii has been ignoring it since
> > August, which is bad. So thank you for picking it up.
It is bad, and I'm sorry for this. I was surprised to read it, though,
as I was not aware of any bug in the build ID parsing code, so I went
looking at the syzbot history of the issue. The August timeframe you
are referring to implies those "Monthly fs report" emails, which
unfortunately I didn't receive as I'm not subscribed to linux-fsdevel,
but I do see that there was an earlier report back in April, which I
apparently did get in my inbox. So I'm sorry again for dropping the ball.
Please feel free to ping me or the BPF mailing list next time you see
something not being addressed in a timely manner.
> >
> > 2. Sending a v2 while we're having a discussion is generally a bad idea.
> > It's fine to send a patch as a reply, but going as far as a v2 isn't
> > necessary. If conversation has died down, then a v2 is definitely
> > warranted, but you and I are still having a discussion ;-)
> >
> > 3. When you do send a v2 (or, now that you've sent a v2, send a v3),
> > do it as a new thread rather than in reply to the v1 thread. That plays
> > better with the tooling we have like b4 which will pull in all patches
> > in a thread.
> >
> Apologies for the process errors regarding the v2 submission. I appreciate the guidance on the workflow and threading; I will ensure the next version is sent as a clean, new thread once we have agreed on the technical solution.
> > With that over with, on to the fun technical stuff.
> >
> > On Sun, Nov 16, 2025 at 11:13:42AM +0530, SHAURYA RANE wrote:
> > > On Sat, Nov 15, 2025 at 2:14 AM Matthew Wilcox <willy@...radead.org> wrote:
> > > >
> > > > On Sat, Nov 15, 2025 at 01:07:29AM +0530, ssrane_b23@...vjti.ac.in wrote:
> > > > > When read_cache_folio() is called with a NULL filler function on a
> > > > > mapping that does not implement read_folio, a NULL pointer
> > > > > dereference occurs in filemap_read_folio().
> > > > >
> > > > > The crash occurs when:
> > > > >
> > > > > build_id_parse() is called on a VMA backed by a file from a
> > > > > filesystem that does not implement ->read_folio() (e.g. procfs,
> > > > > sysfs, or other virtual filesystems).
> > > >
> > > > Not a fan of this approach, to be honest. This should be caught at
> > > > a higher level. In __build_id_parse(), there's already a check:
> > > >
> > > >         /* only works for page backed storage */
> > > >         if (!vma->vm_file)
> > > >                 return -EINVAL;
> > > >
> > > > which is funny because the comment is correct, but the code is not.
> > > > I suspect the right answer is to add right after it:
> > > >
> > > > +       if (vma->vm_file->f_mapping->a_ops == &empty_aops)
> > > > +               return -EINVAL;
> > > >
> > > > Want to test that out?
> > > Thanks for the suggestion.
> > > Checking for
> > > a_ops == &empty_aops
> > > is not enough. Certain filesystems for example XFS with DAX use
> > > their own a_ops table (not empty_aops) but still do not implement
> > > ->read_folio(). In those cases read_cache_folio() still ends up with
> > > filler = NULL and filemap_read_folio(NULL) crashes.
> >
> > Ah, right. I had assumed that the only problem was synthetic
> > filesystems like sysfs and procfs which can't have buildids because
> > buildids only exist in executables. And neither procfs nor sysfs
> > contain executables.
> >
> > But DAX is different. You can absolutely put executables on a DAX
> > filesystem. So we shouldn't filter out DAX here. And we definitely
> > shouldn't *silently* fail for DAX. Otherwise nobody will ever realise
> > that the buildid people just couldn't be bothered to make DAX work.
> >
> > I don't think it's necessarily all that hard to make buildid work
> > for DAX. It's probably something like:
> >
> >         if (IS_DAX(file_inode(file)))
> >                 kernel_read(file, buf, count, &pos);
> >
> > but that's just off the top of my head.
> >
> >
> I agree that DAX needs proper support rather than silent filtering.
> However, investigating the actual syzbot reproducer revealed that the issue extends beyond just DAX. The crash is actually triggering on tmpfs (shmem). I verified via debug logging that the crashing VMA is backed by `shmem_aops`. Looking at `mm/shmem.c`, tmpfs legitimately lacks a `.read_folio` implementation by design.
> It seems there are several "real" filesystems that can contain executables/libraries but lack `.read_folio`:
> 1. tmpfs/shmem
> 2. OverlayFS (delegates I/O)
> 3. DAX filesystems
> Given that this affects multiple filesystem types, handling them all correctly via `kernel_read` might be a larger scope than fixing the immediate crash. I worry about missing edge cases in tmpfs or OverlayFS if we try to implement the fallback immediately in this patch.
> > I really don't want the check for filler being NULL in read_cache_folio().
> > I want it to crash noisily if callers are doing something stupid.
> I propose the following approach for v3. It avoids the silent failure you are concerned about, but prevents the kernel panic:
>
> 1. Silent reject for `empty_aops` (procfs/sysfs), as they legitimately can't contain build IDs.
> 2. Loud warning + Error for other cases (DAX, tmpfs, OverlayFS).
>
Tbh, it seems a bit fragile to have to hard-code such file
system-specific logic in higher-level build ID fetching logic, where
all we really ask of the filemap_get_folio() + read_cache_folio()
combo is to give us the requested piece of the file or let us know
(without crashing) that this was not possible.
But if there is no way to abstract this away, then I think what Shaurya
proposed, failing known-not-supported cases and warning on unexpected
ones, would be a reasonable solution, I suppose. I see that Matthew is
discussing generalizing kernel_read, so maybe that will be a better
solution, let's see.
> The code would look like this:
>
>         /* pseudo-filesystems */
>         if (vma->vm_file->f_mapping->a_ops == &empty_aops)
>                 return -EINVAL;
>
>         /* Real filesystems missing read_folio (DAX, tmpfs, OverlayFS, etc.) */
>         if (!vma->vm_file->f_mapping->a_ops->read_folio) {
>                 /*
>                  * TODO: Implement kernel_read() fallback for DAX/tmpfs.
>                  * For now, fail loudly so we know what we are missing.
>                  */
>                 pr_warn_once("build_id_parse: filesystem %s lacks read_folio support\n",
>                              vma->vm_file->f_path.dentry->d_sb->s_type->name);
>                 return -EOPNOTSUPP;
>         }
>
> This highlights exactly which filesystems are missing support in the logs without crashing the machine.
> Thanks,
> Shaurya
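
For anyone following along, the two-step check proposed above can be
modelled in plain userspace C to see which return code each kind of
mapping would get. This is only a sketch under stated assumptions: the
struct definitions, the fs_name field, the warned_once flag,
build_id_check_mapping() and the shmem_like_aops / disk_like_aops tables
are hypothetical stand-ins invented for illustration, not kernel code,
and fprintf() takes the place of pr_warn_once().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; NOT the real kernel definitions. */
struct file;
struct folio;

struct address_space_operations {
	int (*read_folio)(struct file *file, struct folio *folio);
};

struct address_space {
	const struct address_space_operations *a_ops;
	const char *fs_name;	/* stand-in for d_sb->s_type->name */
};

/* Models the kernel's shared "no operations" table. */
static const struct address_space_operations empty_aops = { .read_folio = NULL };

/* Dummy reader so one table can model a mapping that has ->read_folio(). */
static int dummy_read_folio(struct file *file, struct folio *folio)
{
	(void)file;
	(void)folio;
	return 0;
}

/* Own a_ops table but no ->read_folio(), like the tmpfs case in the thread. */
static const struct address_space_operations shmem_like_aops = { .read_folio = NULL };
/* A mapping that does implement ->read_folio(). */
static const struct address_space_operations disk_like_aops = { .read_folio = dummy_read_folio };

static int warned_once;	/* crude model of pr_warn_once() */

/*
 * Returns 0 if the page-cache read path can be used, -EINVAL for
 * pseudo-filesystems, -EOPNOTSUPP (with a one-time warning) otherwise.
 */
static int build_id_check_mapping(const struct address_space *mapping)
{
	assert(mapping != NULL && mapping->a_ops != NULL);

	/* pseudo-filesystems: reject silently */
	if (mapping->a_ops == &empty_aops)
		return -EINVAL;

	/* real filesystems without ->read_folio(): fail loudly, once */
	if (!mapping->a_ops->read_folio) {
		if (!warned_once) {
			warned_once = 1;
			fprintf(stderr,
				"build_id_parse: filesystem %s lacks read_folio support\n",
				mapping->fs_name);
		}
		return -EOPNOTSUPP;
	}

	return 0;
}
```

The order of the two tests matters in this model: empty_aops also has a
NULL read_folio, so checking it first is what keeps the procfs/sysfs
case silent while the other cases still warn.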