lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aRsqwndQ459VN8I9@ranegod-HP-ENVY-x360-Convertible-13-bd0xxx>
Date: Mon, 17 Nov 2025 19:40:45 +0530
From: Shaurya Rane <ssrane_b23@...vjti.ac.in>
To: Matthew Wilcox <willy@...radead.org>
Cc: akpm@...ux-foundation.org, shakeel.butt@...ux.dev, eddyz87@...il.com,
	andrii@...nel.org, ast@...nel.org, linux-fsdevel@...r.kernel.org,
	linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	linux-kernel-mentees@...ts.linux.dev, skhan@...uxfoundation.org,
	david.hunter.linux@...il.com, khalid@...nel.org,
	syzbot+09b7d050e4806540153d@...kaller.appspotmail.com
Subject: Re: [PATCH] mm/filemap: fix NULL pointer dereference in
 do_read_cache_folio()

On Sun, Nov 16, 2025 at 10:32:12PM +0000, Matthew Wilcox wrote:
> First, some process things ;-)
> 
> 1. Thank you for working on this.  Andrii has been ignoring it since
> August, which is bad.  So thank you for picking it up.
> 
> 2. Sending a v2 while we're having a discussion is generally a bad idea.
> It's fine to send a patch as a reply, but going as far as a v2 isn't
> necessary.  If conversation has died down, then a v2 is definitely
> warranted, but you and I are still having a discussion ;-)
> 
> 3. When you do send a v2 (or, now that you've sent a v2, send a v3),
> do it as a new thread rather then in reply to the v1 thread.  That plays
> better with the tooling we have like b4 which will pull in all patches
> in a thread.
>
Apologies for the process errors regarding the v2 submission. I appreciate the guidance on the workflow and threading; I will ensure the next version is sent as a clean, new thread once we have agreed on the technical solution.
> With that over with, on to the fun technical stuff.
> 
> On Sun, Nov 16, 2025 at 11:13:42AM +0530, SHAURYA RANE wrote:
> > On Sat, Nov 15, 2025 at 2:14 AM Matthew Wilcox <willy@...radead.org> wrote:
> > >
> > > On Sat, Nov 15, 2025 at 01:07:29AM +0530, ssrane_b23@...vjti.ac.in wrote:
> > > > When read_cache_folio() is called with a NULL filler function on a
> > > > mapping that does not implement read_folio, a NULL pointer
> > > > dereference occurs in filemap_read_folio().
> > > >
> > > > The crash occurs when:
> > > >
> > > > build_id_parse() is called on a VMA backed by a file from a
> > > > filesystem that does not implement ->read_folio() (e.g. procfs,
> > > > sysfs, or other virtual filesystems).
> > >
> > > Not a fan of this approach, to be honest.  This should be caught at
> > > a higher level.  In __build_id_parse(), there's already a check:
> > >
> > >         /* only works for page backed storage  */
> > >         if (!vma->vm_file)
> > >                 return -EINVAL;
> > >
> > > which is funny because the comment is correct, but the code is not.
> > > I suspect the right answer is to add right after it:
> > >
> > > +       if (vma->vm_file->f_mapping->a_ops == &empty_aops)
> > > +               return -EINVAL;
> > >
> > > Want to test that out?
> > Thanks for the suggestion.
> > Checking for
> >     a_ops == &empty_aops
> > is not enough. Certain filesystems for example XFS with DAX use
> > their own a_ops table (not empty_aops) but still do not implement
> > ->read_folio(). In those cases read_cache_folio() still ends up with
> > filler = NULL and filemap_read_folio(NULL) crashes.
> 
> Ah, right.  I had assumed that the only problem was synthetic
> filesystems like sysfs and procfs which can't have buildids because
> buildids only exist in executables.  And neither procfs nor sysfs
> contain executables.
> 
> But DAX is different.  You can absolutely put executables on a DAX
> filesystem.  So we shouldn't filter out DAX here.  And we definitely
> shouldn't *silently* fail for DAX.  Otherwise nobody will ever realise
> that the buildid people just couldn't be bothered to make DAX work.
> 
> I don't think it's necessarily all that hard to make buildid work
> for DAX.  It's probably something like:
> 
> 	if (IS_DAX(file_inode(file)))
> 		kernel_read(file, buf, count, &pos);
> 
> but that's just off the top of my head.
> 
> 
I agree that DAX needs proper support rather than silent filtering.
However, investigating the actual syzbot reproducer revealed that the issue extends beyond just DAX. The crash is actually triggering on tmpfs (shmem).I verified via debug logging that the crashing VMA is backed by `shmem_aops`. Looking at `mm/shmem.c`, tmpfs legitimately lacks a `.read_folio` implementation by design.
It seems there are several "real" filesystems that can contain executables/libraries but lack `.read_folio`:
1. tmpfs/shmem
2. OverlayFS (delegates I/O)
3. DAX filesystems
Given that this affects multiple filesystem types, handling them all correctly via `kernel_read` might be a larger scope than fixing the immediate crash. I worry about missing edge cases in tmpfs or OverlayFS if we try to implement the fallback immediately in this patch.
> I really don't want the check for filler being NULL in read_cache_folio().
> I want it to crash noisily if callers are doing something stupid.
I propose the following approach for v3. It avoids the silent failure you are concerned about, but prevents the kernel panic:

1. Silent reject for `empty_aops` (procfs/sysfs), as they legitimately can't contain build IDs.
2. Loud warning + Error for other cases (DAX, tmpfs, OverlayFS).

The code would look like this:

    /* pseudo-filesystems */
    if (vma->vm_file->f_mapping->a_ops == &empty_aops)
        return -EINVAL;

    /* Real filesystems missing read_folio (DAX, tmpfs, OverlayFS, etc.) */
    if (!vma->vm_file->f_mapping->a_ops->read_folio) {
        /*
         * TODO: Implement kernel_read() fallback for DAX/tmpfs.
         * For now, fail loudly so we know what we are missing.
         */
        pr_warn_once("build_id_parse: filesystem %s lacks read_folio support\n",
                     vma->vm_file->f_path.dentry->d_sb->s_type->name);
        return -EOPNOTSUPP;
    }

This highlights exactly which filesystems are missing support in the logs without crashing the machine
Thanks,
Shaurya

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ