linux-ext4 - Re: [REPORT] kernel BUG at fs/ext4/inode.c:2620

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20220224014842.GM59715@dread.disaster.area>
Date:   Thu, 24 Feb 2022 12:48:42 +1100
From:   Dave Chinner <david@...morbit.com>
To:     Theodore Ts'o <tytso@....edu>
Cc:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        John Hubbard <jhubbard@...dia.com>,
        Lee Jones <lee.jones@...aro.org>, linux-ext4@...r.kernel.org,
        Christoph Hellwig <hch@....de>,
        Dave Chinner <dchinner@...hat.com>,
        Goldwyn Rodrigues <rgoldwyn@...e.com>,
        "Darrick J . Wong" <darrick.wong@...cle.com>,
        Bob Peterson <rpeterso@...hat.com>,
        Damien Le Moal <damien.lemoal@....com>,
        Andreas Gruenbacher <agruenba@...hat.com>,
        Ritesh Harjani <riteshh@...ux.ibm.com>,
        Johannes Thumshirn <jth@...nel.org>, linux-xfs@...r.kernel.org,
        linux-fsdevel@...r.kernel.org, cluster-devel@...hat.com,
        linux-kernel@...r.kernel.org
Subject: Re: [REPORT] kernel BUG at fs/ext4/inode.c:2620 - page_buffers()

On Wed, Feb 23, 2022 at 06:35:54PM -0500, Theodore Ts'o wrote:
> On Fri, Feb 18, 2022 at 08:51:54AM +0100, Greg Kroah-Hartman wrote:
> > > The challenge is that fixing this "the right away" is probably not
> > > something we can backport into an LTS kernel, whether it's 5.15 or
> > > 5.10... or 4.19.
> > 
> > Don't worry about stable backports to start with.  Do it the "right way"
> > first and then we can consider if it needs to be backported or not.
> 
> Fair enough; on the other hand, we could also view this as making ext4
> more robust against buggy code in other subsystems, and while other
> file systems may be losing user data if they are actually trying to do
> remote memory access to file-backed memory, apparently other file
> systems aren't noticing and so they're not crashing.

Oh, we've noticed them, no question about that.  We've got bug
reports going back years for systems being crashed, triggering BUGs
and/or corrupting data on both XFS and ext4 filesystems due to users
trying to run RDMA applications with file backed pages.

Most of the people doing this now know that we won't support such
applications until the RDMA stack/hardware can trigger on-demand
write page faults the same way CPUs do when they first write to a
clean page. They don't have this, so mostly these people don't
bother reporting these class of problems to us anymore.  The
gup/RDMA infrastructure to make this all work is slowly moving
forwards, but it's not here yet.

> Issuing a
> warning and then not crashing is arguably a better way for ext4 to
> react, especially if there are other parts of the kernel that are
> randomly calling set_page_dirty() on file-backed memory without
> properly first informing the file system in a context where it can
> block and potentially do I/O to do things like allocate blocks.

I'm not sure that replacing the BUG() with a warning is good enough
- it's still indicative of an application doing something dangerous
that could result in silent data corruption and/or other problems.

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com