linux-kernel - Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ZumDPU7RDg5wV0Re@casper.infradead.org>
Date: Tue, 17 Sep 2024 14:25:17 +0100
From: Matthew Wilcox <willy@...radead.org>
To: Chris Mason <clm@...a.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
	Dave Chinner <david@...morbit.com>, Jens Axboe <axboe@...nel.dk>,
	Christian Theune <ct@...ingcircus.io>, linux-mm@...ck.org,
	"linux-xfs@...r.kernel.org" <linux-xfs@...r.kernel.org>,
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
	Daniel Dao <dqminh@...udflare.com>, regressions@...ts.linux.dev,
	regressions@...mhuis.info
Subject: Re: Known and unfixed active data loss bug in MM + XFS with large
 folios since Dec 2021 (any kernel from 6.1 upwards)

On Tue, Sep 17, 2024 at 01:13:05PM +0200, Chris Mason wrote:
> On 9/17/24 5:32 AM, Matthew Wilcox wrote:
> > On Mon, Sep 16, 2024 at 10:47:10AM +0200, Chris Mason wrote:
> >> I've got a bunch of assertions around incorrect folio->mapping and I'm
> >> trying to bash on the ENOMEM for readahead case.  There's a GFP_NOWARN
> >> on those, and our systems do run pretty short on ram, so it feels right
> >> at least.  We'll see.
> > 
> > I've been running with some variant of this patch the whole way across
> > the Atlantic, and not hit any problems.  But maybe with the right
> > workload ...?
> > 
> > There are two things being tested here.  One is whether we have a
> > cross-linked node (ie a node that's in two trees at the same time).
> > The other is whether the slab allocator is giving us a node that already
> > contains non-NULL entries.
> > 
> > If you could throw this on top of your kernel, we might stand a chance
> > of catching the problem sooner.  If it is one of these problems and not
> > something weirder.
> > 
> 
> This fires in roughly 10 seconds for me on top of v6.11.  Since array seems
> to always be 1, I'm not sure if the assertion is right, but hopefully you
> can trigger yourself.

Whoops.

$ git grep XA_RCU_FREE
lib/xarray.c:#define XA_RCU_FREE        ((struct xarray *)1)
lib/xarray.c:   node->array = XA_RCU_FREE;

so you walked into a node which is currently being freed by RCU.  Which
isn't a problem, of course.  I don't know why I do that; it doesn't seem
like anyone tests it.  The jetlag is seriously kicking in right now,
so I'm going to refrain from saying anything more because it probably
won't be coherent.