Message-ID: <20241018004427.GC3204734@mit.edu>
Date: Thu, 17 Oct 2024 20:44:27 -0400
From: "Theodore Ts'o" <tytso@....edu>
To: "Russell King (Oracle)" <linux@...linux.org.uk>
Cc: Andreas Dilger <adilger.kernel@...ger.ca>, linux-ext4@...r.kernel.org
Subject: Re: BUG: 6.10: ext4 mpage_process_page_bufs() BUG_ON triggers
On Fri, Oct 11, 2024 at 06:11:17PM +0100, Russell King (Oracle) wrote:
> I'm about to throw 6.11 on at least some of these VMs (including the
> two that failed) so the on-disk filesystem is going to be e2fscked
> shortly. As I said above, I don't think this is an ext4 issue, but
> something corrupting ext4 in-memory data structures.
I agree; I wonder if something would show up if KASAN were enabled.
If it is some array overrun or some other wild pointer problem in some
random subsystem that is enabled on your system (but not mine, since I
use a very restricted kernel config to speed up development builds),
maybe it will turn up something suspicious?
> It could be
> related to the VM having a relatively small amount of memory compared
> to modern standards (maybe adding MM pressure to tickle a bug in
> there).
Yeah, it would be interesting to add a test to xfstests which runs
fsstress and fio under memory pressure (where the memory shortage comes
from either memcg constraints or limited global memory availability).
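
As a rough illustration of the memcg variant only (not an actual xfstests
test), the sketch below caps a cgroup v2 group with memory.max and starts
fsstress inside it. The helper names, the group name, the 256 MiB limit and
the /mnt/test path are all assumptions made for the example; it also assumes
fsstress is on PATH, the caller is root, and the memory controller is enabled
for the parent cgroup. fio could be wrapped the same way.

```python
import os
import subprocess
from pathlib import Path


def make_memcg(cgroup_root, name, limit_bytes):
    """Create a cgroup v2 group under cgroup_root and cap it via memory.max.

    On a real system cgroup_root would typically be /sys/fs/cgroup, and the
    memory controller must be enabled in the parent's cgroup.subtree_control.
    """
    cg = Path(cgroup_root) / name
    cg.mkdir(parents=True, exist_ok=True)
    (cg / "memory.max").write_text(f"{limit_bytes}\n")
    return cg


def fsstress_cmd(test_dir, nops=10000, nprocs=4):
    """Build an fsstress command line: -d directory, -n ops, -p processes."""
    return ["fsstress", "-d", str(test_dir), "-n", str(nops), "-p", str(nprocs)]


def run_in_memcg(cg, cmd):
    """Run cmd with the child moved into the cgroup before it execs."""
    def enter_cgroup():
        # Runs in the child between fork and exec: join the cgroup.
        (Path(cg) / "cgroup.procs").write_text(str(os.getpid()))

    return subprocess.run(cmd, preexec_fn=enter_cgroup).returncode


if __name__ == "__main__":
    # Hypothetical values: 256 MiB limit, ext4 scratch fs mounted at /mnt/test.
    group = make_memcg("/sys/fs/cgroup", "ext4-stress", 256 * 1024 * 1024)
    raise SystemExit(run_in_memcg(group, fsstress_cmd("/mnt/test")))
```

The global-memory variant would need a different approach, such as booting
the test VM with less RAM, rather than a per-group limit.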
> Or maybe we have another case of a tail-call optimisation
> gone wrong that corrupts a pointer causing ext4 in-memory data to
> be scribbled over.
Maybe; I assume you've tried to see if it reproduces with different
compilers / compiler versions?
> I'm grasping at straws at the moment though...
Ditto. :-)
- Ted