Message-ID: <Z-7BengoC1j6WQBE@casper.infradead.org>
Date: Thu, 3 Apr 2025 18:12:26 +0100
From: Matthew Wilcox <willy@...radead.org>
To: Matt Fleming <matt@...dmodwrite.com>
Cc: adilger.kernel@...ger.ca, akpm@...ux-foundation.org,
linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
luka.2016.cs@...il.com, tytso@....edu,
Barry Song <baohua@...nel.org>, kernel-team@...udflare.com,
Vlastimil Babka <vbabka@...e.cz>,
Miklos Szeredi <miklos@...redi.hu>,
Amir Goldstein <amir73il@...il.com>,
Dave Chinner <david@...morbit.com>,
Qi Zheng <zhengqi.arch@...edance.com>,
Roman Gushchin <roman.gushchin@...ux.dev>,
Muchun Song <muchun.song@...ux.dev>
Subject: Re: Potential Linux Crash: WARNING in ext4_dirty_folio in Linux
kernel v6.13-rc5
On Thu, Apr 03, 2025 at 01:29:44PM +0100, Matt Fleming wrote:
> On Wed, Mar 26, 2025 at 10:59 AM Matt Fleming <matt@...dmodwrite.com> wrote:
> >
> > Hi there,
> >
> > I'm also seeing this PF_MEMALLOC WARN triggered from kswapd in 6.12.19.
> >
> > Does overlayfs need some kind of background inode reclaim support?
>
> Hey everyone, I know there was some off-list discussion last week at
> LSFMM, but I don't think a definite solution has been proposed for the
> below stacktrace.
Hi Matt,
We did have a substantial discussion at LSFMM and we just had another
discussion on the ext4 call. I'm going to try to summarise those
discussions here, and people can jump in to correct me (I'm not really
an expert on this part of MM-FS interaction).
At LSFMM, we came up with a solution that doesn't work, so let's start
with ideas that don't work:
- Allow PF_MEMALLOC to dip into the atomic reserves. With large block
devices, we might end up doing emergency high-order allocations, and
that makes everybody nervous.
- Only allow inode reclaim from kswapd and not from direct reclaim.
Your stack trace here is from kswapd, so obviously that doesn't work.
- Allow ->evict_inode to return an error. At this point the inode has
been taken off the lists which means that somebody else may have
started constructing it again, and we can't just put it back
on the lists.
Jan explained that _usually_ the reclaim path is not the last holder of
a reference to the inode. What's happening here is that we've lost a
race where the dentry is being turned negative by somebody else at the
same time, and usually they'd have the last reference and call evict.
But if the shrinker has the last reference, it has to do the eviction.
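To make the race Jan described concrete, here is a minimal user-space
sketch; it is not kernel code. The names toy_inode, toy_iput, CTX_UNLINK
and CTX_SHRINKER are hypothetical, and it is single-threaded with a plain
int refcount, so it only models the ordering of the two reference drops:
whichever context drops the last reference is the one that has to evict.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical contexts that may hold a reference to the inode. */
enum { CTX_UNLINK = 0, CTX_SHRINKER = 1 };

/* Toy stand-in for an inode; NOT the kernel's struct inode. */
struct toy_inode {
	int refcount;    /* number of outstanding references */
	bool evicted;    /* set once eviction has run */
	int evicted_by;  /* context that ran eviction, -1 if none yet */
};

/*
 * Drop one reference on behalf of ctx. The caller that drops the last
 * reference has to do the eviction itself. Returns true if this call
 * performed the eviction.
 */
static bool toy_iput(struct toy_inode *inode, int ctx)
{
	assert(inode->refcount > 0);

	if (--inode->refcount > 0)
		return false;	/* somebody else still holds a reference */

	/* Last reference is gone: eviction runs in this context. */
	inode->evicted = true;
	inode->evicted_by = ctx;
	return true;
}
```

With two references held, the order of the two toy_iput() calls decides
who evicts. When the shrinker's call comes second it returns true, which
models the unusual case above where eviction ends up running in reclaim
context.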
Jan does not think that Overlayfs is a factor here. It may change the
timing somewhat but should not make the race wider (nor narrower).
Ideas still on the table:
- Convert all filesystems to use the XFS inode management scheme.
Nobody is thrilled by this large amount of work.
- Find a simpler version of the XFS scheme to implement for other
filesystems.