Message-ID: <20220627220706.GE227878@dread.disaster.area>
Date: Tue, 28 Jun 2022 08:07:06 +1000
From: Dave Chinner <david@...morbit.com>
To: "Darrick J. Wong" <djwong@...nel.org>
Cc: "Matthew Wilcox (Oracle)" <willy@...radead.org>,
linux-xfs@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, Christoph Hellwig <hch@....de>
Subject: Re: [PATCH v3 25/25] xfs: Support large folios
On Sun, Jun 26, 2022 at 09:15:27PM -0700, Darrick J. Wong wrote:
> On Wed, Jun 22, 2022 at 05:42:11PM -0700, Darrick J. Wong wrote:
> > [resend with shorter 522.out file to keep us under the 300k maximum]
> >
> > On Thu, Dec 16, 2021 at 09:07:15PM +0000, Matthew Wilcox (Oracle) wrote:
> > > Now that iomap has been converted, XFS is large folio safe.
> > > Indicate to the VFS that it can now create large folios for XFS.
> > >
> > > Signed-off-by: Matthew Wilcox (Oracle) <willy@...radead.org>
> > > Reviewed-by: Christoph Hellwig <hch@....de>
> > > Reviewed-by: Darrick J. Wong <djwong@...nel.org>
> > > ---
> > > fs/xfs/xfs_icache.c | 2 ++
> > > 1 file changed, 2 insertions(+)
> > >
> > > diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> > > index da4af2142a2b..cdc39f576ca1 100644
> > > --- a/fs/xfs/xfs_icache.c
> > > +++ b/fs/xfs/xfs_icache.c
> > > @@ -87,6 +87,7 @@ xfs_inode_alloc(
> > >  	/* VFS doesn't initialise i_mode or i_state! */
> > >  	VFS_I(ip)->i_mode = 0;
> > >  	VFS_I(ip)->i_state = 0;
> > > +	mapping_set_large_folios(VFS_I(ip)->i_mapping);
> > > 
> > >  	XFS_STATS_INC(mp, vn_active);
> > >  	ASSERT(atomic_read(&ip->i_pincount) == 0);
> > > @@ -320,6 +321,7 @@ xfs_reinit_inode(
> > >  	inode->i_rdev = dev;
> > >  	inode->i_uid = uid;
> > >  	inode->i_gid = gid;
> > > +	mapping_set_large_folios(inode->i_mapping);
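
[ Aside for anyone following along at home: the VFS side of this is
  just a per-mapping flag that the readahead and fault paths check
  before building anything bigger than a single page.  From memory
  the pagemap.h helpers look roughly like the below - treat it as a
  sketch of the mechanism, not a quote of the current tree: ]

static inline void mapping_set_large_folios(struct address_space *mapping)
{
	__set_bit(AS_LARGE_FOLIO_SUPPORT, &mapping->flags);
}

static inline bool mapping_large_folio_support(struct address_space *mapping)
{
	/* large folios are only built when THP infrastructure is enabled */
	return IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
		test_bit(AS_LARGE_FOLIO_SUPPORT, &mapping->flags);
}

[ A mapping that never calls mapping_set_large_folios() only ever sees
  order-0 folios, which is what makes dropping the call a clean A/B
  test for the problem below. ]
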
> >
> > Hmm. Ever since 5.19-rc1, I've noticed that fsx in generic/522 now
> > reports file corruption after 20 minutes of runtime. The corruption is
> > surprisingly reproducible (522.out.bad attached below) in that I ran it
> > three times and always got the same bad offset (0x6e000) and always the
> > same opcode (6213798(166 mod 256) MAPREAD).
> >
> > I turned off multipage folios and now 522 has run for over an hour
> > without problems, so before I go do more debugging, does this ring a
> > bell to anyone?
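
[ For context on what that MAPREAD op actually checks: fsx mmap()s the
  page-aligned range, pulls the bytes out through the mapping and
  compares them against its in-memory model of the file.  Paraphrased
  from memory below - the names are illustrative, not the real
  ltp/fsx.c code: ]

#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/mman.h>

/* Compare @size bytes at @offset, read via mmap(), against the model. */
static int
check_mapread(int fd, const char *good_buf, off_t offset, size_t size,
	      long page_size)
{
	off_t	pg_offset = offset & (page_size - 1);
	size_t	map_size = pg_offset + size;
	char	*p;
	int	ret = 0;

	p = mmap(NULL, map_size, PROT_READ, MAP_SHARED, fd,
		 offset - pg_offset);
	if (p == MAP_FAILED)
		return -1;
	if (memcmp(p + pg_offset, good_buf + offset, size)) {
		fprintf(stderr, "mapread mismatch at 0x%llx\n",
			(unsigned long long)offset);
		ret = 1;
	}
	munmap(p, map_size);
	return ret;
}

[ i.e. a repeatable mismatch at the same offset means the page cache
  is serving data that doesn't match what fsx believes it last wrote
  there. ]
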
>
> I tried bisecting, but that didn't yield anything productive and
> 5.19-rc4 still fails after 25 minutes; however, it seems that g/522 will
> run without problems for at least 3-4 days after reverting this patch
> from -rc3.
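
[ FWIW, the "turn it off" test is just dropping the two calls this
  patch added - i.e. something like the revert below against -rc3,
  hunk offsets elided: ]

--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ ... @@ xfs_inode_alloc(
 	/* VFS doesn't initialise i_mode or i_state! */
 	VFS_I(ip)->i_mode = 0;
 	VFS_I(ip)->i_state = 0;
-	mapping_set_large_folios(VFS_I(ip)->i_mapping);
@@ ... @@ xfs_reinit_inode(
 	inode->i_rdev = dev;
 	inode->i_uid = uid;
 	inode->i_gid = gid;
-	mapping_set_large_folios(inode->i_mapping);
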
Took 63 million ops and just over 3 hours before it failed here with
a similar 16-byte map read corruption on the first 16 bytes of a
page. Given the mix of operations that led up to the failure - 14 of
the last 23 were fallocate, plus 3 clone, 2 copy, 2 map read, 1 skip,
and the map write that the stale data appears to have come from -
this smells of an invalidation issue...
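
For reference, the flush+invalidate those fallocate ops do over the
range they're about to modify is xfs_flush_unmap_range() - from
memory it looks roughly like the sketch below (round the range out to
the larger of the block size and PAGE_SIZE, write it back, then toss
the page cache over it):

int
xfs_flush_unmap_range(
	struct xfs_inode	*ip,
	xfs_off_t		offset,
	xfs_off_t		len)
{
	struct xfs_mount	*mp = ip->i_mount;
	struct inode		*inode = VFS_I(ip);
	xfs_off_t		rounding, start, end;
	int			error;

	/* round the range out to block/page granularity */
	rounding = max_t(xfs_off_t, mp->m_sb.sb_blocksize, PAGE_SIZE);
	start = round_down(offset, rounding);
	end = round_up(offset + len, rounding) - 1;

	/* write back dirty data over the range, then drop the page cache */
	error = filemap_write_and_wait_range(inode->i_mapping, start, end);
	if (error)
		return error;
	truncate_pagecache_range(inode, start, end);
	return 0;
}

With large folios enabled, a single folio can extend past those
rounded-out boundaries, so anything that only partially invalidates
such a folio - or races with one being instantiated - would be the
sort of thing that leaves stale data behind for a later map read to
trip over. That's the hunch, anyway; no proof yet.
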
Cheers,
Dave.
--
Dave Chinner
david@...morbit.com