Message-ID: <CAPcyv4ihuErfVWHL0F1OExQashutJjBdaLn5X5oPm44OkQ+a_A@mail.gmail.com>
Date: Thu, 17 Jun 2021 00:04:14 -0700
From: Dan Williams <dan.j.williams@...el.com>
To: "ruansy.fnst@...itsu.com" <ruansy.fnst@...itsu.com>
Cc: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-xfs <linux-xfs@...r.kernel.org>,
Linux MM <linux-mm@...ck.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
device-mapper development <dm-devel@...hat.com>,
"Darrick J. Wong" <darrick.wong@...cle.com>,
david <david@...morbit.com>, Christoph Hellwig <hch@....de>,
Alasdair Kergon <agk@...hat.com>,
Mike Snitzer <snitzer@...hat.com>,
Goldwyn Rodrigues <rgoldwyn@...e.de>,
Linux NVDIMM <nvdimm@...ts.linux.dev>
Subject: Re: [PATCH v4 03/10] fs: Introduce ->corrupted_range() for superblock
On Wed, Jun 16, 2021 at 11:51 PM ruansy.fnst@...itsu.com
<ruansy.fnst@...itsu.com> wrote:
>
> > -----Original Message-----
> > From: Dan Williams <dan.j.williams@...el.com>
> > Subject: Re: [PATCH v4 03/10] fs: Introduce ->corrupted_range() for superblock
> >
> > [ drop old linux-nvdimm@...ts.01.org, add nvdimm@...ts.linux.dev ]
> >
> > On Thu, Jun 3, 2021 at 6:19 PM Shiyang Ruan <ruansy.fnst@...itsu.com> wrote:
> > >
> > > > Memory failures that occur in fsdax mode are ultimately handled in
> > > > the filesystem. We introduce this interface to find the files or
> > > > metadata affected by the corrupted range, and to try to recover the
> > > > corrupted data if possible.
> > >
> > > Signed-off-by: Shiyang Ruan <ruansy.fnst@...itsu.com>
> > > ---
> > > include/linux/fs.h | 2 ++
> > > 1 file changed, 2 insertions(+)
> > >
> > > > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > > > index c3c88fdb9b2a..92af36c4225f 100644
> > > --- a/include/linux/fs.h
> > > +++ b/include/linux/fs.h
> > > @@ -2176,6 +2176,8 @@ struct super_operations {
> > > struct shrink_control *);
> > > long (*free_cached_objects)(struct super_block *,
> > > struct shrink_control *);
> > > + int (*corrupted_range)(struct super_block *sb, struct block_device
> > *bdev,
> > > + loff_t offset, size_t len, void *data);
> >
> > Why does the superblock need a new operation? Wouldn't whatever function is
> > specified here just be specified to the dax_dev as the
> > ->notify_failure() holder callback?
>
> Because we need to find out which file is affected by the given poisoned
> page so that the memory-failure code can do its collect_procs() and
> kill_procs() jobs. That requires the filesystem to use its rmap feature to
> look up the file from a given offset. So we need this to be implemented by
> the specific filesystem and called by the dax_device's holder.
>
> This is the call trace I described in the cover letter:
> memory_failure()
>  * fsdax case
>    pgmap->ops->memory_failure() => pmem_pgmap_memory_failure()
>      dax_device->holder_ops->corrupted_range() =>
>          - fs_dax_corrupted_range()
>          - md_dax_corrupted_range()
>        sb->s_ops->corrupted_range() => xfs_fs_corrupted_range() <== **HERE**
>          xfs_rmap_query_range()
>            xfs_currupt_helper()
>             * corrupted on metadata
>                 try to recover data, call xfs_force_shutdown()
>             * corrupted on file data
>                 try to recover data, call mf_dax_kill_procs()
>  * normal case
>    mf_generic_kill_procs()
>
> As you can see, this newly added operation is an important part of the whole process.

I don't think you need either fs_dax_corrupted_range() or
sb->s_ops->corrupted_range(). In fact, fs_dax_corrupted_range()
looks broken because the filesystem may not even be mounted on the
device associated with the error. The holder_data and holder_ops should
be sufficient for communicating the stack of notifications:

    pgmap->notify_memory_failure() => pmem_pgmap_notify_failure()
      pmem_dax_dev->holder_ops->notify_failure(pmem_dax_dev) =>
          md_dax_notify_failure()
        md_dax_dev->holder_ops->notify_failure() => xfs_notify_failure()

I.e. the entire chain just walks dax_dev holder ops.
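
To make the shape of that walk concrete, here is a rough user-space model in C. It is only a sketch under stated assumptions, not kernel code: the struct layouts, `struct fake_md`, `struct fake_fs`, `dax_holder_notify_failure()`, `demo_notify_chain()`, the `-1` return for an unheld device, and the offset translation in the md layer are all made up for illustration. Only the callback names follow the chain above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; not the kernel's struct dax_device. */
struct dax_device;

struct dax_holder_operations {
	int (*notify_failure)(struct dax_device *dax_dev,
			      long long offset, size_t len);
};

struct dax_device {
	void *holder_data;	/* private data of whoever claimed the device */
	const struct dax_holder_operations *holder_ops;
};

/* Pass a failure to the holder of dax_dev, if there is one. */
static int dax_holder_notify_failure(struct dax_device *dax_dev,
				     long long offset, size_t len)
{
	if (!dax_dev->holder_ops || !dax_dev->holder_ops->notify_failure)
		return -1;	/* nobody holds this device */
	return dax_dev->holder_ops->notify_failure(dax_dev, offset, len);
}

/* Stand-in for a filesystem: end of the chain, records what it was told. */
struct fake_fs {
	int notified;
	long long offset;
	size_t len;
};

static int xfs_notify_failure(struct dax_device *dax_dev,
			      long long offset, size_t len)
{
	struct fake_fs *fs = dax_dev->holder_data;

	assert(fs != NULL);
	fs->notified++;
	fs->offset = offset;
	fs->len = len;
	return 0;
}

/* Stand-in for md: holds the pmem dax_dev, exports its own dax_dev. */
struct fake_md {
	struct dax_device *md_dax_dev;
	long long start;	/* assumed: where the md device begins on pmem */
};

static int md_dax_notify_failure(struct dax_device *dax_dev,
				 long long offset, size_t len)
{
	struct fake_md *md = dax_dev->holder_data;

	/* Translate to the md device's address space, then keep walking. */
	return dax_holder_notify_failure(md->md_dax_dev,
					 offset - md->start, len);
}

static const struct dax_holder_operations xfs_holder_ops = {
	.notify_failure = xfs_notify_failure,
};

static const struct dax_holder_operations md_holder_ops = {
	.notify_failure = md_dax_notify_failure,
};

/* Build pmem -> md -> fs and inject a failure at the bottom. */
static int demo_notify_chain(long long pmem_offset, size_t len,
			     struct fake_fs *fs)
{
	struct dax_device md_dax_dev = { fs, &xfs_holder_ops };
	struct fake_md md = { &md_dax_dev, 4096 };
	struct dax_device pmem_dax_dev = { &md, &md_holder_ops };

	return dax_holder_notify_failure(&pmem_dax_dev, pmem_offset, len);
}
```

In this model no layer needs to know what sits above it beyond its own holder: the pmem level calls into whoever holds pmem_dax_dev, md forwards to whoever holds md_dax_dev, and a device with no holder simply reports that, so nothing has to assume a filesystem is mounted directly on the failing device.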