[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YbshxgbAnt7rupQG@suse.de>
Date: Thu, 16 Dec 2021 11:23:50 +0000
From: Luís Henriques <lhenriques@...e.de>
To: Lukas Czerner <lczerner@...hat.com>
Cc: Jan Kara <jack@...e.cz>, "Darrick J. Wong" <djwong@...nel.org>,
Theodore Ts'o <tytso@....edu>,
Andreas Dilger <adilger.kernel@...ger.ca>,
linux-ext4@...r.kernel.org, linux-kernel@...r.kernel.org,
Jeroen van Wolffelaar <jeroen@...ffelaar.nl>
Subject: Re: [PATCH v2] ext4: set csum seed in tmp inode while migrating to
extents
On Wed, Dec 15, 2021 at 03:12:37PM +0100, Lukas Czerner wrote:
> On Wed, Dec 15, 2021 at 12:28:52PM +0100, Jan Kara wrote:
> > On Tue 14-12-21 16:49:45, Darrick J. Wong wrote:
> > > On Tue, Dec 14, 2021 at 05:50:58PM +0000, Luís Henriques wrote:
> > > > When migrating to extents, the temporary inode will have it's own checksum
> > > > seed. This means that, when swapping the inodes data, the inode checksums
> > > > will be incorrect.
> > > >
> > > > This can be fixed by recalculating the extents checksums again. Or simply
> > > > by copying the seed into the temporary inode.
> > > >
> > > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=213357
> > > > Reported-by: Jeroen van Wolffelaar <jeroen@...ffelaar.nl>
> > > > Signed-off-by: Luís Henriques <lhenriques@...e.de>
> > > > ---
> > > > fs/ext4/migrate.c | 12 +++++++++++-
> > > > 1 file changed, 11 insertions(+), 1 deletion(-)
> > > >
> > > > changes since v1:
> > > >
> > > > * Dropped tmp_ei variable
> > > > * ->i_csum_seed is now initialised immediately after tmp_inode is created
> > > > * New comment about the seed initialization and stating that recovery
> > > > needs to be fixed.
> > > >
> > > > Cheers,
> > > > --
> > > > Luís
> > > >
> > > > diff --git a/fs/ext4/migrate.c b/fs/ext4/migrate.c
> > > > index 7e0b4f81c6c0..36dfc88ce05b 100644
> > > > --- a/fs/ext4/migrate.c
> > > > +++ b/fs/ext4/migrate.c
> > > > @@ -459,6 +459,17 @@ int ext4_ext_migrate(struct inode *inode)
> > > > ext4_journal_stop(handle);
> > > > goto out_unlock;
> > > > }
> > > > + /*
> > > > + * Use the correct seed for checksum (i.e. the seed from 'inode'). This
> > > > + * is so that the metadata blocks will have the correct checksum after
> > > > + * the migration.
> > > > + *
> > > > + * Note however that, if a crash occurs during the migration process,
> > > > + * the recovery process is broken because the tmp_inode checksums will
> > > > + * be wrong and the orphans cleanup will fail.
> > >
> > > ...and then what does the user do?
> >
> > Run fsck of course! And then recover from backups :) I know this is sad but
> > the situation is that our migration code just is not crash-safe (if we
> > crash we are going to free blocks that are still used by the migrated
> > inode) and Luis makes it work in case we do not crash (which should be
> > hopefully more common) and documents it does not work in case we crash.
> > So overall I'd call it a win.
> >
> > But maybe we should just remove this online-migration functionality
> > completely from the kernel? That would be also a fine solution for me. I
> > was thinking whether we could somehow make the inode migration crash-safe
> > but I didn't think of anything which would not require on-disk format
> > change...
>
> Since this is not something that anyone can honestly recommend doing
> without a prior backup and a word of warning I personaly would be in favor
> of removing it.
BTW, in case migration is kept in the kernel (even with the broken
recovery), I think it's worth turning this bug reproducer into an ext4
fstest. I was planning to do so, but I'd rather wait to see if the effort
is worthwhile (i.e. if migration is kept or not).
Cheers,
--
Luís
Powered by blists - more mailing lists