[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20211215141237.lrymhbebgjunh4n2@work>
Date: Wed, 15 Dec 2021 15:12:37 +0100
From: Lukas Czerner <lczerner@...hat.com>
To: Jan Kara <jack@...e.cz>
Cc: "Darrick J. Wong" <djwong@...nel.org>,
Luís Henriques <lhenriques@...e.de>,
Theodore Ts'o <tytso@....edu>,
Andreas Dilger <adilger.kernel@...ger.ca>,
linux-ext4@...r.kernel.org, linux-kernel@...r.kernel.org,
Jeroen van Wolffelaar <jeroen@...ffelaar.nl>
Subject: Re: [PATCH v2] ext4: set csum seed in tmp inode while migrating to
extents
On Wed, Dec 15, 2021 at 12:28:52PM +0100, Jan Kara wrote:
> On Tue 14-12-21 16:49:45, Darrick J. Wong wrote:
> > On Tue, Dec 14, 2021 at 05:50:58PM +0000, Luís Henriques wrote:
> > > When migrating to extents, the temporary inode will have it's own checksum
> > > seed. This means that, when swapping the inodes data, the inode checksums
> > > will be incorrect.
> > >
> > > This can be fixed by recalculating the extents checksums again. Or simply
> > > by copying the seed into the temporary inode.
> > >
> > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=213357
> > > Reported-by: Jeroen van Wolffelaar <jeroen@...ffelaar.nl>
> > > Signed-off-by: Luís Henriques <lhenriques@...e.de>
> > > ---
> > > fs/ext4/migrate.c | 12 +++++++++++-
> > > 1 file changed, 11 insertions(+), 1 deletion(-)
> > >
> > > changes since v1:
> > >
> > > * Dropped tmp_ei variable
> > > * ->i_csum_seed is now initialised immediately after tmp_inode is created
> > > * New comment about the seed initialization and stating that recovery
> > > needs to be fixed.
> > >
> > > Cheers,
> > > --
> > > Luís
> > >
> > > diff --git a/fs/ext4/migrate.c b/fs/ext4/migrate.c
> > > index 7e0b4f81c6c0..36dfc88ce05b 100644
> > > --- a/fs/ext4/migrate.c
> > > +++ b/fs/ext4/migrate.c
> > > @@ -459,6 +459,17 @@ int ext4_ext_migrate(struct inode *inode)
> > > ext4_journal_stop(handle);
> > > goto out_unlock;
> > > }
> > > + /*
> > > + * Use the correct seed for checksum (i.e. the seed from 'inode'). This
> > > + * is so that the metadata blocks will have the correct checksum after
> > > + * the migration.
> > > + *
> > > + * Note however that, if a crash occurs during the migration process,
> > > + * the recovery process is broken because the tmp_inode checksums will
> > > + * be wrong and the orphans cleanup will fail.
> >
> > ...and then what does the user do?
>
> Run fsck of course! And then recover from backups :) I know this is sad but
> the situation is that our migration code just is not crash-safe (if we
> crash we are going to free blocks that are still used by the migrated
> inode) and Luis makes it work in case we do not crash (which should be
> hopefully more common) and documents it does not work in case we crash.
> So overall I'd call it a win.
>
> But maybe we should just remove this online-migration functionality
> completely from the kernel? That would be also a fine solution for me. I
> was thinking whether we could somehow make the inode migration crash-safe
> but I didn't think of anything which would not require on-disk format
> change...
Since this is not something that anyone can honestly recommend doing
without a prior backup and a word of warning I personaly would be in favor
of removing it.
-Lukas
>
> Honza
> --
> Jan Kara <jack@...e.com>
> SUSE Labs, CR
>
Powered by blists - more mailing lists