[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20211215112852.GM14044@quack2.suse.cz>
Date: Wed, 15 Dec 2021 12:28:52 +0100
From: Jan Kara <jack@...e.cz>
To: "Darrick J. Wong" <djwong@...nel.org>
Cc: Luís Henriques <lhenriques@...e.de>,
Theodore Ts'o <tytso@....edu>,
Andreas Dilger <adilger.kernel@...ger.ca>,
Jan Kara <jack@...e.cz>, linux-ext4@...r.kernel.org,
linux-kernel@...r.kernel.org,
Jeroen van Wolffelaar <jeroen@...ffelaar.nl>
Subject: Re: [PATCH v2] ext4: set csum seed in tmp inode while migrating to
extents
On Tue 14-12-21 16:49:45, Darrick J. Wong wrote:
> On Tue, Dec 14, 2021 at 05:50:58PM +0000, Luís Henriques wrote:
> > When migrating to extents, the temporary inode will have it's own checksum
> > seed. This means that, when swapping the inodes data, the inode checksums
> > will be incorrect.
> >
> > This can be fixed by recalculating the extents checksums again. Or simply
> > by copying the seed into the temporary inode.
> >
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=213357
> > Reported-by: Jeroen van Wolffelaar <jeroen@...ffelaar.nl>
> > Signed-off-by: Luís Henriques <lhenriques@...e.de>
> > ---
> > fs/ext4/migrate.c | 12 +++++++++++-
> > 1 file changed, 11 insertions(+), 1 deletion(-)
> >
> > changes since v1:
> >
> > * Dropped tmp_ei variable
> > * ->i_csum_seed is now initialised immediately after tmp_inode is created
> > * New comment about the seed initialization and stating that recovery
> > needs to be fixed.
> >
> > Cheers,
> > --
> > Luís
> >
> > diff --git a/fs/ext4/migrate.c b/fs/ext4/migrate.c
> > index 7e0b4f81c6c0..36dfc88ce05b 100644
> > --- a/fs/ext4/migrate.c
> > +++ b/fs/ext4/migrate.c
> > @@ -459,6 +459,17 @@ int ext4_ext_migrate(struct inode *inode)
> > ext4_journal_stop(handle);
> > goto out_unlock;
> > }
> > + /*
> > + * Use the correct seed for checksum (i.e. the seed from 'inode'). This
> > + * is so that the metadata blocks will have the correct checksum after
> > + * the migration.
> > + *
> > + * Note however that, if a crash occurs during the migration process,
> > + * the recovery process is broken because the tmp_inode checksums will
> > + * be wrong and the orphans cleanup will fail.
>
> ...and then what does the user do?
Run fsck of course! And then recover from backups :) I know this is sad but
the situation is that our migration code just is not crash-safe (if we
crash we are going to free blocks that are still used by the migrated
inode) and Luis makes it work in case we do not crash (which should be
hopefully more common) and documents it does not work in case we crash.
So overall I'd call it a win.
But maybe we should just remove this online-migration functionality
completely from the kernel? That would be also a fine solution for me. I
was thinking whether we could somehow make the inode migration crash-safe
but I didn't think of anything which would not require on-disk format
change...
Honza
--
Jan Kara <jack@...e.com>
SUSE Labs, CR
Powered by blists - more mailing lists