Message-ID: <YbjKbqsVdK3LzKMm@suse.de>
Date: Tue, 14 Dec 2021 16:46:38 +0000
From: Luís Henriques <lhenriques@...e.de>
To: Jan Kara <jack@...e.cz>
Cc: Theodore Ts'o <tytso@....edu>,
Andreas Dilger <adilger.kernel@...ger.ca>,
linux-ext4@...r.kernel.org, linux-kernel@...r.kernel.org,
Jeroen van Wolffelaar <jeroen@...ffelaar.nl>
Subject: Re: [PATCH] ext4: set csum seed in tmp inode while migrating to extents
On Tue, Dec 14, 2021 at 01:03:17PM +0100, Jan Kara wrote:
> On Mon 06-12-21 14:37:33, Luís Henriques wrote:
> > When migrating to extents, the temporary inode will have its own checksum
> > seed. This means that, when swapping the inodes' data, the inode checksums
> > will be incorrect.
> >
> > This can be fixed either by recalculating the extent checksums or, more
> > simply, by copying the seed into the temporary inode.
> >
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=213357
> > Reported-by: Jeroen van Wolffelaar <jeroen@...ffelaar.nl>
> > Signed-off-by: Luís Henriques <lhenriques@...e.de>
>
> Thanks for debugging this! Two comments below:
And thanks for the review!
> > diff --git a/fs/ext4/migrate.c b/fs/ext4/migrate.c
> > index 7e0b4f81c6c0..dd4ece38fc83 100644
> > --- a/fs/ext4/migrate.c
> > +++ b/fs/ext4/migrate.c
> > @@ -413,7 +413,7 @@ int ext4_ext_migrate(struct inode *inode)
> > handle_t *handle;
> > int retval = 0, i;
> > __le32 *i_data;
> > - struct ext4_inode_info *ei;
> > + struct ext4_inode_info *ei, *tmp_ei;
>
> Probably no need for the new tmp_ei variable when you use it only once...
Sure, I'll drop that new variable in v2.
> > @@ -503,6 +503,10 @@ int ext4_ext_migrate(struct inode *inode)
> > }
> >
> > ei = EXT4_I(inode);
> > + tmp_ei = EXT4_I(tmp_inode);
> > + /* Use the right seed for checksumming */
> > + tmp_ei->i_csum_seed = ei->i_csum_seed;
> > +
>
> I think this is subtly broken in another way: If we crash in the middle of
> migration, tmp_inode (and possibly attached extent tree blocks) will have
> wrong checksums (remember that i_csum_seed is computed from inode number)
> and so orphan cleanup will fail. On the other hand in that case the orphan
> cleanup will free blocks we have already managed to attach to the tmp_inode
> although they are still properly attached to the old 'inode'. So the
> recovery from a crash in the middle of the migration seems to be broken
> anyway. So I guess what you do is an improvement. But can you perhaps:
>
> 1) Move i_csum_seed initialization to a bit earlier in ext4_ext_migrate()
> just after we have got the tmp_inode from ext4_new_inode()? That way all
> inode writes will at least happen with the same csum.
>
> 2) Add a comment that you are updating the csum seed so that metadata blocks
> get the proper checksum for 'inode', and that recovery from a crash in the
> middle of migration is currently broken.
Obviously, I did not realize the recovery process was broken, and I
appreciate that you took the time to explain _how_ it is broken. I'll add a
new item to (the bottom of) my to-do list and maybe one of these days I'll
get to look into it.
I'll send out v2 shortly, implementing your suggestions.
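To illustrate the mismatch discussed above, here is a minimal userspace
sketch. It is not kernel code: toy_chksum() is an FNV-style stand-in for the
real checksum routine, struct toy_inode and all toy_* names are hypothetical,
and the seed is derived from the inode number only, which simplifies whatever
the kernel actually mixes in. It only shows why a block checksummed under the
temporary inode's own seed fails to verify under 'inode', and why copying the
seed avoids that.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the kernel checksum helper; NOT the real algorithm. */
static uint32_t toy_chksum(uint32_t seed, const void *buf, size_t len)
{
	const uint8_t *p = buf;
	uint32_t h = seed;

	assert(buf != NULL);
	for (size_t i = 0; i < len; i++)
		h = (h ^ p[i]) * 16777619u;	/* FNV-1a style mixing step */
	return h;
}

/* Hypothetical, heavily reduced model of an in-memory inode. */
struct toy_inode {
	uint32_t ino;		/* inode number */
	uint32_t csum_seed;	/* per-inode seed, models i_csum_seed */
};

/* Assumption: the per-inode seed depends on the fs seed and inode number. */
static void toy_init_seed(struct toy_inode *in, uint32_t fs_seed)
{
	in->csum_seed = toy_chksum(fs_seed, &in->ino, sizeof(in->ino));
}

/*
 * Checksum an extent block while it is owned by the temporary inode, then
 * verify it as if it belonged to 'inode' after the data swap.
 * Returns 1 if the stored checksum verifies, 0 otherwise.
 */
int toy_migrate_verifies(int copy_seed)
{
	const uint32_t fs_seed = 0x1234abcdu;
	const uint8_t extent_block[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
	struct toy_inode inode = { .ino = 12 };
	struct toy_inode tmp = { .ino = 4711 };
	uint32_t stored, expected;

	toy_init_seed(&inode, fs_seed);
	toy_init_seed(&tmp, fs_seed);

	/* The fix under discussion: make tmp checksum with inode's seed. */
	if (copy_seed)
		tmp.csum_seed = inode.csum_seed;

	stored = toy_chksum(tmp.csum_seed, extent_block, sizeof(extent_block));
	expected = toy_chksum(inode.csum_seed, extent_block, sizeof(extent_block));
	return stored == expected;
}
```

Calling toy_migrate_verifies(0) models the current behaviour and returns 0,
while toy_migrate_verifies(1) models the patched behaviour and returns 1. The
sketch says nothing about the crash-recovery problem, which is a separate
issue.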
Cheers,
--
Luís
>
> Thanks!
>
> Honza
> --
> Jan Kara <jack@...e.com>
> SUSE Labs, CR