[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20221116152618.hznqamogp2gwpqtp@quack3>
Date:   Wed, 16 Nov 2022 16:26:18 +0100
From:   Jan Kara <jack@...e.cz>
To:     Baokun Li <libaokun1@...wei.com>
Cc:     Jan Kara <jack@...e.cz>, linux-ext4@...r.kernel.org, tytso@....edu,
        adilger.kernel@...ger.ca, ritesh.list@...il.com,
        linux-kernel@...r.kernel.org, yi.zhang@...wei.com,
        yukuai3@...wei.com
Subject: Re: [PATCH v2 2/3] ext4: fix corrupt backup group descriptors after
 online resize
On Wed 16-11-22 21:14:16, Baokun Li wrote:
> On 2022/11/16 19:49, Jan Kara wrote:
> > On Wed 16-11-22 15:28:01, Baokun Li wrote:
> > > In commit 9a8c5b0d0615 ("ext4: update the backup superblock's at the end
> > > of the online resize"), it is assumed that update_backups() only updates
> > > backup superblocks, so each b_data is treated as a backupsuper block to
> > > update its s_block_group_nr and s_checksum. However, update_backups()
> > > also updates the backup group descriptors, which causes the backup group
> > > descriptors to be corrupted.
> > > 
> > > The above commit fixes the problem of invalid checksum of the backup
> > > superblock. The root cause of this problem is that the checksum of
> > > ext4_update_super() is not set correctly. This problem has been fixed
> > > in the previous patch ("ext4: fix bad checksum after online resize").
> > > Therefore, roll back some modifications in the above commit.
> > > 
> > > Fixes: 9a8c5b0d0615 ("ext4: update the backup superblock's at the end of the online resize")
> > > Signed-off-by: Baokun Li <libaokun1@...wei.com>
> > So I agree commit 9a8c5b0d0615 is broken and does corrupt group
> > descriptors. However I don't see how PATCH 1/3 in this series would fix all
> > the problems commit 9a8c5b0d0615 is trying to fix. In particular checksums
> > on backup superblocks will not be properly set by the resize code AFAICT.
> > 
> > 								Honza
> I didn't find these two issues to be the same until I researched the problem
> in
> PATCH 3/3 and found that commit 9a8c5b0d0615 introduced a similar problem.
> Then, it is found that the backup superblock is directly copied from the
> primary
> superblock. If the backup superblock is faulty, the primary superblock must
> be
> faulty. In this case, patch 1 that fixes the primary superblock problem is
> thought
> of. So by rolling back commit 9a8c5b0d0615 to verify, I found that patch 1
> did
> fix the problem.
> 
> Only ext4_flex_group_add() and ext4_group_extend_no_check() call
> update_backups() to update the backup superblock. Both of these functions
> correctly set the checksum of the primary superblock. The backup superblocks
> that are copied from them are also correct.
> 
> In ext4_flex_group_add(), we only update the backup superblock if there are
> no
> previous errors, indicating that we must have updated the checksum in
> ext4_update_super() before executing update_backups(). The previous problem
> was that after we updated the checksum in ext4_update_super(), we modified
> s_overhead_clusters, so the checksums for both the primary and backup
> superblocks
> were incorrect. This problem has been fixed in PATCH 1/3, so checksum is set
> correctly in ext4_flex_group_add().
> 
> The same is true in ext4_group_extend_no_check(), we only update the backup
> superblock if there are no errors, and we execute ext4_superblock_csum_set()
> to update the checksum before updating the backup superblock. Therefore,
> checksum is correctly set in ext4_group_extend_no_check().
> 
> I think we only need to ensure that the checksum is set correctly when the
> buffer
> lock of sbi->s_sbh is unlocked. Therefore, the checksum should be correct
> before
> update_backups() holds the buffer lock. Also, in update_backups() we copy
> the
> entire superblock completely, and the checksum is unchanged, so we don't
> need
> to reset it.
So I agree the checksum should be matching but the backup superblock should
have also s_block_group_nr set properly and after updating that we need to
recalculate the checksum as well.
								Honza
> > > ---
> > >   fs/ext4/resize.c | 5 -----
> > >   1 file changed, 5 deletions(-)
> > > 
> > > diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
> > > index cb99b410c9fa..32fbfc173571 100644
> > > --- a/fs/ext4/resize.c
> > > +++ b/fs/ext4/resize.c
> > > @@ -1158,7 +1158,6 @@ static void update_backups(struct super_block *sb, sector_t blk_off, char *data,
> > >   	while (group < sbi->s_groups_count) {
> > >   		struct buffer_head *bh;
> > >   		ext4_fsblk_t backup_block;
> > > -		struct ext4_super_block *es;
> > >   		/* Out of journal space, and can't get more - abort - so sad */
> > >   		err = ext4_resize_ensure_credits_batch(handle, 1);
> > > @@ -1187,10 +1186,6 @@ static void update_backups(struct super_block *sb, sector_t blk_off, char *data,
> > >   		memcpy(bh->b_data, data, size);
> > >   		if (rest)
> > >   			memset(bh->b_data + size, 0, rest);
> > > -		es = (struct ext4_super_block *) bh->b_data;
> > > -		es->s_block_group_nr = cpu_to_le16(group);
> > > -		if (ext4_has_metadata_csum(sb))
> > > -			es->s_checksum = ext4_superblock_csum(sb, es);
> > >   		set_buffer_uptodate(bh);
> > >   		unlock_buffer(bh);
> > >   		err = ext4_handle_dirty_metadata(handle, NULL, bh);
> > > -- 
> > > 2.31.1
> > > 
> Thank you for your review!
> -- 
> With Best Regards,
> Baokun Li
> 
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR
Powered by blists - more mailing lists
 
