linux-ext4 - [PATCH 1/2] ext4: fix GDT corruption after online resizing with bigalloc enable and blocksize is 1024

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20220817132701.3015912-2-libaokun1@huawei.com>
Date:   Wed, 17 Aug 2022 21:27:00 +0800
From:   Baokun Li <libaokun1@...wei.com>
To:     <linux-ext4@...r.kernel.org>
CC:     <tytso@....edu>, <adilger.kernel@...ger.ca>, <jack@...e.cz>,
        <ritesh.list@...il.com>, <lczerner@...hat.com>,
        <enwlinux@...il.com>, <linux-kernel@...r.kernel.org>,
        <yi.zhang@...wei.com>, <yebin10@...wei.com>, <yukuai3@...wei.com>,
        <libaokun1@...wei.com>
Subject: [PATCH 1/2] ext4: fix GDT corruption after online resizing with bigalloc enable and blocksize is 1024

When the backup superblock is updated in update_backups, the offset of
the current superblock in the group (that is, sbi->s_sbh->b_blocknr)
is used as the offset of the backup superblock in the group where the
backup superblock resides.

When blocksize==1024, sbi->s_sbh->b_blocknr is 1. Their block distribution
of groups is {0: 1-8192, 1:8193-16384...}, so the location of the backup
superblock is still the first block in each group.

If bigalloc is enabled at the same time, the block distribution of each
group changes to {0: 0-131071, 1:131072-262143...}. In this case,
update_backups overwrites the second block instead of the first block in
each group that contains backup superblocks with the current superblock.
As a result, both the backup superblock and the backup GDT are incorrect.
This is nothing, after all, the backup GDT is only used when the disk is
repaired.

However, in some cases, this may cause file system corruption, data loss,
and even some programs stuck in the D state. We can easily reproduce this
problem with the following commands:

  mkfs.ext4 -F -O ^resize_inode,^sparse_super,bigalloc -b 1024 /dev/sdb 4M
  mount /dev/sdb /tmp/test
  resize2fs /dev/sdb 4G

This is because the GDT for each meta_bg is placed in its first group. When
sparse_super is disabled, backup superblocks exist in each group. In this
case, the GDT of the new meta_bg obtained by online resizing is corrupt.

To solve this issue, we only need to specify the offset of the backup
superblock in the group to 0 when bigalloc is enabled.

Fixes: d77147ff443b ("ext4: add support for online resizing with bigalloc")
Signed-off-by: Baokun Li <libaokun1@...wei.com>
---
 fs/ext4/resize.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
index fea2a68d067b..0146a11efd06 100644
--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -1590,8 +1590,12 @@ static int ext4_flex_group_add(struct super_block *sb,
 				   EXT4_DESC_PER_BLOCK(sb));
 		int meta_bg = ext4_has_feature_meta_bg(sb);
 		sector_t old_gdb = 0;
+		sector_t blk_off = sbi->s_sbh->b_blocknr;

-		update_backups(sb, sbi->s_sbh->b_blocknr, (char *)es,
+		if (ext4_has_feature_bigalloc(sb))
+			blk_off = 0;
+
+		update_backups(sb, blk_off, (char *)es,
 			       sizeof(struct ext4_super_block), 0);
 		for (; gdb_num <= gdb_num_end; gdb_num++) {
 			struct buffer_head *gdb_bh;
-- 
2.31.1