Message-ID: <24740a61-d379-b9b5-2e08-07f7a4597fa2@huawei.com>
Date: Mon, 11 Mar 2024 15:31:32 +0800
From: Zhihao Cheng <chengzhihao1@...wei.com>
To: <tytso@....edu>, <adilger.kernel@...ger.ca>
CC: <linux-ext4@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
	<yi.zhang@...wei.com>
Subject: Re: [PATCH RFC] ext4: Validate inode pa before using preallocation
 blocks

On 2024/3/11 14:38, Zhihao Cheng wrote:
> In ext4 continue & no-journal mode, physical blocks can be allocated
> more than once during preallocation (when writing extent entries fails
> and the extent cache is reclaimed), which can trigger
> BUG_ON(pa->pa_free < len) in ext4_mb_use_inode_pa().
> 
>   kernel BUG at fs/ext4/mballoc.c:4681!
>   invalid opcode: 0000 [#1] PREEMPT SMP
>   CPU: 3 PID: 97 Comm: kworker/u8:3 Not tainted 6.8.0-rc7
>   RIP: 0010:ext4_mb_use_inode_pa+0x1b6/0x1e0
>   Call Trace:
>    ext4_mb_use_preallocated.constprop.0+0x19e/0x540
>    ext4_mb_new_blocks+0x220/0x1f30
>    ext4_ext_map_blocks+0xf3c/0x2900
>    ext4_map_blocks+0x264/0xa40
>    ext4_do_writepages+0xb15/0x1400
>    do_writepages+0x8c/0x260
>    writeback_sb_inodes+0x224/0x720
>    wb_writeback+0xd8/0x580
>    wb_workfn+0x148/0x820
> 
> The details are as follows:
> 
> 0. Given a file with i_size=4096 with one mapped block
> 1. Write block no 1, blocks 1~3 are preallocated.
>     ext4_ext_map_blocks
>      ext4_mb_normalize_request
>       size = 16 * 1024
>       size = end - start // Allocate 3 blocks (bs = 4096)
>      ext4_mb_regular_allocator
>       ext4_mb_regular_allocator
>       ext4_mb_regular_allocator
>       ext4_mb_use_inode_pa
>        pa->pa_free -= len // 3 - 1 = 2
> 2. Writing the extent buffer head fails; the es cache and buffer head
>     are reclaimed.
> 3. Write blocks 1~3
>     ext4_ext_map_blocks
>      newex.ee_len = 3
>      ext4_ext_check_overlap // Find nothing, there should have been block 1
>      allocated = map->m_len  // 3
>      ext4_mb_new_blocks
>       ext4_mb_use_preallocated
>        ext4_mb_use_inode_pa
>         BUG_ON(pa->pa_free < len) // 2 < 3!
> 
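The arithmetic in steps 1-3 can be replayed with a small userspace model.
This is only a sketch, not kernel code: struct pa_model and the model_*
helpers are hypothetical names, and only the pa_free counter is modelled.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the inode preallocation (pa). */
struct pa_model {
	unsigned int pa_len;   /* blocks covered by the preallocation */
	unsigned int pa_free;  /* blocks not yet handed out */
};

/* True when the condition in BUG_ON(pa->pa_free < len) would hold. */
static bool model_would_bug(const struct pa_model *pa, unsigned int len)
{
	return pa->pa_free < len;
}

/* Mirrors "pa->pa_free -= len" from the walk-through. */
static void model_use_pa(struct pa_model *pa, unsigned int len)
{
	pa->pa_free -= len;
}

/* Replays steps 1-3; returns 1 if step 3 would hit the BUG_ON, else 0. */
static int model_replay(void)
{
	struct pa_model pa = { .pa_len = 3, .pa_free = 3 };

	model_use_pa(&pa, 1);            /* step 1: 3 - 1 = 2 */
	/* step 2: the extent record is lost, but pa_free stays at 2 */
	return model_would_bug(&pa, 3);  /* step 3: 2 < 3 */
}
```

Calling model_replay() returns 1, matching the "2 < 3" case in step 3.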
> Fix it by validating the inode pa. If an invalid pa is detected, stop
> using inode preallocation, drop the invalid pa so that it is not used
> again, and mark the group block bitmap as corrupted to avoid allocating
> from the erroneous group.
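A minimal userspace sketch of that idea follows. It is not the code from
the patch: struct checked_pa, struct checked_group and
model_use_pa_checked() are hypothetical names, and dropping the pa and
marking the bitmap corrupted are reduced to two flags.

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-group state. */
struct checked_group {
	bool bbitmap_corrupt;  /* models "group block bitmap is corrupted" */
};

/* Hypothetical stand-in for an inode preallocation. */
struct checked_pa {
	unsigned int pa_free;  /* blocks not yet handed out */
	bool pa_deleted;       /* models "pa dropped, do not use again" */
};

/*
 * Returns true if len blocks were taken from the pa. On an invalid pa
 * (pa_free < len) it fails softly instead of crashing: the pa is dropped
 * and the group is flagged, so the caller can fall back to another path.
 */
static bool model_use_pa_checked(struct checked_pa *pa,
				 struct checked_group *grp,
				 unsigned int len)
{
	if (pa->pa_deleted)
		return false;
	if (pa->pa_free < len) {
		pa->pa_deleted = true;
		grp->bbitmap_corrupt = true;
		return false;
	}
	pa->pa_free -= len;
	return true;
}
```

With pa_free starting at 3, taking 1 block succeeds and leaves 2; a
following request for 3 blocks is rejected, and later requests no longer
use the dropped pa.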

After the group block bitmap has been marked corrupted,
mpage_map_and_submit_extent gets -EFSCORRUPTED back through
ext4_map_blocks -> ext4_ext_map_blocks -> ext4_mb_new_blocks ->
ext4_mb_regular_allocator -> ext4_mb_find_by_goal ->
ext4_mb_load_buddy -> ext4_mb_init_cache -> ext4_wait_block_bitmap ->
ext4_validate_block_bitmap -> EXT4_MB_GRP_BBITMAP_CORRUPT(grp).
I think the check 'EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info)' is not
needed in ext4_mb_load_buddy, because all callers already check it
before using e4b. In this case (ext4_mb_regular_allocator), the goal
group can simply be skipped if it is corrupted, so ext4_mb_find_by_goal
should load the buddy (ext4_mb_load_buddy) without the corruption check
and then check for corruption itself, returning 0 if the group is
corrupted. But we can't delete the check
(EXT4_MB_GRP_BBITMAP_CORRUPT(grp)) directly from
ext4_validate_block_bitmap, because some callers of
ext4_wait_block_bitmap may still need it. IOW, some code paths need the
check and some don't.
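To make the difference concrete, here is a rough userspace model of the
two behaviours. It is a sketch under assumptions, not kernel code: the
model_* functions, struct grp_model and MODEL_EFSCORRUPTED are
hypothetical, and the check_corrupt flag is only one possible way to let
some callers keep the check while others skip it.

```c
#include <stdbool.h>

/* Stand-in error value for this model only, not the kernel's constant. */
enum { MODEL_EFSCORRUPTED = 1 };

/* Hypothetical stand-in for per-group state. */
struct grp_model {
	bool bbitmap_corrupt;
};

/* Models loading the buddy; the corruption check is optional here. */
static int model_load_buddy(const struct grp_model *grp, bool check_corrupt)
{
	if (check_corrupt && grp->bbitmap_corrupt)
		return -MODEL_EFSCORRUPTED;
	return 0;
}

/* Behaviour described above: the load error propagates to the caller. */
static int model_find_by_goal_current(const struct grp_model *goal)
{
	return model_load_buddy(goal, true);
}

/*
 * Suggested behaviour: load without the check, then skip a corrupted
 * goal group by returning 0 so the allocator can move on to other groups.
 * *tried_goal reports whether the goal group was actually considered.
 */
static int model_find_by_goal_suggested(const struct grp_model *goal,
					bool *tried_goal)
{
	int err = model_load_buddy(goal, false);

	if (err)
		return err;
	*tried_goal = !goal->bbitmap_corrupt;
	return 0;
}
```

In this model a corrupted goal group makes the current variant return
-MODEL_EFSCORRUPTED, while the suggested variant returns 0 with
*tried_goal set to false.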

The above problem is independent of the problem solved by this patch,
so I have sent the patch out as is.
> 
> A reproducer is available at the Link below.
> 
> Cc: stable@...r.kernel.org
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=218576
> Signed-off-by: Zhihao Cheng <chengzhihao1@...wei.com>
> Signed-off-by: Zhang Yi <yi.zhang@...wei.com>
> ---
>   fs/ext4/mballoc.c | 128 +++++++++++++++++++++++++++++++++++-----------
>   1 file changed, 98 insertions(+), 30 deletions(-)
> 
