lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220831103610.mlucilhxbho4ry25@quack3>
Date:   Wed, 31 Aug 2022 12:36:10 +0200
From:   Jan Kara <jack@...e.cz>
To:     Zhang Yi <yi.zhang@...wei.com>
Cc:     linux-ext4@...r.kernel.org, tytso@....edu,
        adilger.kernel@...ger.ca, jack@...e.cz, openglfreak@...glemail.com,
        yukuai3@...wei.com
Subject: Re: [PATCH] ext4: ext4_read_bh_lock() should submit IO if the buffer
 isn't uptodate

On Wed 31-08-22 15:46:29, Zhang Yi wrote:
> Recently we notice that ext4 filesystem occasionally fail to read
> metadata from disk and report error message, but the disk and block
> layer looks fine. After analyse, we lockon commit 88dbcbb3a484
> ("blkdev: avoid migration stalls for blkdev pages"). It provide a
> migration method for the bdev, we could move page that has buffers
> without extra users now, but it lock the buffers on the page, which
> breaks the fragile metadata read operation on ext4 filesystem,
> ext4_read_bh_lock() was copied from ll_rw_block(), it depends on the
> assumption of that locked buffer means it is under IO. So it just
> trylock the buffer and skip submit IO if it lock failed, after
> wait_on_buffer() we conclude IO error because the buffer is not
> uptodate.
> 
> This issue could be easily reproduced by add some delay just after
> buffer_migrate_lock_buffers() in __buffer_migrate_folio() and do
> fsstress on ext4 filesystem.
> 
>   EXT4-fs error (device pmem1): __ext4_find_entry:1658: inode #73193:
>   comm fsstress: reading directory lblock 0
>   EXT4-fs error (device pmem1): __ext4_find_entry:1658: inode #75334:
>   comm fsstress: reading directory lblock 0
> 
> Fix it by removing the trylock logic in ext4_read_bh_lock(), just lock
> the buffer and submit IO if it's not uptodate, and also leave over
> readahead helper.
> 
> Signed-off-by: Zhang Yi <yi.zhang@...wei.com>

Thanks for the fix. It looks good to me. Feel free to add:

Reviewed-by: Jan Kara <jack@...e.cz>

								Honza

> ---
>  fs/ext4/super.c | 16 +++++-----------
>  1 file changed, 5 insertions(+), 11 deletions(-)
> 
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 9a66abcca1a8..629bba3fa99a 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -205,19 +205,12 @@ int ext4_read_bh(struct buffer_head *bh, blk_opf_t op_flags, bh_end_io_t *end_io
>  
>  int ext4_read_bh_lock(struct buffer_head *bh, blk_opf_t op_flags, bool wait)
>  {
> -	if (trylock_buffer(bh)) {
> -		if (wait)
> -			return ext4_read_bh(bh, op_flags, NULL);
> +	lock_buffer(bh);
> +	if (!wait) {
>  		ext4_read_bh_nowait(bh, op_flags, NULL);
>  		return 0;
>  	}
> -	if (wait) {
> -		wait_on_buffer(bh);
> -		if (buffer_uptodate(bh))
> -			return 0;
> -		return -EIO;
> -	}
> -	return 0;
> +	return ext4_read_bh(bh, op_flags, NULL);
>  }
>  
>  /*
> @@ -264,7 +257,8 @@ void ext4_sb_breadahead_unmovable(struct super_block *sb, sector_t block)
>  	struct buffer_head *bh = sb_getblk_gfp(sb, block, 0);
>  
>  	if (likely(bh)) {
> -		ext4_read_bh_lock(bh, REQ_RAHEAD, false);
> +		if (trylock_buffer(bh))
> +			ext4_read_bh_nowait(bh, REQ_RAHEAD, NULL);
>  		brelse(bh);
>  	}
>  }
> -- 
> 2.31.1
> 
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR

Powered by blists - more mailing lists