Message-ID: <1931a9db-a81d-daf2-2e89-d1f183946618@redhat.com>
Date: Mon, 24 Feb 2025 13:53:47 +0100 (CET)
From: Mikulas Patocka <mpatocka@...hat.com>
To: Jinliang Zheng <alexjlzheng@...il.com>
cc: agk@...hat.com, snitzer@...nel.org, dm-devel@...ts.linux.dev, 
    linux-kernel@...r.kernel.org, linux-xfs@...r.kernel.org, 
    flyingpeng@...cent.com, txpeng@...cent.com, dchinner@...hat.com, 
    Jinliang Zheng <alexjlzheng@...cent.com>
Subject: Re: [PATCH] dm: fix unconditional IO throttle caused by
 REQ_PREFLUSH

Applied, thanks.

Mikulas



On Thu, 20 Feb 2025, Jinliang Zheng wrote:

> When a bio with REQ_PREFLUSH is submitted to dm, __send_empty_flush()
> generates a flush_bio with REQ_OP_WRITE | REQ_PREFLUSH | REQ_SYNC,
> which causes the flush_bio to be throttled by wbt_wait().
> 
> An example from v5.4 (a similar problem also exists upstream):
> 
>     crash> bt 2091206
>     PID: 2091206  TASK: ffff2050df92a300  CPU: 109  COMMAND: "kworker/u260:0"
>      #0 [ffff800084a2f7f0] __switch_to at ffff80004008aeb8
>      #1 [ffff800084a2f820] __schedule at ffff800040bfa0c4
>      #2 [ffff800084a2f880] schedule at ffff800040bfa4b4
>      #3 [ffff800084a2f8a0] io_schedule at ffff800040bfa9c4
>      #4 [ffff800084a2f8c0] rq_qos_wait at ffff8000405925bc
>      #5 [ffff800084a2f940] wbt_wait at ffff8000405bb3a0
>      #6 [ffff800084a2f9a0] __rq_qos_throttle at ffff800040592254
>      #7 [ffff800084a2f9c0] blk_mq_make_request at ffff80004057cf38
>      #8 [ffff800084a2fa60] generic_make_request at ffff800040570138
>      #9 [ffff800084a2fae0] submit_bio at ffff8000405703b4
>     #10 [ffff800084a2fb50] xlog_write_iclog at ffff800001280834 [xfs]
>     #11 [ffff800084a2fbb0] xlog_sync at ffff800001280c3c [xfs]
>     #12 [ffff800084a2fbf0] xlog_state_release_iclog at ffff800001280df4 [xfs]
>     #13 [ffff800084a2fc10] xlog_write at ffff80000128203c [xfs]
>     #14 [ffff800084a2fcd0] xlog_cil_push at ffff8000012846dc [xfs]
>     #15 [ffff800084a2fda0] xlog_cil_push_work at ffff800001284a2c [xfs]
>     #16 [ffff800084a2fdb0] process_one_work at ffff800040111d08
>     #17 [ffff800084a2fe00] worker_thread at ffff8000401121cc
>     #18 [ffff800084a2fe70] kthread at ffff800040118de4
> 
> After commit 2def2845cc33 ("xfs: don't allow log IO to be throttled"),
> the metadata submitted by xlog_write_iclog() should not be throttled.
> But due to the existence of the dm layer, throttling flush_bio indirectly
> causes the metadata bio to be throttled.
> 
> Fix this by conditionally adding REQ_IDLE to flush_bio.bi_opf, which makes
> wbt_should_throttle() return false to avoid wbt_wait().
> 
> Signed-off-by: Jinliang Zheng <alexjlzheng@...cent.com>
> Reviewed-by: Tianxiang Peng <txpeng@...cent.com>
> Reviewed-by: Hao Peng <flyingpeng@...cent.com>
> ---
>  drivers/md/dm.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index 4d1e42891d24..5ab7574c0c76 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1540,14 +1540,18 @@ static void __send_empty_flush(struct clone_info *ci)
>  {
>  	struct dm_table *t = ci->map;
>  	struct bio flush_bio;
> +	blk_opf_t opf = REQ_OP_WRITE | REQ_PREFLUSH | REQ_SYNC;
> +
> +	if ((ci->io->orig_bio->bi_opf & (REQ_IDLE | REQ_SYNC)) ==
> +	    (REQ_IDLE | REQ_SYNC))
> +		opf |= REQ_IDLE;
>  
>  	/*
>  	 * Use an on-stack bio for this, it's safe since we don't
>  	 * need to reference it after submit. It's just used as
>  	 * the basis for the clone(s).
>  	 */
> -	bio_init(&flush_bio, ci->io->md->disk->part0, NULL, 0,
> -		 REQ_OP_WRITE | REQ_PREFLUSH | REQ_SYNC);
> +	bio_init(&flush_bio, ci->io->md->disk->part0, NULL, 0, opf);
>  
>  	ci->bio = &flush_bio;
>  	ci->sector_count = 0;
> -- 
> 2.41.1
> 
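
The interaction described in the commit message can be modelled in userspace. The sketch below is an illustration under stated assumptions, not kernel code: the type `opf_t`, the `OP_*` and `F_*` constants, and both `model_*` functions are hypothetical stand-ins, and the bit values are placeholders rather than the real blk_types.h definitions. It assumes that the writeback-throttling check exempts a write only when both the sync and idle flags are set, which is the behaviour the patch relies on.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for blk_opf_t and the REQ_* bits.
 * The values are placeholders chosen for this sketch only. */
typedef uint32_t opf_t;

#define OP_MASK     0xffu
#define OP_READ     0u
#define OP_WRITE    1u
#define OP_DISCARD  3u
#define F_SYNC      (1u << 8)
#define F_IDLE      (1u << 9)
#define F_PREFLUSH  (1u << 10)

/* Model of the throttling decision (assumed shape of the wbt check):
 * a write that carries both SYNC and IDLE is exempt, other writes and
 * discards are throttled, everything else is not. */
static bool model_should_throttle(opf_t opf)
{
    switch (opf & OP_MASK) {
    case OP_WRITE:
        if ((opf & (F_SYNC | F_IDLE)) == (F_SYNC | F_IDLE))
            return false;
        return true;
    case OP_DISCARD:
        return true;
    default:
        return false;
    }
}

/* Model of the patched flush setup: start from WRITE | PREFLUSH | SYNC
 * and carry IDLE over only if the original bio had both SYNC and IDLE. */
static opf_t model_flush_opf(opf_t orig_opf)
{
    opf_t opf = OP_WRITE | F_PREFLUSH | F_SYNC;

    if ((orig_opf & (F_IDLE | F_SYNC)) == (F_IDLE | F_SYNC))
        opf |= F_IDLE;
    return opf;
}
```

In this model, a flush built from an original bio with both `F_SYNC` and `F_IDLE` is not throttled, while a flush built from any other original bio keeps the old behaviour and is still subject to throttling. That is why the patch adds the flag conditionally instead of setting it on every empty flush.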

