Message-ID: <Yqg5QxSM+lub8DY0@T590>
Date:   Tue, 14 Jun 2022 15:31:15 +0800
From:   Ming Lei <ming.lei@...hat.com>
To:     Yu Kuai <yukuai3@...wei.com>
Cc:     axboe@...nel.dk, djeffery@...hat.com, bvanassche@....org,
        linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
        yi.zhang@...wei.com, ming.lei@...hat.com
Subject: Re: [PATCH -next] blk-mq: fix boot time regression for scsi drives
 with multiple hctx

On Tue, Jun 14, 2022 at 03:14:10PM +0800, Yu Kuai wrote:
> We found that boot time increased by about 8s after upgrading the kernel
> from v4.19 to v5.10 (megaraid-sas is used in the environment).

But 'blk-mq: clearing flush request reference in tags->rqs[]' was merged
in v5.14, :-)

> 
> Following is where the extra time is spent:
> 
> scsi_probe_and_add_lun
>  __scsi_remove_device
>   blk_cleanup_queue
>    blk_mq_exit_queue
>     blk_mq_exit_hw_queues
>      blk_mq_exit_hctx
>       blk_mq_clear_flush_rq_mapping -> function latency is 0.1ms

So queue_depth looks pretty large; is it 4k?

But if it is 0.1ms per call, how can that add up to an 8s delay? That would
require roughly 80K hw queue exits (8s / 0.1ms), so I guess there must be
other delay added by the BLK_MQ_F_TAG_HCTX_SHARED feature.

>        cmpxchg
> 
> There are three reasons:
> 1) megaraid-sas uses multiple hctxs in v5.10, thus blk_mq_exit_hctx()
> will be called many more times in v5.10 than in v4.19.
> 2) scsi will scan each target, thus __scsi_remove_device() will be
> called many times.
> 3) blk_mq_clear_flush_rq_mapping() was introduced after v4.19; it will
> call cmpxchg() for each request, and its function latency is about 0.1ms.
> 
> Since blk_mq_clear_flush_rq_mapping() will only be called while the
> queue is already frozen, which means there are no inflight requests,
> it's safe to set 'tags->rqs[]' to NULL directly instead of using
> cmpxchg(). Tests show that with this change, the function latency of
> blk_mq_clear_flush_rq_mapping() is about 1us, and boot time is no
> longer increased.

The tags are shared among all LUNs attached to the host, so freezing a
single request queue here means nothing; your patch doesn't work.
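
For reference, here is a rough userspace model of the two loops being
compared (names are made up to mirror tags->rqs[]; this is not the kernel
code): the existing helper uses cmpxchg() so a slot is only cleared when it
still points at the flush request, while the proposed plain store relies on
nothing else touching the array, which is exactly what a shared tag set
cannot guarantee.

/*
 * Standalone userspace model of the clearing loop (illustrative only,
 * not the kernel code; compile with -std=c11).
 */
#include <stdatomic.h>
#include <stdio.h>

struct request { int tag; };

#define QUEUE_DEPTH 4096

static _Atomic(struct request *) rqs[QUEUE_DEPTH];

/* existing behaviour: cmpxchg() clears only slots still holding flush_rq */
static void clear_flush_rq_cmpxchg(struct request *flush_rq)
{
	for (int i = 0; i < QUEUE_DEPTH; i++) {
		struct request *expected = flush_rq;

		atomic_compare_exchange_strong(&rqs[i], &expected, NULL);
	}
}

/*
 * proposed variant: plain check-and-store; only safe when nothing can
 * concurrently install a request into rqs[], which a tag set shared
 * among several LUNs does not guarantee
 */
static void clear_flush_rq_store(struct request *flush_rq)
{
	for (int i = 0; i < QUEUE_DEPTH; i++)
		if (atomic_load_explicit(&rqs[i], memory_order_relaxed) == flush_rq)
			atomic_store_explicit(&rqs[i], NULL, memory_order_relaxed);
}

int main(void)
{
	struct request flush_rq = { .tag = -1 };

	atomic_store(&rqs[0], &flush_rq);	/* pretend slot 0 saw the flush rq */
	clear_flush_rq_cmpxchg(&flush_rq);
	clear_flush_rq_store(&flush_rq);
	printf("slot 0 cleared: %s\n", atomic_load(&rqs[0]) ? "no" : "yes");
	return 0;
}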

Please test the following patch and see if it improves the boot delay in
your case.

diff --git a/block/blk-mq.c b/block/blk-mq.c
index e9bf950983c7..1463076a527c 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3443,8 +3443,9 @@ static void blk_mq_exit_hctx(struct request_queue *q,
 	if (blk_mq_hw_queue_mapped(hctx))
 		blk_mq_tag_idle(hctx);
 
-	blk_mq_clear_flush_rq_mapping(set->tags[hctx_idx],
-			set->queue_depth, flush_rq);
+	if (blk_queue_init_done(q))
+		blk_mq_clear_flush_rq_mapping(set->tags[hctx_idx],
+				set->queue_depth, flush_rq);
 	if (set->ops->exit_request)
 		set->ops->exit_request(set, flush_rq, hctx_idx);
 

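The idea is that blk_queue_init_done() only becomes true once the queue is
fully set up and registered, so for the queues torn down during the scsi
probe no FS request (and hence no flush request) can ever have been
dispatched; there is nothing to clear in tags->rqs[], and the per-slot loop
can simply be skipped for them.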

Thanks,
Ming
