lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZsKtllxojkTe3mpY@fedora>
Date: Mon, 19 Aug 2024 10:27:34 +0800
From: Ming Lei <ming.lei@...hat.com>
To: Muchun Song <songmuchun@...edance.com>
Cc: axboe@...nel.dk, linux-block@...r.kernel.org,
	linux-kernel@...r.kernel.org, ming.lei@...hat.com
Subject: Re: [PATCH 2/4] block: fix ordering between checking
 BLK_MQ_S_STOPPED and adding requests to hctx->dispatch

Hi Muchun,

On Sun, Aug 11, 2024 at 06:19:19PM +0800, Muchun Song wrote:
> Supposing the following scenario with a virtio_blk driver.
> 
> CPU0                                                                CPU1
> 
> blk_mq_try_issue_directly()
>     __blk_mq_issue_directly()
>         q->mq_ops->queue_rq()
>             virtio_queue_rq()
>                 blk_mq_stop_hw_queue()
>                                                                     virtblk_done()
>     blk_mq_request_bypass_insert()                                      blk_mq_start_stopped_hw_queues()
>         /* Add IO request to dispatch list */   1) store                    blk_mq_start_stopped_hw_queue()
>                                                                                 clear_bit(BLK_MQ_S_STOPPED)                 3) store
>     blk_mq_run_hw_queue()                                                       blk_mq_run_hw_queue()
>         if (!blk_mq_hctx_has_pending())                                             if (!blk_mq_hctx_has_pending())         4) load
>             return                                                                      return
>         blk_mq_sched_dispatch_requests()                                            blk_mq_sched_dispatch_requests()
>             if (blk_mq_hctx_stopped())          2) load                                 if (blk_mq_hctx_stopped())
>                 return                                                                      return
>             __blk_mq_sched_dispatch_requests()                                          __blk_mq_sched_dispatch_requests()
> 
> The full memory barrier should be inserted between 1) and 2), as well as between
> 3) and 4) to make sure that either CPU0 sees BLK_MQ_S_STOPPED is cleared or CPU1
> sees dispatch list or setting of bitmap of software queue. Otherwise, either CPU
> will not re-run the hardware queue causing starvation.

Yeah, it is one kind of race which is triggered when adding request into
->dispatch list after returning STS_RESOURCE. We were troubled by lots of
such kind of race.

stopping queue is used in very less drivers, and its only purpose should
be for throttling hw queue in case that low level queue is busy. There seems
more uses of blk_mq_stop_hw_queues(), but most of them should be replaced
with blk_mq_quiesce_queue().

IMO, fixing this kind of issue via memory barrier is too tricky to
maintain cause WRITE/READ dependency is very hard to follow. I'd suggest to
make memory barrier solution as the last resort, and we can try to figure
out other easier & more reliable way first.

One idea I thought of is to call blk_mq_request_bypass_insert()(or rename
& export it) before calling blk_mq_stop_hw_queue() in driver, then
return new status code STS_STOP_DISPATCH for notifying blk-mq to stop
dispatch simply.


thanks,
Ming


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ