Message-ID: <Zr8YdAhw6tDqImzF@fedora>
Date: Fri, 16 Aug 2024 17:14:28 +0800
From: Ming Lei <ming.lei@...hat.com>
To: Muchun Song <songmuchun@...edance.com>
Cc: axboe@...nel.dk, linux-block@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/4] block: fix request starvation when queue is stopped
or quiesced
On Sun, Aug 11, 2024 at 06:19:18PM +0800, Muchun Song wrote:
> Suppose the following scenario with a virtio_blk driver.
>
> CPU0                              CPU1                              CPU2
>
> blk_mq_try_issue_directly()
>   __blk_mq_issue_directly()
>     q->mq_ops->queue_rq()
>       virtio_queue_rq()
>         blk_mq_stop_hw_queue()
>                                   blk_mq_try_issue_directly()       virtblk_done()
>                                     if (blk_mq_hctx_stopped())
>   blk_mq_request_bypass_insert()                                      blk_mq_start_stopped_hw_queue()
>   blk_mq_run_hw_queue()                                               blk_mq_run_hw_queue()
>                                       blk_mq_insert_request()
>                                       return // Who is responsible for dispatching this IO request?
>
> After CPU0 has marked the queue as stopped, CPU1 sees that the queue is stopped.
> But before CPU1 puts the request on the dispatch list, CPU2 receives the
> request-completion interrupt, so it marks the queue as non-stopped and runs the
> hardware queue. Meanwhile, CPU1 also runs the same hardware queue. After both CPU1
> and CPU2 complete blk_mq_run_hw_queue(), CPU1 just puts the request on the same
> hardware queue and returns, so it seems that nothing dispatches that request.
> Fix it by running the hardware queue explicitly after the insert. I think
> blk_mq_request_issue_directly() should handle a similar problem.
>
> Signed-off-by: Muchun Song <songmuchun@...edance.com>
> ---
> block/blk-mq.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index e3c3c0c21b553..b2d0f22de0c7f 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2619,6 +2619,7 @@ static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
>
>  	if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(rq->q)) {
>  		blk_mq_insert_request(rq, 0);
> +		blk_mq_run_hw_queue(hctx, false);
>  		return;
>  	}
>
> @@ -2649,6 +2650,7 @@ static blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last)
>
>  	if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(rq->q)) {
>  		blk_mq_insert_request(rq, 0);
> +		blk_mq_run_hw_queue(hctx, false);
>  		return BLK_STS_OK;
>  	}
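For illustration, the interleaving described above can be replayed in a small
single-threaded C model. This is only a sketch under simplifying assumptions:
struct toy_hctx, toy_run_hw_queue() and toy_replay_race() are made-up names and
not blk-mq code, and CPU0's own bypass insert and queue run are left out for
brevity. It only shows why a request inserted after the last queue run can be
stranded unless the queue is run again.

```c
#include <stdbool.h>

/* Toy stand-in for a hardware queue; NOT the kernel's struct blk_mq_hw_ctx. */
struct toy_hctx {
	bool stopped;     /* models the "stopped" state of the queue */
	int  pending;     /* requests sitting on the dispatch list */
	int  dispatched;  /* requests handed to the (imaginary) driver */
};

/* Models running the queue: a stopped queue dispatches nothing. */
static void toy_run_hw_queue(struct toy_hctx *h)
{
	if (h->stopped)
		return;
	h->dispatched += h->pending;
	h->pending = 0;
}

/*
 * Replays the interleaving from the commit message one step at a time.
 * Returns the number of requests left stranded on the dispatch list.
 */
static int toy_replay_race(bool rerun_after_insert)
{
	struct toy_hctx h = { .stopped = false, .pending = 0, .dispatched = 0 };
	bool cpu1_saw_stopped;

	h.stopped = true;              /* CPU0: driver stops the queue */
	cpu1_saw_stopped = h.stopped;  /* CPU1: checks the stopped state */
	h.stopped = false;             /* CPU2: completion restarts the queue */
	toy_run_hw_queue(&h);          /* CPU2: runs it, list is still empty */

	if (cpu1_saw_stopped) {
		h.pending++;           /* CPU1: inserts the request too late */
		if (rerun_after_insert)
			toy_run_hw_queue(&h); /* models the run added by the patch */
	}
	return h.pending;
}
```

In this model toy_replay_race(false) returns 1 (the request stays on the list
with nobody left to run the queue), while toy_replay_race(true) returns 0.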
This looks like a real issue, and the fix is fine:
Reviewed-by: Ming Lei <ming.lei@...hat.com>
Thanks,
Ming