Message-ID: <af65c545-a108-4045-874e-c67f894a3235@grimberg.me>
Date: Sat, 28 Jun 2025 13:13:42 +0300
From: Sagi Grimberg <sagi@...mberg.me>
To: Yu Kuai <yukuai1@...weicloud.com>, kbusch@...nel.org, axboe@...nel.dk,
hch@....de
Cc: yi.zhang@...hat.com, linux-nvme@...ts.infradead.org,
linux-kernel@...r.kernel.org, yukuai3@...wei.com, yi.zhang@...wei.com,
yangerkun@...wei.com, johnny.chenyi@...wei.com
Subject: Re: [PATCH] nvme: clear nvme request for nonready request
First, we need to change the patch title to clarify that it fixes a bug,
i.e. something like:
nvme: fix nvme-mpath misaccounting of inflight active IO
Second, we need to add a Fixes: tag (i.e. pointing at the addition of
nvme-mpath nr_active accounting).
Third, we need a code-comment that explains this subtlety because it is
not trivial.
On 28/06/2025 9:46, Yu Kuai wrote:
> From: Yu Kuai <yukuai3@...wei.com>
>
> It was found that the nvme mpath IO inflight counter can be decremented to
> a negative value via the following stack:
>
> CPU: 12 UID: 0 PID: 466 Comm: kworker/12:1H Tainted: G
> 6.16.0-rc3.yu+ #2 PREEMPT(voluntary)
> Workqueue: kblockd blk_mq_run_work_fn
> RIP: 0010:bdev_end_io_acct+0x494/0x5c0
> Call Trace:
> <TASK>
> nvme_end_req+0x4d/0x70 [nvme_core]
> nvme_failover_req+0x3bd/0x530 [nvme_core]
> nvme_fail_nonready_command+0x12c/0x170 [nvme_core]
> nvme_fc_queue_rq+0x463/0x720 [nvme_fc]
> blk_mq_dispatch_rq_list+0x358/0x1260
> __blk_mq_sched_dispatch_requests+0x2dd/0x480
> blk_mq_sched_dispatch_requests+0xa6/0x140
> blk_mq_run_work_fn+0x1bb/0x2a0
> process_one_work+0x8ca/0x1950
> worker_thread+0x58d/0xcf0
> kthread+0x3d5/0x7a0
> ret_from_fork+0x403/0x510
> ret_from_fork_asm+0x1a/0x30
> </TASK>
>
> The IO inflight counter has not been incremented yet when
> nvme_fail_nonready_command() runs, hence decrementing it makes it negative.
>
> This is not a problem for the blk-mq request because it is already
> initialized before issuing; however, the nvme request is only initialized
> later, in nvme_setup_cmd(). Fix the problem by clearing it in
> nvme_fail_nonready_command().
>
> Reported-by: Yi Zhang <yi.zhang@...hat.com>
> Closes: https://lore.kernel.org/all/CAHj4cs_+dauobyYyP805t33WMJVzOWj=7+51p4_j9rA63D9sog@mail.gmail.com/
> Signed-off-by: Yu Kuai <yukuai3@...wei.com>
> ---
> drivers/nvme/host/core.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 92697f98c601..8caafa25c010 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -764,6 +764,9 @@ blk_status_t nvme_fail_nonready_command(struct nvme_ctrl *ctrl,
> !test_bit(NVME_CTRL_FAILFAST_EXPIRED, &ctrl->flags) &&
> !blk_noretry_request(rq) && !(rq->cmd_flags & REQ_NVME_MPATH))
> return BLK_STS_RESOURCE;
> +
> + if (!(rq->rq_flags & RQF_DONTPREP))
> + nvme_clear_nvme_request(rq);
> return nvme_host_path_error(rq);
> }
> EXPORT_SYMBOL_GPL(nvme_fail_nonready_command);