Message-ID: <20250925155950.GA4013-mkhalfella@purestorage.com>
Date: Thu, 25 Sep 2025 08:59:50 -0700
From: Mohamed Khalfella <mkhalfella@...estorage.com>
To: Keith Busch <kbusch@...nel.org>
Cc: Amit Chaudhary <achaudhary@...estorage.com>,
Jens Axboe <axboe@...nel.dk>, Christoph Hellwig <hch@....de>,
Sagi Grimberg <sagi@...mberg.me>, randyj@...estorage.com,
jmeneghi@...hat.com, emilne@...hat.com,
linux-nvme@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/1] nvme-multipath: Skip nr_active increments in RETRY disposition
On 2025-09-25 08:43:44 -0600, Keith Busch wrote:
> On Wed, Sep 24, 2025 at 06:14:27PM -0700, Mohamed Khalfella wrote:
> > On 2025-09-24 17:02:51 -0600, Keith Busch wrote:
> > > On Wed, Sep 24, 2025 at 03:43:18PM -0700, Amit Chaudhary wrote:
> > > > static inline void nvme_start_request(struct request *rq)
> > > > {
> > > > - if (rq->cmd_flags & REQ_NVME_MPATH)
> > > > + if ((rq->cmd_flags & REQ_NVME_MPATH) && (!nvme_req(rq)->retries))
> > > > nvme_mpath_start_request(rq);
> > > > blk_mq_start_request(rq);
> > > > }
> > >
> > > Using "retries" is bit indirect as a proxy for multipath active counts.
> > > Could this be moved to the mpath start instead, directly using the flag
> > > that accounts for the path? This also helps to keep track if the command
> > > gets retried across a user toggling the policy to "qd".
> > >
> > > ---
> > > diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
> > > index 3da980dc60d91..1c630967ddd40 100644
> > > --- a/drivers/nvme/host/multipath.c
> > > +++ b/drivers/nvme/host/multipath.c
> > > @@ -182,7 +182,8 @@ void nvme_mpath_start_request(struct request *rq)
> > > struct nvme_ns *ns = rq->q->queuedata;
> > > struct gendisk *disk = ns->head->disk;
> > >
> > > - if (READ_ONCE(ns->head->subsys->iopolicy) == NVME_IOPOLICY_QD) {
> > > + if (READ_ONCE(ns->head->subsys->iopolicy) == NVME_IOPOLICY_QD &&
> > > + !(nvme_req(rq)->flags & NVME_MPATH_CNT_ACTIVE)) {
> > > atomic_inc(&ns->ctrl->nr_active);
> > > nvme_req(rq)->flags |= NVME_MPATH_CNT_ACTIVE;
> > > }
> > > --
> >
> > 193 nvme_req(rq)->flags |= NVME_MPATH_IO_STATS;
> > 194 nvme_req(rq)->start_time = bdev_start_io_acct(disk->part0, req_op(rq),
> > 195 jiffies);
> >
> > Doing it this way might messup with stats accounting because the two
> > lines above will be executed on request retry. I do not think we need
> > that, right?
>
> Yeah, but we can use the other flag to know if it's already been
> accounted:
>
> --- a/drivers/nvme/host/multipath.c
> +++ b/drivers/nvme/host/multipath.c
> @@ -182,12 +182,14 @@ void nvme_mpath_start_request(struct request *rq)
> struct nvme_ns *ns = rq->q->queuedata;
> struct gendisk *disk = ns->head->disk;
>
> - if (READ_ONCE(ns->head->subsys->iopolicy) == NVME_IOPOLICY_QD) {
> + if (READ_ONCE(ns->head->subsys->iopolicy) == NVME_IOPOLICY_QD &&
> + !(nvme_req(rq)->flags & NVME_MPATH_CNT_ACTIVE)) {
> atomic_inc(&ns->ctrl->nr_active);
> nvme_req(rq)->flags |= NVME_MPATH_CNT_ACTIVE;
> }
>
> - if (!blk_queue_io_stat(disk->queue) || blk_rq_is_passthrough(rq))
> + if (!blk_queue_io_stat(disk->queue) || blk_rq_is_passthrough(rq) ||
> + nvme_req(rq)->flags & NVME_MPATH_IO_STATS)
> return;
>
> nvme_req(rq)->flags |= NVME_MPATH_IO_STATS;

This works. However, I find Amit's change more straightforward to
understand. nvme_mpath_start_request() and nvme_mpath_end_request() are
called when a request is started and ended, respectively. For a request
that is retried on the same path, nvme_mpath_start_request() need not be
called again. Such a retry should be transparent to the multipath layer.
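
To make the accounting difference concrete, here is a minimal user-space
model of the three variants discussed in this thread. It is only a sketch:
struct model_req, MPATH_CNT_ACTIVE, the plain int counter, and all function
names are hypothetical stand-ins, not the kernel definitions, and it assumes
that a retry on the same path calls the start hook again without an
intervening end hook.

```c
#include <assert.h>

/* Hypothetical stand-in for NVME_MPATH_CNT_ACTIVE; not the kernel value. */
enum { MPATH_CNT_ACTIVE = 1 << 0 };

/* Hypothetical stand-in for the per-request state kept in nvme_req(rq). */
struct model_req {
	unsigned int flags;
	unsigned int retries;
};

/* Models ns->ctrl->nr_active for a single path (no atomics needed here). */
static int nr_active;

/* Unpatched behaviour: count on every start, including retries. */
static void start_unguarded(struct model_req *rq)
{
	nr_active++;
	rq->flags |= MPATH_CNT_ACTIVE;
}

/* Amit's variant: skip the multipath start hook entirely on a retry. */
static void start_skip_on_retry(struct model_req *rq)
{
	if (rq->retries == 0)
		start_unguarded(rq);
}

/* Keith's variant: count only if this request is not already counted. */
static void start_flag_guarded(struct model_req *rq)
{
	if (!(rq->flags & MPATH_CNT_ACTIVE)) {
		nr_active++;
		rq->flags |= MPATH_CNT_ACTIVE;
	}
}

/* End hook: drop the count once if the request was counted. */
static void end_request(struct model_req *rq)
{
	if (rq->flags & MPATH_CNT_ACTIVE)
		nr_active--;
}

/*
 * Start a request, retry it once on the same path, then complete it.
 * Returns the value left in nr_active; anything non-zero is a leak.
 */
static int leaked_after_one_retry(void (*start)(struct model_req *))
{
	struct model_req rq = { 0, 0 };

	assert(start);
	nr_active = 0;
	start(&rq);        /* first submission */
	rq.retries++;      /* retried: end hook is assumed not to run */
	start(&rq);        /* resubmission on the same path */
	end_request(&rq);  /* final completion */
	return nr_active;
}
```

In this model, leaked_after_one_retry(start_unguarded) leaves the counter at
1, while both start_skip_on_retry and start_flag_guarded leave it at 0. The
model only covers nr_active; it does not cover the I/O stats accounting or a
policy change to "qd" between the first submission and the retry.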