[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20251205-myopia-been-ed739f86@mheyne-amazon>
Date: Fri, 5 Dec 2025 07:33:10 +0000
From: "Heyne, Maximilian" <mheyne@...zon.de>
To: Keith Busch <kbusch@...nel.org>
CC: Jens Axboe <axboe@...nel.dk>, Christoph Hellwig <hch@....de>, "Sagi
Grimberg" <sagi@...mberg.me>, "linux-nvme@...ts.infradead.org"
<linux-nvme@...ts.infradead.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2] nvme: Let the blocklayer set timeouts for requests
On Thu, Dec 04, 2025 at 07:34:26PM +0000, Heyne, Maximilian wrote:
> On Thu, Dec 04, 2025 at 09:13:35AM -0700, Keith Busch wrote:
> > On Thu, Dec 04, 2025 at 02:11:50PM +0000, Heyne, Maximilian wrote:
> > > @@ -724,10 +724,8 @@ void nvme_init_request(struct request *req, struct nvme_command *cmd)
> > > struct nvme_ns *ns = req->q->disk->private_data;
> > >
> > > logging_enabled = ns->head->passthru_err_log_enabled;
> > > - req->timeout = NVME_IO_TIMEOUT;
> > > } else { /* no queuedata implies admin queue */
> > > logging_enabled = nr->ctrl->passthru_err_log_enabled;
> > > - req->timeout = NVME_ADMIN_TIMEOUT;
> > > }
> >
> > I was trying to think of any in-kernel path using __submit_sync_cmd with
> > an IO queue, and quick search shows there's just one: zns report zones.
> >
> > Everything else uses the admin queue, which doesn't have a sysfs tunable
> > for its request_queue's default timeout. All we have is the nvme module
> > parameter, which is writable after loading. Since that's the only way a
> > user can modify the default time for that queue, I think we need to
> > leave that req->timeout value as-is.
>
> Ok sound like a v3 is needed where I only delete the line with
> NVME_IO_TIMEOUT but leave the NVME_ADMIN_TIMEOUT and add a comment about
> it. Will prepare such a patch.
>
I thought about this a bit more. Considering that the module parameters
can be written to produces even more inconsistencies.
With my proposed change we will have the following situation:
1) when a device is probed current settings for IO and admin timeout from
the module parameters will be the respective default timeouts
2) almost all admin commands to the device will default to the initial
admin timeout
3) almost all IO commands will adhere to whatever is set via sysfs or
the initial default
-> changes to the module parameters will (mostly) not affect already
probed devices
When we keep initializing the admin timeouts to the module parameter as
you suggest (so half of my patch), we'll have the following situation:
1) same as above
2) some admin commands will use the module parameters admin timeout,
some (ioctl's via nvme_submit_user_cmd) will use the admin queue's timeout set in 1).
This can be made consistent if we fix nvme_submit_user_cmd.
3) same as above; IO timeout not affected by module parameter changes.
-> changes to the module parameters affect only the admin commands but
(mostly) not the IO commands which is inconsistent behavior.
Therefore, I wonder whether people are actually making changes to the
module parameters at runtime to adjust all device's admin timeouts. In
that case it's at least broken for ioctl's currently.
Should we make that timeout runtime configurable per-device too instead?
In summary, I think my patch as-is leads to better consistency.
Amazon Web Services Development Center Germany GmbH
Tamara-Danz-Str. 13
10243 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Christof Hellmis
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597
Powered by blists - more mailing lists