Message-ID: <738a41ca-3e4a-48df-9424-2950e6efc082@grimberg.me>
Date: Tue, 15 Apr 2025 01:28:15 +0300
From: Sagi Grimberg <sagi@...mberg.me>
To: Mohamed Khalfella <mkhalfella@...estorage.com>,
Daniel Wagner <wagi@...nel.org>
Cc: Christoph Hellwig <hch@....de>, Keith Busch <kbusch@...nel.org>,
Hannes Reinecke <hare@...e.de>, John Meneghini <jmeneghi@...hat.com>,
randyj@...estorage.com, linux-nvme@...ts.infradead.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC 3/3] nvme: delay failover by command quiesce timeout
On 10/04/2025 11:51, Mohamed Khalfella wrote:
> On 2025-03-24 13:07:58 +0100, Daniel Wagner wrote:
>> TP4129 mandates that failover be delayed by CQT. Thus, when
>> nvme_decide_disposition returns FAILOVER, do not immediately re-queue the
>> request at the namespace level; instead, queue it on the ctrl's request_list
>> and move it to the namespace's requeue_list later.
>>
>> Signed-off-by: Daniel Wagner <wagi@...nel.org>
>> ---
>> drivers/nvme/host/core.c | 19 ++++++++++++++++
>> drivers/nvme/host/fc.c | 4 ++++
>> drivers/nvme/host/multipath.c | 52 ++++++++++++++++++++++++++++++++++++++++---
>> drivers/nvme/host/nvme.h | 15 +++++++++++++
>> drivers/nvme/host/rdma.c | 2 ++
>> drivers/nvme/host/tcp.c | 1 +
>> 6 files changed, 90 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index 135045528ea1c79eac0d6d47d5f7f05a7c98acc4..f3155c7735e75e06c4359c26db8931142c067e1d 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
>> @@ -239,6 +239,7 @@ static void nvme_do_delete_ctrl(struct nvme_ctrl *ctrl)
>>
>>  	flush_work(&ctrl->reset_work);
>>  	nvme_stop_ctrl(ctrl);
>> +	nvme_flush_failover(ctrl);
>>  	nvme_remove_namespaces(ctrl);
>>  	ctrl->ops->delete_ctrl(ctrl);
>>  	nvme_uninit_ctrl(ctrl);
>> @@ -1310,6 +1311,19 @@ static void nvme_queue_keep_alive_work(struct nvme_ctrl *ctrl)
>>  	queue_delayed_work(nvme_wq, &ctrl->ka_work, delay);
>>  }
>>
>> +void nvme_schedule_failover(struct nvme_ctrl *ctrl)
>> +{
>> +	unsigned long delay;
>> +
>> +	if (ctrl->cqt)
>> +		delay = msecs_to_jiffies(ctrl->cqt);
>> +	else
>> +		delay = ctrl->kato * HZ;
> I thought that delay = m * ctrl->kato + ctrl->cqt
> where m = ctrl->ctratt & NVME_CTRL_ATTR_TBKAS ? 3 : 2
> no?
This was said before, but if we are going to always start by waiting for
kato for failover purposes, we first need a patch that prevents kato from
being arbitrarily long.

Let's cap kato at something like 10 seconds (which is 2x the default, which
apparently no one is touching).
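
To make the two proposals concrete, here is a rough userspace sketch (not
kernel code) of the delay discussed above: m * kato + cqt, with kato capped
at 10 seconds. All sketch_* names are hypothetical, the bit position used for
the TBKAS attribute is an assumption for illustration, and the multiplier
rule (3 with TBKAS, 2 without) is taken as-is from the question quoted above
rather than checked against TP4129.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only; the kernel has its own definitions. */
#define SKETCH_ATTR_TBKAS   (1u << 6)  /* traffic-based keep-alive bit (assumed position) */
#define SKETCH_KATO_CAP_SEC 10u        /* cap suggested in this reply */

/*
 * Failover delay in milliseconds: m * kato + cqt, where kato (seconds) is
 * first clamped to the cap, cqt is already in milliseconds, and m is 3 when
 * the controller advertises TBKAS and 2 otherwise.
 */
uint64_t sketch_failover_delay_ms(uint32_t kato_sec, uint32_t cqt_ms,
				  uint32_t ctratt)
{
	uint32_t m = (ctratt & SKETCH_ATTR_TBKAS) ? 3 : 2;
	uint32_t kato = kato_sec > SKETCH_KATO_CAP_SEC ?
			SKETCH_KATO_CAP_SEC : kato_sec;

	return (uint64_t)m * kato * 1000u + cqt_ms;
}

/* Small self-check showing how the cap bounds the worst case. */
void sketch_selfcheck(void)
{
	/* default kato of 5s, no CQT, no TBKAS: 2 * 5s */
	assert(sketch_failover_delay_ms(5, 0, 0) == 10000);
	/* kato of 120s is clamped to 10s before the multiplier applies */
	assert(sketch_failover_delay_ms(120, 0, 0) == 20000);
}
```

With the cap in place the kato-dependent part of the wait is bounded by
3 * 10 seconds in this sketch, regardless of what the user configured.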