Message-ID: <eaf73b5a-a438-4afe-a76e-148a7d9744ef@grimberg.me>
Date: Sat, 27 Dec 2025 11:48:49 +0200
From: Sagi Grimberg <sagi@...mberg.me>
To: Mohamed Khalfella <mkhalfella@...estorage.com>
Cc: Chaitanya Kulkarni <kch@...dia.com>, Christoph Hellwig <hch@....de>,
Jens Axboe <axboe@...nel.dk>, Keith Busch <kbusch@...nel.org>,
Aaron Dailey <adailey@...estorage.com>,
Randy Jennings <randyj@...estorage.com>, John Meneghini
<jmeneghi@...hat.com>, Hannes Reinecke <hare@...e.de>,
linux-nvme@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 05/14] nvmet: Send an AEN on CCR completion
On 25/12/2025 20:13, Mohamed Khalfella wrote:
> On Thu 2025-12-25 15:23:51 +0200, Sagi Grimberg wrote:
>>
>> On 26/11/2025 4:11, Mohamed Khalfella wrote:
>>> Send an AEN to initiator when impacted controller exists. The
>>> notification points to CCR log page that initiator can read to check
>>> which CCR operation completed.
>>>
>>> Signed-off-by: Mohamed Khalfella <mkhalfella@...estorage.com>
>>> ---
>>> drivers/nvme/target/core.c | 27 +++++++++++++++++++++++----
>>> drivers/nvme/target/nvmet.h | 3 ++-
>>> include/linux/nvme.h | 3 +++
>>> 3 files changed, 28 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
>>> index 7dbe9255ff42..60173833c3eb 100644
>>> --- a/drivers/nvme/target/core.c
>>> +++ b/drivers/nvme/target/core.c
>>> @@ -202,7 +202,7 @@ static void nvmet_async_event_work(struct work_struct *work)
>>> nvmet_async_events_process(ctrl);
>>> }
>>>
>>> -void nvmet_add_async_event(struct nvmet_ctrl *ctrl, u8 event_type,
>>> +static void nvmet_add_async_event_locked(struct nvmet_ctrl *ctrl, u8 event_type,
>>> u8 event_info, u8 log_page)
>>> {
>>> struct nvmet_async_event *aen;
>>> @@ -215,12 +215,17 @@ void nvmet_add_async_event(struct nvmet_ctrl *ctrl, u8 event_type,
>>> aen->event_info = event_info;
>>> aen->log_page = log_page;
>>>
>>> - mutex_lock(&ctrl->lock);
>>> list_add_tail(&aen->entry, &ctrl->async_events);
>>> - mutex_unlock(&ctrl->lock);
>>>
>>> queue_work(nvmet_wq, &ctrl->async_event_work);
>>> }
>>> +void nvmet_add_async_event(struct nvmet_ctrl *ctrl, u8 event_type,
>>> + u8 event_info, u8 log_page)
>>> +{
>>> + mutex_lock(&ctrl->lock);
>>> + nvmet_add_async_event_locked(ctrl, event_type, event_info, log_page);
>>> + mutex_unlock(&ctrl->lock);
>>> +}
>>>
>>> static void nvmet_add_to_changed_ns_log(struct nvmet_ctrl *ctrl, __le32 nsid)
>>> {
>>> @@ -1788,6 +1793,18 @@ struct nvmet_ctrl *nvmet_alloc_ctrl(struct nvmet_alloc_ctrl_args *args)
>>> }
>>> EXPORT_SYMBOL_GPL(nvmet_alloc_ctrl);
>>>
>>> +static void nvmet_ctrl_notify_ccr(struct nvmet_ctrl *ctrl)
>>> +{
>>> + lockdep_assert_held(&ctrl->lock);
>>> +
>>> + if (nvmet_aen_bit_disabled(ctrl, NVME_AEN_BIT_CCR_COMPLETE))
>>> + return;
>>> +
>>> + nvmet_add_async_event_locked(ctrl, NVME_AER_NOTICE,
>>> + NVME_AER_NOTICE_CCR_COMPLETED,
>>> + NVME_LOG_CCR);
>>> +}
>>> +
>>> static void nvmet_ctrl_complete_pending_ccr(struct nvmet_ctrl *ctrl)
>>> {
>>> struct nvmet_subsys *subsys = ctrl->subsys;
>>> @@ -1801,8 +1818,10 @@ static void nvmet_ctrl_complete_pending_ccr(struct nvmet_ctrl *ctrl)
>>> list_for_each_entry(sctrl, &subsys->ctrls, subsys_entry) {
>>> mutex_lock(&sctrl->lock);
>>> list_for_each_entry(ccr, &sctrl->ccrs, entry) {
>>> - if (ccr->ctrl == ctrl)
>>> + if (ccr->ctrl == ctrl) {
>>> + nvmet_ctrl_notify_ccr(sctrl);
>>> ccr->ctrl = NULL;
>>> + }
>> Is this double loop necessary? Would you have more than one controller
>> cross resetting the same
> As it is implemented now CCRs are linked to sctrl. This decision can be
> revisited if found suboptimal. At some point I had CCRs linked to
> ctrl->subsys but that led to lock ordering issues. Double loop is
> necessary to find all CCRs in all controllers and mark them done.
> Yes, it is possible to have more than one sctrl resetting the same
> ictrl.
I'm more interested in simplifying.
>
>> controller? Won't it be better to install a callback+opaque that the
>> controller removal will call?
> Can you elaborate more on that? Better in what terms?
>
> nvmet_ctrl_complete_pending_ccr() is called from nvmet_ctrl_free() when
> we know that ctrl->ref is zero and no new CCRs will be added to this
> controller because nvmet_ctrl_find_get_ccr() will not be able to get it.
In nvmet, a controller serves a single host, so I'm not sure I understand
how multiple source controllers would try to reset the same impacted
controller. If there is a 1-1 relationship between the source and impacted
controllers, I'd suggest simplifying: install a callback+opaque (e.g.
void *data) on the impacted controller instead of having it iterate, and
then actually send the AEN from the impacted controller.
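
To make the callback+opaque idea concrete, here is a rough user-space sketch of the shape it could take. It is not nvmet code: struct ctrl, ccr_done_fn, ctrl_install_ccr_cb(), ccr_notify_source() and ctrl_free() are all hypothetical names, pending_aens is only a stand-in for queuing an AEN, and locking and source-controller lifetime are deliberately left out. It also assumes the 1-1 source/impacted relationship discussed above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real nvmet structures. */
struct ctrl;
typedef void (*ccr_done_fn)(struct ctrl *ictrl, void *data);

struct ctrl {
	int id;
	int pending_aens;     /* stand-in for the controller's queued AENs */
	ccr_done_fn ccr_done; /* installed on the impacted controller */
	void *ccr_data;       /* opaque argument handed back to ccr_done */
};

/* Callback supplied by the source controller: record one CCR-complete AEN. */
static void ccr_notify_source(struct ctrl *ictrl, void *data)
{
	struct ctrl *sctrl = data;

	(void)ictrl;
	sctrl->pending_aens++;
}

/*
 * Install the completion callback on the impacted controller.
 * Assumes a 1-1 relationship, so a second install is rejected.
 */
static int ctrl_install_ccr_cb(struct ctrl *ictrl, ccr_done_fn fn, void *data)
{
	assert(ictrl && fn);

	if (ictrl->ccr_done)
		return -1;

	ictrl->ccr_done = fn;
	ictrl->ccr_data = data;
	return 0;
}

/* Teardown of the impacted controller: fire the callback at most once. */
static void ctrl_free(struct ctrl *ictrl)
{
	ccr_done_fn fn = ictrl->ccr_done;
	void *data = ictrl->ccr_data;

	ictrl->ccr_done = NULL;
	ictrl->ccr_data = NULL;

	if (fn)
		fn(ictrl, data);
}
```

With this shape the removal path needs no loop over subsys->ctrls: the source installs the callback when it issues the CCR, and freeing the impacted controller invokes it once. If more than one source can really reset the same impacted controller, a single callback slot is not enough and the sketch does not apply as is.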