Message-ID: <f92abcda-695a-b427-f342-a3540a45be1a@grimberg.me>
Date: Sun, 20 Aug 2023 13:50:47 +0300
From: Sagi Grimberg <sagi@...mberg.me>
To: Aurelien Aptel <aaptel@...dia.com>, linux-nvme@...ts.infradead.org,
netdev@...r.kernel.org, hch@....de, kbusch@...nel.org, axboe@...com,
chaitanyak@...dia.com, davem@...emloft.net, kuba@...nel.org
Cc: Or Gerlitz <ogerlitz@...dia.com>, aurelien.aptel@...il.com,
smalin@...dia.com, malin1024@...il.com, yorayz@...dia.com,
borisp@...dia.com, galshalom@...dia.com, mgurtovoy@...dia.com
Subject: Re: [PATCH v12 10/26] nvme-tcp: Deal with netdevice DOWN events
>>>>> +	switch (event) {
>>>>> +	case NETDEV_GOING_DOWN:
>>>>> +		mutex_lock(&nvme_tcp_ctrl_mutex);
>>>>> +		list_for_each_entry(ctrl, &nvme_tcp_ctrl_list, list) {
>>>>> +			if (ndev == ctrl->offloading_netdev)
>>>>> +				nvme_tcp_error_recovery(&ctrl->ctrl);
>>>>> +		}
>>>>> +		mutex_unlock(&nvme_tcp_ctrl_mutex);
>>>>> +		flush_workqueue(nvme_reset_wq);
>>>>
>>>> In what context is this called? because every time we flush a workqueue,
>>>> lockdep finds another reason to complain about something...
>>>
>>> Thanks for highlighting this, we re-checked it and we found that we are
>>> covered by nvme_tcp_error_recovery(), we can remove the
>>> flush_workqueue() call above.
>>
>> Don't you need to flush at least err_work? How do you know that it
>> completed and put all the references?
>
> Our bad, we do need to wait for the netdev reference to be put, and we
> must keep the flush_workqueue().
>
> We did test with lockdep but did not notice any warnings.
I'm assuming you are running with lockdep and friends?
>
> As for the context: when you set the link down, the event handler runs
> in the context of the process issuing the netlink syscall.
>
> So if you run "ip link set X down" it would be (simplified):
>
> "ip" -> syscall -> netlink api -> ... -> do_setlink -> call_netdevice_notifiers_info.
OK, that should be fine, I think.
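
To illustrate why the flush matters, here is a rough userspace model of the ordering, not the driver code. It only assumes what was said above: recovery work runs asynchronously and is what puts the netdev reference. The names netdev_refs, err_work_fn and handle_going_down are hypothetical, and pthread_join() merely stands in for flush_workqueue().

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the netdev reference held on behalf of a ctrl. */
static atomic_int netdev_refs;

/* Models err_work: teardown runs asynchronously and drops the reference. */
static void *err_work_fn(void *arg)
{
	(void)arg;
	atomic_fetch_sub(&netdev_refs, 1);
	return NULL;
}

/*
 * Models the NETDEV_GOING_DOWN path: kick off recovery, then wait for it.
 * Returns the remaining reference count, or -1 on a thread error.
 */
static int handle_going_down(void)
{
	pthread_t worker;

	atomic_store(&netdev_refs, 1);	/* one reference is held before the event */

	/* Plays the role of queueing err_work via error recovery. */
	if (pthread_create(&worker, NULL, err_work_fn, NULL) != 0)
		return -1;

	/* Plays the role of flush_workqueue(): returns only after the work ran. */
	if (pthread_join(worker, NULL) != 0)
		return -1;

	/* Only at this point is it safe to conclude the reference was put. */
	return atomic_load(&netdev_refs);
}
```

Without the join, the caller could return while the worker still holds the reference, which is the case the flush in the handler is meant to rule out.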