netdev - Re: [PATCH v12 10/26] nvme-tcp: Deal with netdevice DOWN events

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <f92abcda-695a-b427-f342-a3540a45be1a@grimberg.me>
Date: Sun, 20 Aug 2023 13:50:47 +0300
From: Sagi Grimberg <sagi@...mberg.me>
To: Aurelien Aptel <aaptel@...dia.com>, linux-nvme@...ts.infradead.org,
 netdev@...r.kernel.org, hch@....de, kbusch@...nel.org, axboe@...com,
 chaitanyak@...dia.com, davem@...emloft.net, kuba@...nel.org
Cc: Or Gerlitz <ogerlitz@...dia.com>, aurelien.aptel@...il.com,
 smalin@...dia.com, malin1024@...il.com, yorayz@...dia.com,
 borisp@...dia.com, galshalom@...dia.com, mgurtovoy@...dia.com
Subject: Re: [PATCH v12 10/26] nvme-tcp: Deal with netdevice DOWN events

>>>>> +     switch (event) {
>>>>> +     case NETDEV_GOING_DOWN:
>>>>> +             mutex_lock(&nvme_tcp_ctrl_mutex);
>>>>> +             list_for_each_entry(ctrl, &nvme_tcp_ctrl_list, list) {
>>>>> +                     if (ndev == ctrl->offloading_netdev)
>>>>> +                             nvme_tcp_error_recovery(&ctrl->ctrl);
>>>>> +             }
>>>>> +             mutex_unlock(&nvme_tcp_ctrl_mutex);
>>>>> +             flush_workqueue(nvme_reset_wq);
>>>>
>>>> In what context is this called? because every time we flush a workqueue,
>>>> lockdep finds another reason to complain about something...
>>>
>>> Thanks for highlighting this, we re-checked it and we found that we are
>>> covered by nvme_tcp_error_recovery(), we can remove the
>>> flush_workqueue() call above.
>>
>> Don't you need to flush at least err_work? How do you know that it
>> completed and put all the references?
> 
> Our bad, we do need to wait for the netdev reference to be put, and we
> must keep the flush_workqueue().
> 
> We did test with lockdep but did not notice any warnings.

I'm assuming you are running with lockdep and friends?

> 
> As for the context of the event handler when you set the link down is
> the process issuing the netlink syscall.
> 
> So if you run "ip link set X down" it would be (simplified):
> 
> "ip" -> syscall -> netlink api -> ... -> do_setlink -> call_netdevice_notifiers_info.

ok, that should be fine I think.