Message-ID: <253bkf5vjyu.fsf@nvidia.com>
Date: Thu, 17 Aug 2023 17:09:45 +0300
From: Aurelien Aptel <aaptel@...dia.com>
To: Sagi Grimberg <sagi@...mberg.me>, linux-nvme@...ts.infradead.org,
netdev@...r.kernel.org, hch@....de, kbusch@...nel.org, axboe@...com,
chaitanyak@...dia.com, davem@...emloft.net, kuba@...nel.org
Cc: Or Gerlitz <ogerlitz@...dia.com>, aurelien.aptel@...il.com,
smalin@...dia.com, malin1024@...il.com, yorayz@...dia.com,
borisp@...dia.com, galshalom@...dia.com, mgurtovoy@...dia.com
Subject: Re: [PATCH v12 10/26] nvme-tcp: Deal with netdevice DOWN events
Sagi Grimberg <sagi@...mberg.me> writes:
>>>> + switch (event) {
>>>> + case NETDEV_GOING_DOWN:
>>>> + mutex_lock(&nvme_tcp_ctrl_mutex);
>>>> + list_for_each_entry(ctrl, &nvme_tcp_ctrl_list, list) {
>>>> + if (ndev == ctrl->offloading_netdev)
>>>> + nvme_tcp_error_recovery(&ctrl->ctrl);
>>>> + }
>>>> + mutex_unlock(&nvme_tcp_ctrl_mutex);
>>>> + flush_workqueue(nvme_reset_wq);
>>>
>>> In what context is this called? because every time we flush a workqueue,
>>> lockdep finds another reason to complain about something...
>>
>> Thanks for highlighting this, we re-checked it and we found that we are
>> covered by nvme_tcp_error_recovery(), we can remove the
>> flush_workqueue() call above.
>
> Don't you need to flush at least err_work? How do you know that it
> completed and put all the references?
Our bad, we do need to wait for the netdev reference to be put, and we
must keep the flush_workqueue().
We did test with lockdep but did not notice any warnings.
As for the context: the event handler runs in the context of the process
issuing the netlink syscall that brings the link down.
So if you run "ip link set X down" it would be (simplified):
"ip" -> syscall -> netlink api -> ... -> do_setlink -> call_netdevice_notifiers_info.