Message-ID: <e32387f1-fcf6-41fd-89a9-0b54637c99c9@intel.com>
Date: Thu, 16 Jan 2025 14:59:01 +0100
From: Przemek Kitszel <przemyslaw.kitszel@...el.com>
To: Antoine Tenart <atenart@...nel.org>
CC: <ecree.xilinx@...il.com>, <netdev@...r.kernel.org>, <davem@...emloft.net>,
	<kuba@...nel.org>, <pabeni@...hat.com>, <edumazet@...gle.com>
Subject: Re: [PATCH net v2] net: avoid race between device unregistration and
 ethnl ops

On 1/16/25 14:47, Antoine Tenart wrote:
> Quoting Przemek Kitszel (2025-01-16 10:44:40)
>> On 1/16/25 10:21, Antoine Tenart wrote:
>>> The following trace can be seen if a device is being unregistered while
>>> its number of channels is being modified.
>>>
>>>     DEBUG_LOCKS_WARN_ON(lock->magic != lock)
>>>     WARNING: CPU: 3 PID: 3754 at kernel/locking/mutex.c:564 __mutex_lock+0xc8a/0x1120
>>>     CPU: 3 UID: 0 PID: 3754 Comm: ethtool Not tainted 6.13.0-rc6+ #771
>>>     RIP: 0010:__mutex_lock+0xc8a/0x1120
>>>     Call Trace:
>>>      <TASK>
>>>      ethtool_check_max_channel+0x1ea/0x880
>>>      ethnl_set_channels+0x3c3/0xb10
>>>      ethnl_default_set_doit+0x306/0x650
>>>      genl_family_rcv_msg_doit+0x1e3/0x2c0
>>>      genl_rcv_msg+0x432/0x6f0
>>>      netlink_rcv_skb+0x13d/0x3b0
>>>      genl_rcv+0x28/0x40
>>>      netlink_unicast+0x42e/0x720
>>>      netlink_sendmsg+0x765/0xc20
>>>      __sys_sendto+0x3ac/0x420
>>>      __x64_sys_sendto+0xe0/0x1c0
>>>      do_syscall_64+0x95/0x180
>>>      entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>>
>>> This is because unregister_netdevice_many_notify might run before the
>>> rtnl lock section of ethnl operations, e.g. set_channels in the above
>>> example. In this example the rss lock would be destroyed by the device
>>> unregistration path before being used again, but in general running
>>> ethnl operations while dismantle has started is not a good idea.
>>>
>>> Fix this by denying any operation on devices being unregistered. A check
>>> was already there in ethnl_ops_begin, but not wide enough.
>>>
>>> Note that the same issue cannot be seen on the ioctl version
>>> (__dev_ethtool) because the device reference is retrieved from within
>>> the rtnl lock section there. Once dismantle started, the net device is
>>> unlisted and no reference will be found.
>>>
>>> Fixes: dde91ccfa25f ("ethtool: do not perform operations on net devices being unregistered")
>>> Signed-off-by: Antoine Tenart <atenart@...nel.org>
>>> ---
>>
>> for future submissions, please add a changelog and a link to previous
>> revisions
> 
> This one was a bit special as v2 is completely different from v1, not
> much to describe. But sure, at least a link could help.
> 
>>>    net/ethtool/netlink.c | 2 +-
>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c
>>> index e3f0ef6b851b..4d18dc29b304 100644
>>> --- a/net/ethtool/netlink.c
>>> +++ b/net/ethtool/netlink.c
>>> @@ -90,7 +90,7 @@ int ethnl_ops_begin(struct net_device *dev)
>>>                pm_runtime_get_sync(dev->dev.parent);
>>>    
>>>        if (!netif_device_present(dev) ||
>>> -         dev->reg_state == NETREG_UNREGISTERING) {
>>> +         dev->reg_state >= NETREG_UNREGISTERING) {
>>
>> looks good, but I would add a comment above the enum netdev_reg_state
>> definition, since this check now relies on the order of the states, so
>> that nobody simply adds a new state "at the end"

given your interest in further improvements in this area, the current
patch is fine with me as a fix, so:
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@...el.com>
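
To illustrate the enum comment suggested earlier, here is a minimal
userspace sketch, not kernel code. The enumerator order is what I
believe include/linux/netdevice.h has today and should be checked
against the tree; the static_assert guards and the ethnl_denied_old() /
ethnl_denied_new() helpers are hypothetical names that exist only for
this example.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Register/unregister state machine (userspace model).
 *
 * The values are ordered by device lifetime. Range checks such as
 * "reg_state >= NETREG_UNREGISTERING" rely on that order, so a new
 * state must be inserted at its place in the lifetime, not appended.
 */
enum netdev_reg_state {
	NETREG_UNINITIALIZED = 0,
	NETREG_REGISTERED,	/* completed register_netdevice */
	NETREG_UNREGISTERING,	/* called unregister_netdevice */
	NETREG_UNREGISTERED,	/* completed unregister todo */
	NETREG_RELEASED,	/* called free_netdev */
	NETREG_DUMMY,		/* dummy device for NAPI poll */
};

/* Hypothetical compile-time guards: fail the build if the order changes. */
static_assert(NETREG_REGISTERED < NETREG_UNREGISTERING,
	      "live states must sort before dismantle states");
static_assert(NETREG_UNREGISTERING < NETREG_UNREGISTERED,
	      "dismantle states must stay in lifetime order");
static_assert(NETREG_UNREGISTERED < NETREG_RELEASED,
	      "dismantle states must stay in lifetime order");

/* Condition before the patch: only the UNREGISTERING state is denied. */
static bool ethnl_denied_old(enum netdev_reg_state state)
{
	return state == NETREG_UNREGISTERING;
}

/* Condition after the patch: UNREGISTERING and every later state. */
static bool ethnl_denied_new(enum netdev_reg_state state)
{
	return state >= NETREG_UNREGISTERING;
}
```

With this model a device in NETREG_UNREGISTERED passes the old check
but is denied by the new one, which is the window the patch closes.
NETREG_DUMMY is denied as well, simply because it sorts last.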

>>
>> what about NETREG_DUMMY? you want to cover it here too?
> 
> I'm not super familiar with NETREG_DUMMY but my understanding is those
> devices aren't listed and aren't accessible through ethnl.
> 
> Having said that I do agree the checks on reg_state could be
> consolidated, e.g. by reusing and improving dev_isalive(). I actually
> planned to look later at whether this would make sense.
> 
> tl;dr: I don't think there's an issue in practice, but we could probably
> consolidate the code to make it easier to maintain and read.

agree
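
As a rough idea of what such a consolidation could look like, a
userspace sketch follows. It assumes dev_isalive() boils down to
"reg_state <= NETREG_REGISTERED", which is my recollection of
net/core/net-sysfs.c and should be verified. struct net_device_model,
dev_isalive_model() and ethnl_dev_usable() are hypothetical names made
up for this example, and the "present" flag only stands in for
netif_device_present().

```c
#include <assert.h>
#include <stdbool.h>

/* Same lifetime-ordered states as in the kernel (assumed order). */
enum netdev_reg_state {
	NETREG_UNINITIALIZED = 0,
	NETREG_REGISTERED,
	NETREG_UNREGISTERING,
	NETREG_UNREGISTERED,
	NETREG_RELEASED,
	NETREG_DUMMY,
};

/* Hypothetical stand-in for the two struct net_device fields we need. */
struct net_device_model {
	enum netdev_reg_state reg_state;
	bool present;		/* models netif_device_present() */
};

/* Modeled on dev_isalive(): true until unregistration has started. */
static bool dev_isalive_model(const struct net_device_model *dev)
{
	return dev->reg_state <= NETREG_REGISTERED;
}

/* Hypothetical single gate that ethnl_ops_begin() could call. */
static bool ethnl_dev_usable(const struct net_device_model *dev)
{
	return dev->present && dev_isalive_model(dev);
}
```

Under the assumed enum order, !dev_isalive_model() is the same
condition as "reg_state >= NETREG_UNREGISTERING" in the patch, so a
shared helper would not change behaviour, it would only put the
ordering assumption in one place.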

> 
> Thanks,
> Antoine

