lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d9a7af40-84f9-446e-a708-d989b322a675@redhat.com>
Date: Tue, 1 Apr 2025 09:45:08 +0200
From: Paolo Abeni <pabeni@...hat.com>
To: Dylan Wolff <wolffd@...p.nus.edu.sg>, Wenjia Zhang
 <wenjia@...ux.ibm.com>, Jan Karcher <jaka@...ux.ibm.com>,
 "D. Wythe" <alibuda@...ux.alibaba.com>, Tony Lu <tonylu@...ux.alibaba.com>,
 Wen Gu <guwen@...ux.alibaba.com>, "David S. Miller" <davem@...emloft.net>,
 Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
 Simon Horman <horms@...nel.org>, linux-rdma@...r.kernel.org,
 linux-s390@...r.kernel.org, netdev@...r.kernel.org,
 linux-kernel@...r.kernel.org
Cc: Jiacheng Xu <3170103308@....edu.cn>
Subject: Re: Concurrent slab-use-after-free in netdev_next_lower_dev

On 3/31/25 3:00 PM, Dylan Wolff wrote:
> From the report, it looks like the net_device is freed at the end of an
> rtnl critical section in netdev_run_todo. At the time of the crash, the
> *use* thread has acquired rtnl_lock() in smc_vlan_by_tcpsk. The crash
> occurred at the line preceded by `>>>` below in 6.13 rc4 while iterating
> over devices with netdev_walk_all_lower_dev:
> 
> ```
> static struct net_device *netdev_next_lower_dev(struct net_device *dev,
> struct list_head **iter)
> {
> struct netdev_adjacent *lower;
> 
>>>> lower = list_entry((*iter)->next, struct netdev_adjacent, list);
> 
> if (&lower->list == &dev->adj_list.lower)
> return NULL;
> 
> *iter = &lower->list;
> 
> return lower->dev;
> }
> ```

Please share the decoded syzkaller/kasan splat in plaintext instead of
describing it in natural language, and please avoid attachments unless
explicitly asked for.

Also, can you reproduce the issue on top of the current net tree?

> This looks to me like it is an issue with reference counting; I see that
> netdev_refcnt_read is checked in netdev_run_todo before the device is
> freed, but I don't see anything in netdev_walk_all_lower_dev /
> netdev_next_lower_dev that is incrementing netdev_refcnt_read when it is
> iterating over the devices. I'm guessing the fix is to either add reference
> counting to netdev_walk_all_lower_dev or to use a different,
> concurrency-safe iterator over the devices in the caller (smc_vlan_by_tcpsk
> ).
> 
> Could someone confirm if I am on the right track here? If so I am happy to
> try to come up with the patch.

netdev_walk_all_lower_dev() should not need additional refcounting, as
it is traversing the list under rtnl lock, and device should be removed
from the adjacency list  before the actual device free under such lock, too.

Thanks,

Paolo


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ