[<prev] [next>] [day] [month] [year] [list]
Message-ID: <20250623113008.695446-1-dongchenchen2@huawei.com>
Date: Mon, 23 Jun 2025 19:30:08 +0800
From: Dong Chenchen <dongchenchen2@...wei.com>
To: <davem@...emloft.net>, <edumazet@...gle.com>, <kuba@...nel.org>,
<pabeni@...hat.com>, <horms@...nel.org>, <jiri@...nulli.us>,
<oscmaes92@...il.com>, <linux@...blig.org>, <pedro.netdev@...devamos.com>
CC: <netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<yuehaibing@...wei.com>, <zhangchangzhong@...wei.com>
Subject: [PATCH net] net: vlan: fix VLAN 0 refcount imbalance of toggling filtering during runtime
8021q(vlan_device_event) will add VLAN 0 when enabling the device, and
remove it on disabling it if NETIF_F_HW_VLAN_CTAG_FILTER set.
However, if changing filter feature during netdev runtime,
null-ptr-unref[1] or bug_on[2] will be triggered by unregister_vlan_dev()
for refcount imbalance.
[1]
BUG: KASAN: null-ptr-deref in unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110)
Write of size 8 at addr 0000000000000000 by task ip/382
CPU: 2 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 #60 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl (lib/dump_stack.c:123)
kasan_report (mm/kasan/report.c:636)
unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110)
rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553)
rtnetlink_rcv_msg (net/core/rtnetlink.c:6945)
netlink_rcv_skb (net/netlink/af_netlink.c:2535)
netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339)
netlink_sendmsg (net/netlink/af_netlink.c:1883)
____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566)
[2]
kernel BUG at net/8021q/vlan.c:99!
Oops: invalid opcode: 0000 [#1] SMP KASAN PTI
CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 #61 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1))
RSP: 0018:ffff88810badf310 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a
RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8
RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80
R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000
R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e
FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0
Call Trace:
<TASK>
rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553)
rtnetlink_rcv_msg (net/core/rtnetlink.c:6945)
netlink_rcv_skb (net/netlink/af_netlink.c:2535)
netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339)
netlink_sendmsg (net/netlink/af_netlink.c:1883)
____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566)
___sys_sendmsg (net/socket.c:2622)
__sys_sendmsg (net/socket.c:2652)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
Root cause is as below:
step1: add vlan0 for real_dev, such as bond, team.
register_vlan_dev
vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1
step2: disable vlan filter feature and enable real_dev
step3: change filter from 0 to 1
vlan_device_event
vlan_filter_push_vids
ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0
step4: real_dev down
vlan_device_event
vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0
vlan_info_rcu_free //free vlan0
step5: real_dev up
vlan_device_event
vlan_vid_add(dev, htons(ETH_P_8021Q), 0);
vlan_info_alloc //alloc new empty vid0. refcnt=1
step6: delete vlan0
unregister_vlan_dev
BUG_ON(!vlan_info); //will trigger it if step5 was not executed
vlan_group_set_device
array = vg->vlan_devices_arrays
//null-ptr-ref will be triggered after step5
E.g. the following sequence can reproduce null-ptr-ref
$ ip link add bond0 type bond mode 0
$ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q
$ ethtool -K bond0 rx-vlan-filter off
$ ifconfig bond0 up
$ ethtool -K bond0 rx-vlan-filter on
$ ifconfig bond0 down
$ ifconfig bond0 up
$ ip link del vlan0
Fix it by correctly modifying vid0 refcnt of real_dev when changing
the NETIF_F_HW_VLAN_CTAG_FILTER flag during runtime.
Fixes: ad1afb003939 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)")
Reported-by: syzbot+a8b046e462915c65b10b@...kaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b
Signed-off-by: Dong Chenchen <dongchenchen2@...wei.com>
---
net/8021q/vlan.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index 06908e37c3d9..6e01ece0a95c 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -504,12 +504,21 @@ static int vlan_device_event(struct notifier_block *unused, unsigned long event,
break;
case NETDEV_CVLAN_FILTER_PUSH_INFO:
+ flgs = dev_get_flags(dev);
+ if (flgs & IFF_UP) {
+ pr_info("adding VLAN 0 to HW filter on device %s\n",
+ dev->name);
+ vlan_vid_add(dev, htons(ETH_P_8021Q), 0);
+ }
err = vlan_filter_push_vids(vlan_info, htons(ETH_P_8021Q));
if (err)
return notifier_from_errno(err);
break;
case NETDEV_CVLAN_FILTER_DROP_INFO:
+ flgs = dev_get_flags(dev);
+ if (flgs & IFF_UP)
+ vlan_vid_del(dev, htons(ETH_P_8021Q), 0);
vlan_filter_drop_vids(vlan_info, htons(ETH_P_8021Q));
break;
--
2.25.1
Powered by blists - more mailing lists