[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <87a69oc0qs.fsf@nvidia.com>
Date: Mon, 4 Jul 2022 18:47:29 +0300
From: Vlad Buslov <vladbu@...dia.com>
To: Ido Schimmel <idosch@...dia.com>
CC: <netdev@...r.kernel.org>, Maor Dickman <maord@...dia.com>,
Roopa Prabhu <roopa@...dia.com>, <razor@...ckwall.org>
Subject: Bridge VLAN memory leak
Hi Ido,
While implementing QinQ offload support in mlx5 I encountered a memory
leak[0] in the bridge implementation which seems to be related to the new
BR_VLFLAG_ADDED_BY_SWITCHDEV flag that you have recently added.
To reproduce the issue netdevice must support bridge VLAN offload, so I
can't provide a simple script that uses veth or anything like that.
Instead, I'll describe the issue step-by-step:
1. Create a bridge, add offload-capable netdevs to it and assign some
VLAN to them. __vlan_vid_add() function will set the
BR_VLFLAG_ADDED_BY_SWITCHDEV flag since br_switchdev_port_vlan_add()
should return 0 if dev can offload VLANs and will also skip call to
vlan_vid_add() in such case:
/* Try switchdev op first. In case it is not supported, fallback to
* 8021q add.
*/
err = br_switchdev_port_vlan_add(dev, v->vid, flags, false, extack);
if (err == -EOPNOTSUPP)
return vlan_vid_add(dev, br->vlan_proto, v->vid);
v->priv_flags |= BR_VLFLAG_ADDED_BY_SWITCHDEV;
2. Enable filtering and set VLAN protocol to 802.1ad. That will trigger
the following code in __br_vlan_set_proto() that re-creates existing
VLANs with vlan_vid_add() function call whether they have the
BR_VLFLAG_ADDED_BY_SWITCHDEV flag set or not:
/* Add VLANs for the new proto to the device filter. */
list_for_each_entry(p, &br->port_list, list) {
vg = nbp_vlan_group(p);
list_for_each_entry(vlan, &vg->vlan_list, vlist) {
err = vlan_vid_add(p->dev, proto, vlan->vid);
if (err)
goto err_filt;
}
}
3. Now delete the bridge. That will delete all existing VLANs via
__vlan_vid_del() function, which skips calling vlan_vid_del() (that is
necessary to clean up after vlan_vid_add()) if VLAN has
BR_VLFLAG_ADDED_BY_SWITCHDEV flag set:
/* Try switchdev op first. In case it is not supported, fallback to
* 8021q del.
*/
err = br_switchdev_port_vlan_del(dev, v->vid);
if (!(v->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV))
vlan_vid_del(dev, br->vlan_proto, v->vid);
The issue doesn't reproduce for me anymore if I just clear the
BR_VLFLAG_ADDED_BY_SWITCHDEV flag when re-creating VLANs on step 2.
However, I'm not sure whether it is the right approach in this case.
WDYT?
[0]:
unreferenced object 0xffff8881f6771200 (size 256):
comm "ip", pid 446855, jiffies 4298238841 (age 55.240s)
hex dump (first 32 bytes):
00 00 7f 0e 83 88 ff ff 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<00000000012819ac>] vlan_vid_add+0x437/0x750
[<00000000f2281fad>] __br_vlan_set_proto+0x289/0x920
[<000000000632b56f>] br_changelink+0x3d6/0x13f0
[<0000000089d25f04>] __rtnl_newlink+0x8ae/0x14c0
[<00000000f6276baf>] rtnl_newlink+0x5f/0x90
[<00000000746dc902>] rtnetlink_rcv_msg+0x336/0xa00
[<000000001c2241c0>] netlink_rcv_skb+0x11d/0x340
[<0000000010588814>] netlink_unicast+0x438/0x710
[<00000000e1a4cd5c>] netlink_sendmsg+0x788/0xc40
[<00000000e8992d4e>] sock_sendmsg+0xb0/0xe0
[<00000000621b8f91>] ____sys_sendmsg+0x4ff/0x6d0
[<000000000ea26996>] ___sys_sendmsg+0x12e/0x1b0
[<00000000684f7e25>] __sys_sendmsg+0xab/0x130
[<000000004538b104>] do_syscall_64+0x3d/0x90
[<0000000091ed9678>] entry_SYSCALL_64_after_hwframe+0x46/0xb0
Regards,
Vlad
Powered by blists - more mailing lists