lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 5 Jul 2022 16:55:25 +0300
From:   Ido Schimmel <idosch@...dia.com>
To:     Vlad Buslov <vladbu@...dia.com>
Cc:     netdev@...r.kernel.org, Maor Dickman <maord@...dia.com>,
        Roopa Prabhu <roopa@...dia.com>, razor@...ckwall.org
Subject: Re: Bridge VLAN memory leak

On Mon, Jul 04, 2022 at 06:47:29PM +0300, Vlad Buslov wrote:
> Hi Ido,
> 
> While implementing QinQ offload support in mlx5 I encountered a memory
> leak[0] in the bridge implementation which seems to be related to the new
> BR_VLFLAG_ADDED_BY_SWITCHDEV flag that you have recently added.

FTR, added here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=279737939a8194f02fa352ab4476a1b241f44ef4

> 
> To reproduce the issue netdevice must support bridge VLAN offload, so I
> can't provide a simple script that uses veth or anything like that.
> Instead, I'll describe the issue step-by-step:
> 
> 1. Create a bridge, add offload-capable netdevs to it and assign some
> VLAN to them. __vlan_vid_add() function will set the
> BR_VLFLAG_ADDED_BY_SWITCHDEV flag since br_switchdev_port_vlan_add()
> should return 0 if dev can offload VLANs and will also skip call to
> vlan_vid_add() in such case:
> 
>         /* Try switchdev op first. In case it is not supported, fallback to
>          * 8021q add.
>          */
>         err = br_switchdev_port_vlan_add(dev, v->vid, flags, false, extack);
>         if (err == -EOPNOTSUPP)
>                 return vlan_vid_add(dev, br->vlan_proto, v->vid);
>         v->priv_flags |= BR_VLFLAG_ADDED_BY_SWITCHDEV;
> 
> 2. Enable filtering and set VLAN protocol to 802.1ad. That will trigger
> the following code in __br_vlan_set_proto() that re-creates existing
> VLANs with vlan_vid_add() function call whether they have the
> BR_VLFLAG_ADDED_BY_SWITCHDEV flag set or not:
> 
>          /* Add VLANs for the new proto to the device filter. */
>          list_for_each_entry(p, &br->port_list, list) {
>                  vg = nbp_vlan_group(p);
>                  list_for_each_entry(vlan, &vg->vlan_list, vlist) {
>                          err = vlan_vid_add(p->dev, proto, vlan->vid);
>                          if (err)
>                                  goto err_filt;
>                  }
>          }
> 
> 3. Now delete the bridge. That will delete all existing VLANs via
> __vlan_vid_del() function, which skips calling vlan_vid_del() (that is
> necessary to clean up after vlan_vid_add()) if VLAN has
> BR_VLFLAG_ADDED_BY_SWITCHDEV flag set:
> 
>          /* Try switchdev op first. In case it is not supported, fallback to
>           * 8021q del.
>           */
>          err = br_switchdev_port_vlan_del(dev, v->vid);
>          if (!(v->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV))
>                  vlan_vid_del(dev, br->vlan_proto, v->vid);

Looking at the code before the change, I'm pretty sure you will be able
to reproduce the leak prior to above mentioned commit:

```
-	err = br_switchdev_port_vlan_del(dev, vid);
-	if (err == -EOPNOTSUPP) {
-		vlan_vid_del(dev, br->vlan_proto, vid);
-		return 0;
-	}
-	return err;
```

> 
> 
> The issue doesn't reproduce for me anymore if I just clear the
> BR_VLFLAG_ADDED_BY_SWITCHDEV flag when re-creating VLANs on step 2.
> However, I'm not sure whether it is the right approach in this case.
> WDYT?

As a switchdev driver you already know about the new VLAN protocol via
'SWITCHDEV_ATTR_ID_BRIDGE_VLAN_PROTOCOL' so do we really need the VLANs
to be programmed again? The VLAN protocol is not communicated in
'SWITCHDEV_OBJ_ID_PORT_VLAN' anyway.

Can you try the below (compile tested only)?

```
diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
index 6e53dc991409..9ffd40b8270c 100644
--- a/net/bridge/br_vlan.c
+++ b/net/bridge/br_vlan.c
@@ -959,6 +959,8 @@ int __br_vlan_set_proto(struct net_bridge *br, __be16 proto,
 	list_for_each_entry(p, &br->port_list, list) {
 		vg = nbp_vlan_group(p);
 		list_for_each_entry(vlan, &vg->vlan_list, vlist) {
+			if (vlan->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV)
+				continue;
 			err = vlan_vid_add(p->dev, proto, vlan->vid);
 			if (err)
 				goto err_filt;
@@ -973,8 +975,11 @@ int __br_vlan_set_proto(struct net_bridge *br, __be16 proto,
 	/* Delete VLANs for the old proto from the device filter. */
 	list_for_each_entry(p, &br->port_list, list) {
 		vg = nbp_vlan_group(p);
-		list_for_each_entry(vlan, &vg->vlan_list, vlist)
+		list_for_each_entry(vlan, &vg->vlan_list, vlist) {
+			if (vlan->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV)
+				continue;
 			vlan_vid_del(p->dev, oldproto, vlan->vid);
+		}
 	}
 
 	return 0;
@@ -983,13 +988,19 @@ int __br_vlan_set_proto(struct net_bridge *br, __be16 proto,
 	attr.u.vlan_protocol = ntohs(oldproto);
 	switchdev_port_attr_set(br->dev, &attr, NULL);
 
-	list_for_each_entry_continue_reverse(vlan, &vg->vlan_list, vlist)
+	list_for_each_entry_continue_reverse(vlan, &vg->vlan_list, vlist) {
+		if (vlan->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV)
+			continue;
 		vlan_vid_del(p->dev, proto, vlan->vid);
+	}
 
 	list_for_each_entry_continue_reverse(p, &br->port_list, list) {
 		vg = nbp_vlan_group(p);
-		list_for_each_entry(vlan, &vg->vlan_list, vlist)
+		list_for_each_entry(vlan, &vg->vlan_list, vlist) {
+			if (vlan->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV)
+				continue;
 			vlan_vid_del(p->dev, proto, vlan->vid);
+		}
 	}
 
 	return err;
```

> 
> [0]:
> 
> unreferenced object 0xffff8881f6771200 (size 256):
>   comm "ip", pid 446855, jiffies 4298238841 (age 55.240s)
>   hex dump (first 32 bytes):
>     00 00 7f 0e 83 88 ff ff 00 00 00 00 00 00 00 00  ................
>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>   backtrace:
>     [<00000000012819ac>] vlan_vid_add+0x437/0x750
>     [<00000000f2281fad>] __br_vlan_set_proto+0x289/0x920
>     [<000000000632b56f>] br_changelink+0x3d6/0x13f0
>     [<0000000089d25f04>] __rtnl_newlink+0x8ae/0x14c0
>     [<00000000f6276baf>] rtnl_newlink+0x5f/0x90
>     [<00000000746dc902>] rtnetlink_rcv_msg+0x336/0xa00
>     [<000000001c2241c0>] netlink_rcv_skb+0x11d/0x340
>     [<0000000010588814>] netlink_unicast+0x438/0x710
>     [<00000000e1a4cd5c>] netlink_sendmsg+0x788/0xc40
>     [<00000000e8992d4e>] sock_sendmsg+0xb0/0xe0
>     [<00000000621b8f91>] ____sys_sendmsg+0x4ff/0x6d0
>     [<000000000ea26996>] ___sys_sendmsg+0x12e/0x1b0
>     [<00000000684f7e25>] __sys_sendmsg+0xab/0x130
>     [<000000004538b104>] do_syscall_64+0x3d/0x90
>     [<0000000091ed9678>] entry_SYSCALL_64_after_hwframe+0x46/0xb0
> 
> 
> Regards,
> Vlad

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ