netdev - Re: [PATCH net-next-2.6] bonding: allow arp_ip_targets to be on a separate vlan from bond device

Message-ID: <20091202212449.GL1639@gospo.rdu.redhat.com>
Date:	Wed, 2 Dec 2009 16:24:49 -0500
From:	Andy Gospodarek <andy@...yhouse.net>
To:	Jay Vosburgh <fubar@...ibm.com>
Cc:	netdev@...r.kernel.org
Subject: Re: [PATCH net-next-2.6] bonding: allow arp_ip_targets to be on a
	separate vlan from bond device

On Tue, Dec 01, 2009 at 01:28:13PM -0800, Jay Vosburgh wrote:
> Andy Gospodarek <andy@...yhouse.net> wrote:
> [...]
> >I am using arp_validate, actually.  I forgot that the arp_validate
> >option doesn't show up in the output of /proc/net/bonding/bondX and I
> >intended to have that in the subject, but somehow dropped it.
> 
> 	Ok, I was doing it wrong earlier; it works with arp_validate.
> I'm seeing one problem with tcpdump, though, which I'll get to in a
> minute.
> 
> 	Could you update the summary / changelog message to mention that
> this patch fixes the specific case of arp_validate + arp_ip_target on
> VLAN?
> 
> 	Second, in regards to this:
> 
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2439,8 +2439,8 @@ int netif_receive_skb(struct sk_buff *skb)
>  		skb->skb_iif = skb->dev->ifindex;
> 
>  	null_or_orig = NULL;
> -	orig_dev = skb->dev;
> -	if (orig_dev->master) {
> +	orig_dev = __dev_get_by_index(dev_net(skb->dev),skb->skb_iif);
> +	if (orig_dev->master && !(skb->dev->priv_flags & IFF_802_1Q_VLAN)) {
>  		if (skb_bond_should_drop(skb))
>  			null_or_orig = orig_dev; /* deliver only exact match */
>  		else
> 
> 	Would it be useful to add a comment to the effect that VLAN
> packets are run through skb_bond_should_drop at the VLAN layer?
> 
> 	Lastly, in regards to this:
> 
> @@ -2492,7 +2492,7 @@ ncls:
>  			&ptype_base[ntohs(type) & PTYPE_HASH_MASK], list) {
>  		if (ptype->type == type &&
>  		    (ptype->dev == null_or_orig || ptype->dev == skb->dev ||
> -		     ptype->dev == orig_dev)) {
> +		     ptype->dev == orig_dev || ptype->dev == orig_dev->master)) {
>  			if (pt_prev)
>  				ret = deliver_skb(skb, pt_prev, orig_dev);
>  			pt_prev = ptype;
> 
> 	This is presumably here because orig_dev will now be the actual
> slave the packet arrived on, but we want to additionally deliver to the
> master, correct?
> 
> 	Lastly, tcpdump.
> 
> 	This patch appears to affect what traffic tcpdump of a slave or
> the bonding master itself will capture.  Previously, tcpdump of the
> active slave would see only the transmitted packets sent over the bond,
> and tcpdump of the inactive slave would see incoming Ethernet-layer
> multicast or broadcasts sent to its switch port.  Tcpdump on the master
> would see all sent and non-VLAN received traffic, and tcpdump of the
> VLAN interface over the master would see just the VLAN traffic.
> 
> 	After this change, tcpdump of the active slave or of the bonding
> master (bond0) sees both sent and received traffic for the VLAN, but
> nothing for the non-VLAN traffic other than incoming broadcast /
> multicasts.  This holds true whether or not a VLAN is configured.
> 
> 	I added a "ptype->dev == orig_dev->master" test to the ptype_all
> receive block in netif_receive_skb, but it didn't help.  At the moment,
> I'm not exactly sure why tcpdump breaks.
> 

Jay,

The issue was that orig_dev was getting set to the active slave, so
running tcpdump on the active slave made the conditional inside this
loop:

        list_for_each_entry_rcu(ptype, &ptype_all, list) {
                if (ptype->dev == null_or_orig || ptype->dev == skb->dev ||
                    ptype->dev == orig_dev) {
                        if (pt_prev)
                                ret = deliver_skb(skb, pt_prev, orig_dev);
                        pt_prev = ptype;
                }
        }

hit, and deliver_skb was being called for all traffic coming toward
bond0.<vid>.  I'm not completely happy with this solution, but I think
it resolves both the original problem I was trying to solve and the
regression you discovered with the original patch.  Let me know if you
see everything working now like I do.
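
To make the tcpdump behaviour concrete, here is a rough userspace model
of the ptype_all test quoted above.  It is only a sketch: struct toy_dev,
toy_tap_matches() and the bond0.100 name are made up for illustration
and are not kernel code.  It assumes a tap opened on one interface is
bound to that device, and that null_or_orig is NULL when the frame is
not being dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct net_device: only a name, for illustration. */
struct toy_dev {
	const char *name;
};

/*
 * Same shape as the ptype_all test: a tap bound to tap_dev gets the
 * frame if tap_dev equals null_or_orig, skb->dev, or orig_dev.
 */
static bool toy_tap_matches(const struct toy_dev *tap_dev,
			    const struct toy_dev *null_or_orig,
			    const struct toy_dev *skb_dev,
			    const struct toy_dev *orig_dev)
{
	assert(skb_dev != NULL);
	return tap_dev == null_or_orig || tap_dev == skb_dev ||
	       tap_dev == orig_dev;
}

/*
 * Hypothetical scenario: a VLAN-accelerated frame arrives on active
 * slave eth0 and skb->dev has already been rewritten to bond0.100.
 * Returns whether a tap bound to eth0 would be handed the frame.
 */
static bool toy_slave_tap_sees_vlan_frame(bool orig_dev_is_slave)
{
	static const struct toy_dev eth0 = { "eth0" };
	static const struct toy_dev bond0_vid = { "bond0.100" };
	const struct toy_dev *orig_dev =
		orig_dev_is_slave ? &eth0 : &bond0_vid;

	return toy_tap_matches(&eth0, NULL, &bond0_vid, orig_dev);
}
```

With orig_dev pointing at the slave, the eth0 tap matches every frame
headed for bond0.<vid>; with orig_dev left equal to skb->dev it does
not.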

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 726bd75..b1e3b2f 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2697,6 +2697,19 @@ static int bond_arp_rcv(struct sk_buff *skb, struct net_device *dev, struct pack
 	bond = netdev_priv(dev);
 	read_lock(&bond->lock);
 
+	/*
+	 * We may have dev passed in as a vlan device, so make sure to get to the
+	 * core netdev before continuing.
+	 */
+	if (dev->priv_flags & IFF_802_1Q_VLAN) {
+		dev = vlan_dev_real_dev(dev);
+		/*
+		 * Don't necessarily trust passed in orig_dev since vlan accelerated
+		 * netdevs and bonding don't play well together.
+		 */
+		orig_dev = __dev_get_by_index(dev_net(skb->dev),skb->skb_iif);
+	}
+
 	pr_debug("bond_arp_rcv: bond %s skb->dev %s orig_dev %s\n",
 		bond->dev->name, skb->dev ? skb->dev->name : "NULL",
 		orig_dev ? orig_dev->name : "NULL");
diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
index e75a2f3..8d8a778 100644
--- a/net/8021q/vlan_core.c
+++ b/net/8021q/vlan_core.c
@@ -14,6 +14,7 @@ int __vlan_hwaccel_rx(struct sk_buff *skb, struct vlan_group *grp,
 	if (skb_bond_should_drop(skb))
 		goto drop;
 
+	skb->skb_iif = skb->dev->ifindex;
 	__vlan_hwaccel_put_tag(skb, vlan_tci);
 	skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
 
@@ -85,6 +86,7 @@ vlan_gro_common(struct napi_struct *napi, struct vlan_group *grp,
 	if (skb_bond_should_drop(skb))
 		goto drop;
 
+	skb->skb_iif = skb->dev->ifindex;
 	__vlan_hwaccel_put_tag(skb, vlan_tci);
 	skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
 
diff --git a/net/core/dev.c b/net/core/dev.c
index 5d131c2..9c3ba0d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2421,6 +2421,7 @@ int netif_receive_skb(struct sk_buff *skb)
 {
 	struct packet_type *ptype, *pt_prev;
 	struct net_device *orig_dev;
+	struct net_device *bond_dev;
 	struct net_device *null_or_orig;
 	int ret = NET_RX_DROP;
 	__be16 type;
@@ -2487,12 +2488,24 @@ ncls:
 	if (!skb)
 		goto out;
 
+	/*
+	 * A bonding interface with a VLAN on top doesn't play nicely when the
+	 * netdev in the bond is capable of stripping the VLAN tag for us.
+	 * Knowing the base bond device is important in the event that bond
+	 * control frames arrive with a VLAN tag, but need to be serviced by
+	 * a hook installed for the base bond device.
+	 */
+	bond_dev = skb->dev;
+	if ((bond_dev->priv_flags & IFF_802_1Q_VLAN) &&
+	    (vlan_dev_real_dev(bond_dev)->priv_flags & IFF_BONDING))
+		bond_dev = vlan_dev_real_dev(bond_dev);
+
 	type = skb->protocol;
 	list_for_each_entry_rcu(ptype,
 			&ptype_base[ntohs(type) & PTYPE_HASH_MASK], list) {
 		if (ptype->type == type &&
 		    (ptype->dev == null_or_orig || ptype->dev == skb->dev ||
-		     ptype->dev == orig_dev)) {
+		     ptype->dev == orig_dev || ptype->dev == bond_dev)) {
 			if (pt_prev)
 				ret = deliver_skb(skb, pt_prev, orig_dev);
 			pt_prev = ptype;
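
For anyone reading along, the bond_dev lookup added to
netif_receive_skb above can be modelled in userspace roughly like this.
Again only a sketch under assumptions: the TOY_ flag values, struct
toy_netdev and toy_bond_dev() are invented for illustration, and
real_dev stands in for what vlan_dev_real_dev() would return.

```c
#include <stddef.h>

/* Illustrative flag bits only; not the kernel's actual values. */
#define TOY_IFF_802_1Q_VLAN 0x1u
#define TOY_IFF_BONDING     0x2u

/* Toy stand-in for struct net_device. */
struct toy_netdev {
	const char *name;
	unsigned int priv_flags;
	const struct toy_netdev *real_dev; /* lower device if VLAN, else NULL */
};

/*
 * If dev is a VLAN device stacked directly on a bond, return the bond
 * so hooks registered on the base bond device can still match;
 * otherwise return dev unchanged.
 */
static const struct toy_netdev *toy_bond_dev(const struct toy_netdev *dev)
{
	if ((dev->priv_flags & TOY_IFF_802_1Q_VLAN) && dev->real_dev &&
	    (dev->real_dev->priv_flags & TOY_IFF_BONDING))
		return dev->real_dev;
	return dev;
}
```

A VLAN on top of a plain NIC resolves to itself here, so the extra
ptype->dev == bond_dev comparison only adds a match for the
VLAN-over-bond case.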