Message-ID: <20091209220131.GO1639@gospo.rdu.redhat.com>
Date: Wed, 9 Dec 2009 17:01:31 -0500
From: Andy Gospodarek <andy@...yhouse.net>
To: Eric Dumazet <eric.dumazet@...il.com>
Cc: Andy Gospodarek <andy@...yhouse.net>,
Jay Vosburgh <fubar@...ibm.com>, netdev@...r.kernel.org
Subject: Re: [PATCH net-next-2.6 v3] bonding: allow arp_ip_targets on
separate vlans to use arp validation
On Mon, Dec 07, 2009 at 07:24:57PM +0100, Eric Dumazet wrote:
> Andy Gospodarek wrote:
>
> >> Jay,
> >>
> >> The issue was that orig_dev was getting set to the active slave, so
> >> your running tcpdump on the active slave made the conditional inside
> >> this loop:
> >>
> >> list_for_each_entry_rcu(ptype, &ptype_all, list) {
> >> if (ptype->dev == null_or_orig || ptype->dev == skb->dev ||
> >> ptype->dev == orig_dev) {
> >> if (pt_prev)
> >> ret = deliver_skb(skb, pt_prev, orig_dev);
> >> pt_prev = ptype;
> >> }
> >> }
> >>
> >> hit and deliver_skb was being called for all traffic coming toward
> >> bond0.<vid>. I'm not completely happy with this solution, but I think
> >> it resolves both the original problem I was trying to solve and the
> >> regression you discovered with your original patch. Let me know if you
> >> see everything working now like I do.
> >>
> >> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> >> index 726bd75..b1e3b2f 100644
> >> --- a/drivers/net/bonding/bond_main.c
> >> +++ b/drivers/net/bonding/bond_main.c
> >> @@ -2697,6 +2697,19 @@ static int bond_arp_rcv(struct sk_buff *skb, struct net_device *dev, struct pack
> >> bond = netdev_priv(dev);
> >> read_lock(&bond->lock);
> >>
> >> + /*
> >> + * We may have dev passed in as a vlan device, so make sure to get to the
> >> + * core netdev before continuing.
> >> + */
> >> + if (dev->priv_flags & IFF_802_1Q_VLAN) {
> >> + dev = vlan_dev_real_dev(dev);
> >> + /*
> >> + * Don't necessarily trust passed in orig_dev since vlan accelerated
> >> + * netdevs and bonding don't play well together.
> >> + */
> >> + orig_dev = __dev_get_by_index(dev_net(skb->dev),skb->skb_iif);
> >> + }
> >> +
>
> >
> > Any thoughts on the updated patch, Jay?
>
> Unfortunately you still use __dev_get_by_index()
> in an unsafe context.
>
I wasn't completely happy with the last patch, so I reworked it a bit
and addressed your concern about the __dev_get_by_index() usage by
replacing it with dev_get_by_index_rcu(). This should be safe since all
calls to deliver_skb() are protected by rcu_read_lock(). I also decided
to have the device modification happen in the frame handler rather than
in netif_receive_skb(). Here is the updated patch:
[PATCH net-next-2.6 v3] bonding: allow arp_ip_targets on separate vlans to use arp validation
This allows a bond device to specify an arp_ip_target as a host that is
not on the same vlan as the base bond device and still use arp
validation. A configuration like this now works:
BONDING_OPTS="mode=active-backup arp_interval=1000 arp_ip_target=10.0.100.1 arp_validate=3"
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 qlen 1000
link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
3: eth0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 qlen 1000
link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
8: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue
link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
inet6 fe80::213:21ff:febe:33e9/64 scope link
valid_lft forever preferred_lft forever
9: bond0.100@...d0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue
link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
inet 10.0.100.2/24 brd 10.0.100.255 scope global bond0.100
inet6 fe80::213:21ff:febe:33e9/64 scope link
valid_lft forever preferred_lft forever
Ethernet Channel Bonding Driver: v3.5.0 (November 4, 2008)
Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth1
MII Status: up
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0
ARP Polling Interval (ms): 1000
ARP IP target/s (n.n.n.n form): 10.0.100.1
Slave Interface: eth1
MII Status: up
Link Failure Count: 1
Permanent HW addr: 00:40:05:30:ff:30
Slave Interface: eth0
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:13:21:be:33:e9
---
drivers/net/bonding/bond_main.c | 11 +++++++++++
net/8021q/vlan_core.c | 2 ++
net/core/dev.c | 17 +++++++++++++++--
3 files changed, 28 insertions(+), 2 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index af9b9c4..537e365 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2691,6 +2691,17 @@ static int bond_arp_rcv(struct sk_buff *skb, struct net_device *dev, struct pack
unsigned char *arp_ptr;
__be32 sip, tip;
+ if (dev->priv_flags & IFF_802_1Q_VLAN) {
+ /*
+ * When using VLANs and bonding, dev and orig_dev may be
+ * incorrect if the physical interface supports VLAN
+ * acceleration. With this change ARP validation now
+ * works for hosts only reachable on the VLAN interface.
+ */
+ dev = vlan_dev_real_dev(dev);
+ orig_dev = dev_get_by_index_rcu(dev_net(skb->dev),skb->skb_iif);
+ }
+
if (!(dev->priv_flags & IFF_BONDING) || !(dev->flags & IFF_MASTER))
goto out;
diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
index e75a2f3..c0316e0 100644
--- a/net/8021q/vlan_core.c
+++ b/net/8021q/vlan_core.c
@@ -14,6 +14,7 @@ int __vlan_hwaccel_rx(struct sk_buff *skb, struct vlan_group *grp,
if (skb_bond_should_drop(skb))
goto drop;
+ skb->skb_iif = skb->dev->ifindex;
__vlan_hwaccel_put_tag(skb, vlan_tci);
skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
@@ -85,6 +86,7 @@ vlan_gro_common(struct napi_struct *napi, struct vlan_group *grp,
if (skb_bond_should_drop(skb))
goto drop;
+ skb->skb_iif = skb->dev->ifindex;
__vlan_hwaccel_put_tag(skb, vlan_tci);
skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
diff --git a/net/core/dev.c b/net/core/dev.c
index c36a17a..ef62d2d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2422,6 +2422,7 @@ int netif_receive_skb(struct sk_buff *skb)
struct packet_type *ptype, *pt_prev;
struct net_device *orig_dev;
struct net_device *null_or_orig;
+ struct net_device *null_or_bond;
int ret = NET_RX_DROP;
__be16 type;
@@ -2487,12 +2488,24 @@ ncls:
if (!skb)
goto out;
+ /*
+ * Make sure frames received on VLAN interfaces stacked on
+ * bonding interfaces still make their way to any base bonding
+ * device that may have registered for a specific ptype. The
+ * handler will have to adjust skb->dev and orig_dev though.
+ */
+ null_or_bond = NULL;
+ if ((skb->dev->priv_flags & IFF_802_1Q_VLAN) &&
+ (vlan_dev_real_dev(skb->dev)->priv_flags & IFF_BONDING)) {
+ null_or_bond = vlan_dev_real_dev(skb->dev);
+ }
+
type = skb->protocol;
list_for_each_entry_rcu(ptype,
&ptype_base[ntohs(type) & PTYPE_HASH_MASK], list) {
if (ptype->type == type &&
- (ptype->dev == null_or_orig || ptype->dev == skb->dev ||
- ptype->dev == orig_dev)) {
+ (ptype->dev == null_or_orig || ptype->dev == null_or_bond ||
+ ptype->dev == skb->dev || ptype->dev == orig_dev)) {
if (pt_prev)
ret = deliver_skb(skb, pt_prev, orig_dev);
pt_prev = ptype;
--
1.6.2.5
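
For anyone reading along, here is a rough userspace sketch of the matching
rule the net/core/dev.c hunk adds. It is not kernel code: struct toy_dev,
the TOY_* flag values, toy_null_or_bond() and toy_ptype_matches() are all
made-up stand-ins, and the real_dev pointer only approximates what
vlan_dev_real_dev() returns. It just shows why a handler registered on
bond0 can now see a frame whose skb->dev is bond0.100.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values, chosen only for this sketch. */
#define TOY_IFF_802_1Q_VLAN 0x1u
#define TOY_IFF_BONDING     0x2u

/* Toy stand-in for struct net_device. */
struct toy_dev {
	const char *name;
	unsigned int priv_flags;
	struct toy_dev *real_dev; /* models vlan_dev_real_dev(); NULL if not a VLAN */
};

/*
 * Mirrors the null_or_bond computation: return the underlying device
 * only when the frame's device is a VLAN stacked on a bonding device.
 */
static struct toy_dev *toy_null_or_bond(const struct toy_dev *skb_dev)
{
	if ((skb_dev->priv_flags & TOY_IFF_802_1Q_VLAN) &&
	    skb_dev->real_dev &&
	    (skb_dev->real_dev->priv_flags & TOY_IFF_BONDING))
		return skb_dev->real_dev;
	return NULL;
}

/*
 * Mirrors the device part of the ptype_base match after the patch:
 * a handler is eligible if its device equals any of the four candidates.
 */
static bool toy_ptype_matches(const struct toy_dev *ptype_dev,
			      const struct toy_dev *null_or_orig,
			      const struct toy_dev *null_or_bond,
			      const struct toy_dev *skb_dev,
			      const struct toy_dev *orig_dev)
{
	return ptype_dev == null_or_orig || ptype_dev == null_or_bond ||
	       ptype_dev == skb_dev || ptype_dev == orig_dev;
}
```

With a toy bond0 (TOY_IFF_BONDING) and a toy bond0.100 (TOY_IFF_802_1Q_VLAN,
real_dev pointing at bond0), and assuming both skb->dev and orig_dev are
bond0.100 on entry, a handler bound to bond0 matches only when the
toy_null_or_bond() result is passed in. Passing NULL there reproduces the
old condition, which does not match.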