Message-ID: <1288683057.2660.154.camel@edumazet-laptop>
Date: Tue, 02 Nov 2010 08:30:57 +0100
From: Eric Dumazet <eric.dumazet@...il.com>
To: Simon Horman <horms@...ge.net.au>
Cc: netdev@...r.kernel.org, Jay Vosburgh <fubar@...ibm.com>,
"David S. Miller" <davem@...emloft.net>
Subject: Re: bonding: flow control regression [was Re: bridging: flow
control regression]
On Tuesday, 02 November 2010 at 16:03 +0900, Simon Horman wrote:
> On Tue, Nov 02, 2010 at 05:53:42AM +0100, Eric Dumazet wrote:
> > On Tuesday, 02 November 2010 at 11:06 +0900, Simon Horman wrote:
> >
> > > Thanks for the explanation.
> > > I'm not entirely sure how much of a problem this is in practice.
> >
> > Maybe for virtual devices (tunnels, bonding, ...), it would make sense
> > to delay the orphaning up to the real device.
>
> That was my initial thought. Could you give me some guidance
> on how that might be done so I can try and make a patch to test?
>
> > But if the socket send buffer is very large, it would defeat the flow
> > control anyway...
>
> I'm primarily concerned about a situation where
> UDP packets are sent as fast as possible, indefinitely.
> And in that scenario, I think it would need to be a rather large buffer.
>
Please try the following patch, thanks.
 drivers/net/bonding/bond_main.c |    1 +
 include/linux/if.h              |    3 +++
 net/core/dev.c                  |    5 +++--
 3 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index bdb68a6..325931e 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -4714,6 +4714,7 @@ static void bond_setup(struct net_device *bond_dev)
 	bond_dev->flags |= IFF_MASTER|IFF_MULTICAST;
 	bond_dev->priv_flags |= IFF_BONDING;
 	bond_dev->priv_flags &= ~IFF_XMIT_DST_RELEASE;
+	bond_dev->priv_flags &= ~IFF_EARLY_ORPHAN;
 
 	if (bond->params.arp_interval)
 		bond_dev->priv_flags |= IFF_MASTER_ARPMON;
diff --git a/include/linux/if.h b/include/linux/if.h
index 1239599..7499a99 100644
--- a/include/linux/if.h
+++ b/include/linux/if.h
@@ -77,6 +77,9 @@
 #define IFF_BRIDGE_PORT	0x8000		/* device used as bridge port */
 #define IFF_OVS_DATAPATH	0x10000		/* device used as Open vSwitch
 					 * datapath port */
+#define IFF_EARLY_ORPHAN	0x20000		/* early orphan skbs in
+					 * dev_hard_start_xmit()
+					 */
 
 #define IF_GET_IFACE	0x0001		/* for querying only */
 #define IF_GET_PROTO	0x0002
diff --git a/net/core/dev.c b/net/core/dev.c
index 35dfb83..eabf94d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2005,7 +2005,8 @@ int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
 		if (dev->priv_flags & IFF_XMIT_DST_RELEASE)
 			skb_dst_drop(skb);
 
-		skb_orphan_try(skb);
+		if (dev->priv_flags & IFF_EARLY_ORPHAN)
+			skb_orphan_try(skb);
 
 		if (vlan_tx_tag_present(skb) &&
 		    !(dev->features & NETIF_F_HW_VLAN_TX)) {
@@ -5590,7 +5591,7 @@ struct net_device *alloc_netdev_mq(int sizeof_priv, const char *name,
 	INIT_LIST_HEAD(&dev->napi_list);
 	INIT_LIST_HEAD(&dev->unreg_list);
 	INIT_LIST_HEAD(&dev->link_watch_list);
-	dev->priv_flags = IFF_XMIT_DST_RELEASE;
+	dev->priv_flags = IFF_XMIT_DST_RELEASE | IFF_EARLY_ORPHAN ;
 	setup(dev);
 	strcpy(dev->name, name);
 	return dev;