Date:	Tue, 02 Nov 2010 08:30:57 +0100
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Simon Horman <horms@...ge.net.au>
Cc:	netdev@...r.kernel.org, Jay Vosburgh <fubar@...ibm.com>,
	"David S. Miller" <davem@...emloft.net>
Subject: Re: bonding: flow control regression [was Re: bridging: flow
 control regression]

On Tuesday, 02 November 2010 at 16:03 +0900, Simon Horman wrote:
> On Tue, Nov 02, 2010 at 05:53:42AM +0100, Eric Dumazet wrote:
> > On Tuesday, 02 November 2010 at 11:06 +0900, Simon Horman wrote:
> > 
> > > Thanks for the explanation.
> > > I'm not entirely sure how much of a problem this is in practice.
> > 
> > Maybe for virtual devices (tunnels, bonding, ...), it would make sense
> > to delay the orphaning up to the real device.
> 
> That was my initial thought. Could you give me some guidance
> on how that might be done so I can try and make a patch to test?
> 
> > But if the socket send buffer is very large, it would defeat the flow
> > control anyway...
> 
> I'm primarily concerned about a situation where
> UDP packets are sent as fast as possible, indefinitely.
> And in that scenario, I think it would need to be a rather large buffer.
> 

Please try the following patch, thanks.

 drivers/net/bonding/bond_main.c |    1 +
 include/linux/if.h              |    3 +++
 net/core/dev.c                  |    5 +++--
 3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index bdb68a6..325931e 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -4714,6 +4714,7 @@ static void bond_setup(struct net_device *bond_dev)
 	bond_dev->flags |= IFF_MASTER|IFF_MULTICAST;
 	bond_dev->priv_flags |= IFF_BONDING;
 	bond_dev->priv_flags &= ~IFF_XMIT_DST_RELEASE;
+	bond_dev->priv_flags &= ~IFF_EARLY_ORPHAN;
 
 	if (bond->params.arp_interval)
 		bond_dev->priv_flags |= IFF_MASTER_ARPMON;
diff --git a/include/linux/if.h b/include/linux/if.h
index 1239599..7499a99 100644
--- a/include/linux/if.h
+++ b/include/linux/if.h
@@ -77,6 +77,9 @@
 #define IFF_BRIDGE_PORT	0x8000		/* device used as bridge port */
 #define IFF_OVS_DATAPATH	0x10000	/* device used as Open vSwitch
 					 * datapath port */
+#define IFF_EARLY_ORPHAN	0x20000 /* early orphan skbs in
+					 * dev_hard_start_xmit()
+					 */
 
 #define IF_GET_IFACE	0x0001		/* for querying only */
 #define IF_GET_PROTO	0x0002
diff --git a/net/core/dev.c b/net/core/dev.c
index 35dfb83..eabf94d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2005,7 +2005,8 @@ int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
 		if (dev->priv_flags & IFF_XMIT_DST_RELEASE)
 			skb_dst_drop(skb);
 
-		skb_orphan_try(skb);
+		if (dev->priv_flags & IFF_EARLY_ORPHAN)
+			skb_orphan_try(skb);
 
 		if (vlan_tx_tag_present(skb) &&
 		    !(dev->features & NETIF_F_HW_VLAN_TX)) {
@@ -5590,7 +5591,7 @@ struct net_device *alloc_netdev_mq(int sizeof_priv, const char *name,
 	INIT_LIST_HEAD(&dev->napi_list);
 	INIT_LIST_HEAD(&dev->unreg_list);
 	INIT_LIST_HEAD(&dev->link_watch_list);
-	dev->priv_flags = IFF_XMIT_DST_RELEASE;
+	dev->priv_flags = IFF_XMIT_DST_RELEASE | IFF_EARLY_ORPHAN;
 	setup(dev);
 	strcpy(dev->name, name);
 	return dev;
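
For readers who want to see the intended effect without building a kernel, here is a minimal userspace sketch of the flag logic in the patch above. Only IFF_EARLY_ORPHAN and its value come from the patch; the structs and the model_* functions are hypothetical stand-ins, not kernel APIs, and the sketch ignores any extra conditions the real skb_orphan_try() applies.

```c
#include <assert.h>
#include <stddef.h>

/* Flag value taken from the patch; everything else is a simplified model. */
#define IFF_EARLY_ORPHAN 0x20000

/* Hypothetical stand-in for a socket: bytes currently charged to it. */
struct sock_model {
	int wmem_charged;
};

/* Hypothetical stand-in for an skb: owning socket and charged size. */
struct skb_model {
	struct sock_model *owner;
	int truesize;
};

/* Hypothetical stand-in for a net_device: only priv_flags matters here. */
struct dev_model {
	unsigned int priv_flags;
};

/* Detach the skb from its socket and return the charged bytes. */
void model_orphan(struct skb_model *skb)
{
	if (skb->owner) {
		skb->owner->wmem_charged -= skb->truesize;
		skb->owner = NULL;
	}
}

/* Models the patched check: orphan only if the device has the flag set. */
void model_xmit(struct skb_model *skb, const struct dev_model *dev)
{
	if (dev->priv_flags & IFF_EARLY_ORPHAN)
		model_orphan(skb);
}
```

In this model, passing an skb through a device whose priv_flags lacks IFF_EARLY_ORPHAN (as bond_setup() arranges for the bonding master) leaves it charged to the sending socket, so the socket send buffer still limits the sender. The charge is only released when the skb reaches a device that keeps the default flag, which is the "delay the orphaning up to the real device" behaviour discussed above.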

