Message-ID: <20101211224018.GA2547@verge.net.au>
Date:	Sun, 12 Dec 2010 07:40:20 +0900
From:	Simon Horman <horms@...ge.net.au>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	netdev@...r.kernel.org, Ben Hutchings <bhutchings@...arflare.com>
Subject: Re: [PATCH] rfc: ethtool: early-orphan control

On Sat, Dec 11, 2010 at 06:11:20PM +0100, Eric Dumazet wrote:
> On Saturday 11 December 2010 at 09:03 +0100, Eric Dumazet wrote:
> > On Saturday 11 December 2010 at 13:24 +0900, Simon Horman wrote:
> > > On Sat, Dec 11, 2010 at 01:13:35PM +0900, Simon Horman wrote:
> > > > Early orphaning is an optimisation which avoids unnecessary cache misses by
> > > > orphaning an skb just before it is handed to a device for transmit thus
> > > > avoiding the case where the orphaning occurs on a different CPU.
> > > > 
> > > > In the case of bonded devices this has the unfortunate side-effect of
> > > > breaking down flow control allowing a socket to send UDP packets as fast as
> > > > the CPU will allow. This is particularly undesirable in virtualised
> > > > network environments.
> > > > 
> > > > This patch introduces ethtool control of early orphaning.
> > > > It remains on by default but may now be disabled on a per-interface basis.
> > > > 
> > > > I have implemented this as a generic flag.
> > > > As it seems to be the first generic flag that requires
> > > > no driver awareness I also supplied a default flag handler.
> > > > I am unsure if any aspect of this approach is acceptable.
> > > > 
> > > > I believe Eric has it in mind that some of the calls
> > > > to skb_orphan() in drivers can be removed with the addition
> > > > of this feature. I need to discuss that with him further.
> > > > 
> > > > A patch for the ethtool user-space utility accompanies this patch.
> > > 
> > > The following results were measured using KVM with virtio, without vhost net.
> > > The virtio device is bridged to a bond device which has one gigabit slave.
> > > 
> > 
> > As you know, vhost net does the orphaning, as do some NIC drivers,
> > so a UDP flood would have the same problem.
> > 
> > I wonder if this problem could not be solved in other ways.
> > 
> > 
> > We might do early orphaning only for sockets with SOCK_USE_WRITE_QUEUE
> > flag asserted. (tcp sets it)
> > 
> > Then we could also ask: why does tcp use sock_wfree() at all...
> > 
> 
> I removed skb_orphan_try() and did a quick test, with bonding or not,
> same results on a Gigabit interface.
> 
> $ netperf -C -c -4 -t UDP_STREAM -H 55.225.18.57 -- -m 1000
> UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 55.225.18.57 (55.225.18.57) port 0 AF_INET
> Socket  Message  Elapsed      Messages                   CPU      Service
> Size    Size     Time         Okay Errors   Throughput   Util     Demand
> bytes   bytes    secs            #      #   10^6bits/sec % SS     us/KB
> 
> 10000000    1000   10.00     6611385      0     5289.0     13.18    9.278 
> 1000000           10.00     1163454             930.7     4.58     6.456 
> 
> 
> As soon as 'socket size' is big enough, UDP flow control is ineffective,
> and no error is reported to the user. sendto() says all frames were properly sent.
> 
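
As a rough cross-check of the two result rows quoted above, the message
counts can be converted back into throughput and into the share of
datagrams that sendto() accepted but the receiver never saw. This is only
a sketch: it assumes the elapsed time is exactly 10.00 seconds and that
every message is 1000 bytes, as shown in the netperf output. The helper
name is made up for illustration.

```python
def mbit_per_s(messages, msg_bytes, secs):
    """Convert a message count into 10^6 bits/sec, the unit netperf reports."""
    return messages * msg_bytes * 8 / secs / 1e6

# Figures copied from the netperf output quoted above.
sent, received = 6611385, 1163454
msg_bytes, secs = 1000, 10.00

tx = mbit_per_s(sent, msg_bytes, secs)      # what the sender believes it sent
rx = mbit_per_s(received, msg_bytes, secs)  # what actually arrived
lost_fraction = 1 - received / sent         # accepted by sendto() but never received

print(f"tx {tx:.1f} Mbit/s, rx {rx:.1f} Mbit/s, lost {lost_fraction:.1%}")
```

Under those assumptions the sender-side figure comes out well above what a
gigabit link can carry, and a little over 82% of the datagrams went missing
without any error being returned.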

Yes, I've done that test too (as you suggested previously). But my thought
was that in a virtualised environment the administrator of the host can set
the socket size to be small enough and the guest can't change it.

However, I now realise that the same effect can be produced
in the guest's network stack by increasing wmem_default there.
So I'm not sure that this change is useful after all. And I've
got a worse flow control problem than I previously realised.
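
To make the two failure modes discussed in this thread concrete, here is a
toy model of send-buffer accounting. It is not kernel code and all names
and numbers in it are made up: packets are charged by payload size rather
than skb truesize, time advances in fixed ticks, and the device queue is a
simple bounded FIFO. It only illustrates that the sender is throttled when
the charge is held until transmit completion and the socket buffer is
smaller than the queue, and that either early orphaning or a large socket
buffer removes that throttling.

```python
from collections import deque

def simulate(sndbuf, early_orphan, pkt=1000, qlen=100,
             offered_per_tick=50, link_per_tick=10, ticks=1000):
    """Toy model; returns (accepted, delivered, dropped) packet counts."""
    wmem_alloc = 0   # bytes currently charged to the socket
    queue = deque()  # device queue; entry is True if the packet is still charged
    accepted = delivered = dropped = 0
    for _ in range(ticks):
        for _ in range(offered_per_tick):
            if wmem_alloc + pkt > sndbuf:
                break             # sender would block here: flow control works
            accepted += 1         # sendto() reports success to the application
            if len(queue) >= qlen:
                dropped += 1      # queue full: packet freed, sender is not told
                continue
            charged = not early_orphan  # early orphan releases the charge at hand-off
            if charged:
                wmem_alloc += pkt
            queue.append(charged)
        for _ in range(min(link_per_tick, len(queue))):
            if queue.popleft():
                wmem_alloc -= pkt  # charge released at transmit completion
            delivered += 1
    return accepted, delivered, dropped

small_late = simulate(50_000, early_orphan=False)      # throttled, no drops
small_early = simulate(50_000, early_orphan=True)      # unthrottled, drops
large_late = simulate(10_000_000, early_orphan=False)  # unthrottled, drops
```

In this model the small socket buffer without early orphaning keeps the
sender at the link rate with no drops, while the other two cases behave
identically to each other: the application is told every send succeeded
and most packets are dropped at the queue.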
