Message-ID: <1353810665.2590.4774.camel@edumazet-glaptop>
Date:	Sat, 24 Nov 2012 18:31:05 -0800
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Jesper Dangaard Brouer <brouer@...hat.com>
Cc:	"David S. Miller" <davem@...emloft.net>,
	Florian Westphal <fw@...len.de>, netdev@...r.kernel.org,
	Pablo Neira Ayuso <pablo@...filter.org>,
	Thomas Graf <tgraf@...g.ch>, Cong Wang <amwang@...hat.com>,
	Patrick McHardy <kaber@...sh.net>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Herbert Xu <herbert@...dor.hengli.com.au>
Subject: Re: [RFC net-next PATCH V1 0/9] net: fragmentation performance
 scalability on NUMA/SMP systems

On Fri, 2012-11-23 at 14:08 +0100, Jesper Dangaard Brouer wrote:
> This patchset implements significant performance improvements for
> fragmentation handling in the kernel, with a focus on NUMA and SMP
> based systems.
> 
> Review:
> 
>  Please review these patches.  I have purposely added comments in the
>  code using the "//" comment style.  These comments are to be removed
>  before applying.  They serve as questions to you, the reviewer.
> 
> The fragmentation code today:
> 
>  The fragmentation code "protects" kernel resources by implementing
>  some memory resource limitation code.  This is centered around a
>  global readers-writer lock, and (per network namespace) an atomic mem
>  counter and an LRU (Least-Recently-Used) list.  (Separate global
>  variables and namespace resources are kept for IPv4, IPv6 and
>  Netfilter reassembly, though.)
> 
>  The code tries to keep the memory usage between a high and low
>  threshold (see: /proc/sys/net/ipv4/ipfrag_{high,low}_thresh).  The
>  "evictor" code cleans up fragments when the high threshold is
>  exceeded, and stops only when the low threshold is reached.
> 
> The scalability problem:
> 
>  Having a global/central variable for a resource limit is obviously a
>  scalability issue on SMP systems, and is amplified further on a
>  NUMA-based system.
> 
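
For readers following along, the high/low threshold behaviour described in
the quoted text can be sketched roughly as below. This is a toy model, not
the kernel code: the class name FragQueues, its methods, and the choice to
run the evictor right after accounting a new fragment are all assumptions
made for illustration.

```python
from collections import OrderedDict


class FragQueues:
    """Toy model of one shared memory counter plus one LRU list (hypothetical)."""

    def __init__(self, high_thresh, low_thresh):
        self.high_thresh = high_thresh
        self.low_thresh = low_thresh
        self.mem = 0               # the single shared counter
        self.lru = OrderedDict()   # queue key -> bytes held, oldest first

    def add_fragment(self, key, size):
        # Re-inserting the key moves the queue to the most-recently-used end.
        self.lru[key] = self.lru.pop(key, 0) + size
        self.mem += size
        if self.mem > self.high_thresh:
            self.evict()

    def evict(self):
        # Drop the oldest queues until usage is back at or below the low mark.
        evicted = []
        while self.mem > self.low_thresh and self.lru:
            key, size = self.lru.popitem(last=False)
            self.mem -= size
            evicted.append(key)
        return evicted
```

Every call to add_fragment() touches the same mem counter and the same LRU
list, which is the shared state the cover letter identifies as the
scalability concern.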


But... what practical workload even uses fragments?

Sure, netperf -t UDP_STREAM uses frags, but it's a benchmark.

The only heavy user was NFS in the days it was using UDP, a very long
time ago.

A single lost fragment means the whole packet is lost.
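
To put a number on that, here is a small back-of-the-envelope sketch. It
assumes fragments are lost independently with the same probability, a
20-byte IPv4 header without options, and an 8-byte UDP header; the function
names are made up for this example.

```python
import math


def fragments_needed(udp_payload: int, mtu: int = 1500) -> int:
    """Number of IPv4 fragments for one UDP datagram (no IP options assumed)."""
    # Fragment payloads must be a multiple of 8 bytes (except the last one).
    per_fragment = (mtu - 20) // 8 * 8
    return math.ceil((udp_payload + 8) / per_fragment)


def datagram_loss(fragment_loss: float, fragments: int) -> float:
    """Probability the datagram is lost if any one fragment is lost."""
    return 1.0 - (1.0 - fragment_loss) ** fragments
```

Under these assumptions an 8192-byte UDP payload over a 1500-byte MTU needs
6 fragments, so a 1% per-fragment loss rate turns into roughly a 5.9% loss
rate for whole datagrams.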

Another problem with fragments is the lack of 4-tuple hashing, as only
the first frag contains the dst/src ports.
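
A minimal sketch of why that is, assuming plain IPv4 with TCP or UDP on
top: the ports live in the L4 header, which only the fragment with offset 0
carries. The helper names flow_key() and ipv4_header() are hypothetical and
exist only for this illustration.

```python
import socket
import struct

IPV4_FMT = "!BBHHHBBH4s4s"  # fixed 20-byte IPv4 header


def ipv4_header(src, dst, proto, frag_offset_units=0, more_fragments=False):
    """Build a bare IPv4 header (checksum and total length left at 0)."""
    frag_field = (0x2000 if more_fragments else 0) | frag_offset_units
    return struct.pack(IPV4_FMT, 0x45, 0, 0, 0x1234, frag_field, 64, proto, 0,
                       socket.inet_aton(src), socket.inet_aton(dst))


def flow_key(packet: bytes):
    """Return the widest tuple that can be read from this packet alone."""
    ver_ihl, _, _, _, frag_field, _, proto, _, src, dst = struct.unpack(
        IPV4_FMT, packet[:20])
    ihl = (ver_ihl & 0x0F) * 4
    frag_offset = frag_field & 0x1FFF  # offset in 8-byte units
    key = (socket.inet_ntoa(src), socket.inet_ntoa(dst), proto)
    if frag_offset == 0 and proto in (6, 17):  # TCP or UDP, first fragment
        sport, dport = struct.unpack("!HH", packet[ihl:ihl + 4])
        return key + (sport, dport)
    # Later fragments carry no L4 header, so the ports are not available.
    return key
```

A receiver that wants every fragment of a datagram to land on the same
queue would have to use the shorter key for the first fragment as well,
which is why fragmented traffic ends up spread by addresses only.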

Also there is the sysctl_ipfrag_max_dist issue...

Hint: many NICs provide TSO (TCP segmentation offload), but none provide
UFO (UDP fragmentation offload), probably because there is no demand for it.


