Message-ID: <472A12D0.4070401@cosmosbay.com>
Date:	Thu, 01 Nov 2007 18:54:24 +0100
From:	Eric Dumazet <dada1@...mosbay.com>
To:	Stephen Hemminger <shemminger@...ux-foundation.org>
CC:	"David S. Miller" <davem@...emloft.net>,
	Linux Netdev List <netdev@...r.kernel.org>,
	Andi Kleen <ak@...e.de>,
	Arnaldo Carvalho de Melo <acme@...hat.com>
Subject: Re: [PATCH] INET : removes per bucket rwlock in tcp/dccp ehash table

Stephen Hemminger wrote:
> On Thu, 01 Nov 2007 11:16:20 +0100
> Eric Dumazet <dada1@...mosbay.com> wrote:
> 
>> As done two years ago on the IP route cache table (commit 
>> 22c047ccbc68fa8f3fa57f0e8f906479a062c426), we can avoid using one lock per 
>> hash bucket for the huge TCP/DCCP hash tables.
>>
>> On a typical x86_64 platform, this saves about 2MB or 4MB of RAM, for little 
>> performance difference. (We hit a different cache line for the rwlock, but 
>> then the bucket cache line has a better sharing factor among CPUs, since we 
>> dirty it less often.)
>>
>> Using a 'small' table of hashed rwlocks should be more than enough to provide 
>> correct SMP concurrency between different buckets, without using too much 
>> memory. Sizing of this table depends on NR_CPUS and various CONFIG settings.
>>
>> This patch provides some locking abstraction that may ease future work using 
>> a different model for the TCP/DCCP table.
>>
>> Signed-off-by: Eric Dumazet <dada1@...mosbay.com>
>>
>>   include/net/inet_hashtables.h |   40 ++++++++++++++++++++++++++++----
>>   net/dccp/proto.c              |   16 ++++++++++--
>>   net/ipv4/inet_diag.c          |    9 ++++---
>>   net/ipv4/inet_hashtables.c    |    7 +++--
>>   net/ipv4/inet_timewait_sock.c |   13 +++++-----
>>   net/ipv4/tcp.c                |   11 +++++++-
>>   net/ipv4/tcp_ipv4.c           |   11 ++++----
>>   net/ipv6/inet6_hashtables.c   |   19 ++++++++-------
>>   8 files changed, 89 insertions(+), 37 deletions(-)
>>
> 
> Long term, is there any chance of using RCU for this? Seems like
> it could be a big win.
> 

This was discussed in the past, and I believe a patch was even proposed, 
but some people (including David) objected that RCU is only well suited to 
'mostly read' structures.

On some web server workloads, the TCP hash table is constantly accessed in write 
mode (socket creation, socket move to timewait state, socket deletion...), and 
RCU added overhead and gave poor cache reuse (because sockets must be placed on 
an RCU queue before reuse).

On these typical workloads, a hash table without RCU is still the best.

Long-term changes would rather be based on Robert Olsson's suggestion from last 
year (trie-based lookups and a unified IP/TCP cache).

A short-term change would be to make the TCP hash table resizable (small at 
boot, and able to grow if necessary). Its current size on modern machines is 
just insane.
