Message-ID: <472DAD90.4050709@cosmosbay.com>
Date: Sun, 04 Nov 2007 12:31:28 +0100
From: Eric Dumazet <dada1@...mosbay.com>
To: David Miller <davem@...emloft.net>
CC: ak@...e.de, netdev@...r.kernel.org, acme@...hat.com,
Jarek Poplawski <jarkao2@...pl>
Subject: Re: [PATCH] INET : removes per bucket rwlock in tcp/dccp ehash table
David Miller wrote:
> From: Andi Kleen <ak@...e.de>
> Date: Sun, 4 Nov 2007 00:18:14 +0100
>
>> On Thursday 01 November 2007 11:16:20 Eric Dumazet wrote:
>>
>> Some quick comments:
>>
>>> +#if defined(CONFIG_SMP) || defined(CONFIG_PROVE_LOCKING)
>>> +/*
>>> + * Instead of using one rwlock for each inet_ehash_bucket, we use a table of locks
>>> + * The size of this table is a power of two and depends on the number of CPUS.
>>> + */
>> This shouldn't be hard-coded based on NR_CPUS, but be done at runtime
>> based on num_possible_cpus(). This is better for kernels with a large
>> NR_CPUS but which typically run on much smaller systems (like
>> distribution kernels).
>
> I think this is a good idea. Eric, could you make this change?
Yes, of course, since using a non-constant value for masking is cheap.
But I suspect distribution kernels enable CONFIG_HOTPLUG_CPU, so
num_possible_cpus() will be NR_CPUS.
>
>> Also the EHASH_LOCK_SZ == 0 special case is a little strange. Why did
>> you add that?
>
> He explained this in another reply, because ifdefs are ugly.
This will vanish if done at runtime anyway.
>
>> And as an unrelated note, have you tried converting the rwlocks
>> into normal spinlocks? Spinlocks should be somewhat cheaper
>> because they have less cache protocol overhead, and with
>> the huge thash tables in Linux the chain walks should be short
>> anyway, so not doing this in parallel is probably not a big issue.
>> At some point I also had a crazy idea of using a special locking
>> scheme that special-cases the common case that a hash chain
>> has only one member and doesn't take a lock for that at all.
>
> I agree.
>
> There was movement at one point to get rid of all rwlock's in the
> kernel, I personally think they are pointless. Any use that makes
> "sense" is a case where the code should be rewritten to decrease the
> lock hold time or convert to RCU.
>
I agree too; rwlocks are more expensive when contention is low, so let's do
this rwlock->spinlock change in a next step (separate patch), because it
also means changing lhash_lock.
Thanks to Jarek, I added lock cleanup in dccp_fini().
[PATCH] INET : removes per bucket rwlock in tcp/dccp ehash table
As done two years ago on the IP route cache table (commit
22c047ccbc68fa8f3fa57f0e8f906479a062c426), we can avoid using one lock per
hash bucket for the huge TCP/DCCP hash tables.
On a typical x86_64 platform, this saves about 2MB or 4MB of RAM, for little
performance difference. (We hit a different cache line for the rwlock, but
then the bucket cache line has a better sharing factor among cpus, since we
dirty it less often.) For netstat or ss commands that want a full scan of the
hash table, we perform fewer memory accesses.
Using a 'small' table of hashed rwlocks should be more than enough to provide
correct SMP concurrency between different buckets, without using too much
memory. Sizing of this table depends on num_possible_cpus() and various CONFIG
settings.
This patch provides some locking abstraction that may ease a future work using
a different model for TCP/DCCP table.
Signed-off-by: Eric Dumazet <dada1@...mosbay.com>
Acked-by: Arnaldo Carvalho de Melo <acme@...hat.com>
include/net/inet_hashtables.h | 71 +++++++++++++++++++++++++++++---
net/dccp/proto.c | 9 +++-
net/ipv4/inet_diag.c | 9 ++--
net/ipv4/inet_hashtables.c | 7 +--
net/ipv4/inet_timewait_sock.c | 13 +++--
net/ipv4/tcp.c | 4 -
net/ipv4/tcp_ipv4.c | 11 ++--
net/ipv6/inet6_hashtables.c | 19 ++++----
8 files changed, 106 insertions(+), 37 deletions(-)
View attachment "tcp_ehash_locks.patch" of type "text/plain" (14143 bytes)