Date:	Tue, 28 Oct 2008 21:42:00 +0100
From:	Eric Dumazet <dada1@...mosbay.com>
To:	David Miller <davem@...emloft.net>
CC:	shemminger@...tta.com, benny+usenet@...rsen.dk, minyard@....org,
	netdev@...r.kernel.org, paulmck@...ux.vnet.ibm.com,
	Christoph Lameter <clameter@....com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Evgeniy Polyakov <johnpol@....mipt.ru>
Subject: [PATCH 2/2] udp: RCU handling for Unicast packets.

RCUification of UDP hash tables

Goals are :

1) Optimizing handling of incoming Unicast UDP frames, so that no memory
  writes happen in the fast path. Using an array of rwlocks (one per
  slot, for example) is not an option in this regard, because taking
  even a read lock writes to the lock word.

  Note: Multicasts and broadcasts will still need to take a lock,
  because doing a full lockless lookup in this case is difficult.

2) No expensive operations in the socket bind/unhash phases :
   - No expensive synchronize_rcu() calls.

   - No added rcu_head in the socket structure. It would increase memory
   needs and, more importantly, force us to use call_rcu(), which has
   the bad property of making socket structures cold: the RCU grace
   period between a socket being freed and its potential reuse leaves
   the socket cold in the CPU cache.
   David did a previous patch using call_rcu() and noticed a 20%
   impact on TCP connection rates.
   Quoting Christoph Lameter :
    "Right. That results in cacheline cooldown. You'd want to recycle
     the object as they are cache hot on a per cpu basis. That is screwed
     up by the delayed regular rcu processing. We have seen multiple
     regressions due to cacheline cooldown.
     The only choice in cacheline hot sensitive areas is to deal with the
     complexity that comes with SLAB_DESTROY_BY_RCU or give up on RCU."

   - Because UDP sockets are allocated from a dedicated kmem_cache,
   use of SLAB_DESTROY_BY_RCU can help here.
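
The recycling argument above can be illustrated with a small user-space
sketch. It is not kernel code and not part of this patch: the names
pool_obj, pool_init, pool_alloc and pool_free are hypothetical, and the
static arena merely stands in for a dedicated cache whose memory always
holds objects of one type. It only shows that a LIFO free list hands the
most recently freed (hence most likely cache-hot) object straight back,
with no grace period in between.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 8

/* Hypothetical stand-in for a socket allocated from a dedicated cache. */
struct pool_obj {
	struct pool_obj *next_free;	/* link used only while on the free list */
	unsigned int generation;	/* bumped every time the object is handed out */
};

/* The arena is never released, so a stale pointer always points at a
 * pool_obj (possibly a recycled one), never at unrelated memory. */
static struct pool_obj arena[POOL_SIZE];
static struct pool_obj *free_list;

void pool_init(void)
{
	size_t i;

	free_list = NULL;
	for (i = 0; i < POOL_SIZE; i++) {
		arena[i].generation = 0;
		arena[i].next_free = free_list;
		free_list = &arena[i];
	}
}

/* Pop the head of the free list: the most recently freed object. */
struct pool_obj *pool_alloc(void)
{
	struct pool_obj *obj = free_list;

	if (obj != NULL) {
		free_list = obj->next_free;
		obj->generation++;
	}
	return obj;
}

/* Push the object back immediately; nothing is deferred. */
void pool_free(struct pool_obj *obj)
{
	assert(obj != NULL);
	obj->next_free = free_list;
	free_list = obj;
}
```

In this sketch, freeing an object and allocating again returns the same
address with a higher generation count. That immediate reuse is what a
reader holding a stale pointer has to cope with.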

Theory of operation :
---------------------

As the lookup is lockfree (using rcu_read_lock()/rcu_read_unlock()),
special care must be taken by both readers and writers.

Use of SLAB_DESTROY_BY_RCU is tricky too, because a socket can be freed,
reused, and inserted in a different chain (or, in the worst case, in the
same chain) while readers are doing lookups at the same time.

In order to avoid loops, a reader must check that each socket found in a
chain really belongs to the chain the reader was traversing. If it finds a
mismatch, the lookup must start again at the beginning. This *restart* loop
is the reason we had to use rdlock for the multicast case, because
we don't want to send the same message several times to the same socket.
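
The chain check and restart can be sketched in plain user-space C. This
is a single-threaded simulation, not the code from the attached patch:
fake_sock, chain_insert, chain_unhash, chain_lookup and demo_restart are
hypothetical names, and the "writer" is simulated by recycling one
chosen object onto another chain while the reader is standing on it.
There is no real RCU or memory ordering here; only the control flow is
shown.

```c
#include <assert.h>
#include <stddef.h>

#define NSLOTS 4

/* Hypothetical stand-in for a UDP socket on a hash chain. */
struct fake_sock {
	unsigned int port;
	unsigned int slot;		/* chain this object is currently hashed on */
	struct fake_sock *next;
};

static struct fake_sock *table[NSLOTS];

static unsigned int slot_of(unsigned int port)
{
	return port % NSLOTS;
}

void chain_insert(struct fake_sock *sk, unsigned int port)
{
	assert(sk != NULL);
	sk->port = port;
	sk->slot = slot_of(port);
	sk->next = table[sk->slot];
	table[sk->slot] = sk;
}

/* Unlink sk from its chain; sk->next is left alone, as a reader may be on sk. */
void chain_unhash(struct fake_sock *sk)
{
	struct fake_sock **pp = &table[sk->slot];

	while (*pp != NULL && *pp != sk)
		pp = &(*pp)->next;
	if (*pp == sk)
		*pp = sk->next;
}

/*
 * Look up 'port'. If 'victim' is non-NULL, it is recycled under 'new_port'
 * right after the reader has examined it, simulating a concurrent writer.
 * '*restarts' counts how often the reader had to start over.
 */
struct fake_sock *chain_lookup(unsigned int port, struct fake_sock *victim,
			       unsigned int new_port, int *restarts)
{
	unsigned int slot = slot_of(port);
	struct fake_sock *sk;

	*restarts = 0;
begin:
	for (sk = table[slot]; sk != NULL; sk = sk->next) {
		if (sk->slot != slot) {
			/* We wandered onto another chain: start again. */
			++*restarts;
			goto begin;
		}
		if (sk->port == port)
			return sk;
		if (sk == victim) {
			chain_unhash(sk);
			chain_insert(sk, new_port);
			victim = NULL;	/* the simulated writer acts only once */
		}
	}
	return NULL;
}

/* Scenario: chain 0 is b -> a, chain 2 is c; b is moved to chain 2 mid-walk. */
int demo_restart(void)
{
	static struct fake_sock a, b, c;
	struct fake_sock *found;
	unsigned int i;
	int restarts;

	for (i = 0; i < NSLOTS; i++)
		table[i] = NULL;
	chain_insert(&a, 8);
	chain_insert(&b, 4);
	chain_insert(&c, 6);
	found = chain_lookup(8, &b, 10, &restarts);
	return found == &a ? restarts : -1;
}
```

In the demo scenario the reader steps from the recycled object onto a
socket of chain 2, notices the slot mismatch, restarts once, and then
finds the socket bound to port 8. The sketch detects the move only
because the reader visits a socket of the foreign chain; it makes no
attempt to handle a walk that ends on the wrong chain without meeting
one.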

We use RCU only for the fast path. Thus, /proc/net/udp still takes rdlocks.

Signed-off-by: Eric Dumazet <dada1@...mosbay.com>
---
 include/net/sock.h |   37 ++++++++++++++++++++++++++++++++++++-
 net/core/sock.c    |    3 ++-
 net/ipv4/udp.c     |   35 ++++++++++++++++++++++++++---------
 net/ipv4/udplite.c |    1 +
 net/ipv6/udp.c     |   25 ++++++++++++++++++-------
 net/ipv6/udplite.c |    1 +
 6 files changed, 84 insertions(+), 18 deletions(-)

View attachment "PATCH_UDP.2" of type "text/plain" (7212 bytes)
