Date:	Thu, 6 Feb 2014 14:40:17 -0800
From:	Tom Herbert <therbert@...gle.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	David Miller <davem@...emloft.net>,
	Linux Netdev List <netdev@...r.kernel.org>,
	Eric Dumazet <edumazet@...gle.com>
Subject: Re: [RFC PATCH] udp4: Don't take socket reference in receive path

On Thu, Feb 6, 2014 at 12:58 PM, Eric Dumazet <eric.dumazet@...il.com> wrote:
> On Thu, 2014-02-06 at 11:58 -0800, Tom Herbert wrote:
>> The reference counting in the UDP receive path is quite expensive for
>> a socket that is shared amongst CPUs. This is probably true for normal
>> sockets, but really is painful when just using the socket for
>> receive encapsulation.
>>
>> udp4_lib_lookup always takes a socket reference, and we also put back
>> the reference after calling udp_queue_rcv_skb in the normal receive
>> path, so the need for taking the reference seems to be to hold the
>> socket after doing rcu_read_unlock. This patch modifies udp_lib_lookup
>> to optionally take a reference and is always called with rcu_read_lock.
>> In udp4_lib_rcv we call lib_lookup and udp_queue_rcv under the
>> rcu_read_lock but without having taken the reference.
>>
>> Requesting comments because I suspect there are nuances to this!
>>
>> Signed-off-by: Tom Herbert <therbert@...gle.com>
>> ---
>
> Unfortunately this can't work.
>
> When I did the RCU implementation for TCP/UDP, we chose to use
> SLAB_DESTROY_BY_RCU.
>
> This meant we have to take a reference, then check again the keys for
> the lookup.
>
> If we remove SLAB_DESTROY_BY_RCU, we kill performance for short lived
> sessions, because of call_rcu() added latencies.
>
Thanks for the explanation.
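
The rule described above (take a reference, then check the lookup keys
again) can be sketched in userspace C11. This is a toy model under stated
assumptions, not the kernel code: struct toy_sock, toy_get_unless_zero(),
toy_put() and toy_lookup() are hypothetical names, a flat array stands in
for the UDP hash table, and a mismatch after pinning simply moves on to
the next slot rather than restarting the lookup.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a socket; NOT the kernel's struct sock. */
struct toy_sock {
    atomic_int refcnt;      /* 0 means the slot is free or being freed */
    _Atomic uint16_t port;  /* lookup key; changes if the slot is recycled */
};

/* Take a reference only if the object is still live (refcnt != 0). */
static bool toy_get_unless_zero(struct toy_sock *sk)
{
    int old = atomic_load(&sk->refcnt);

    while (old != 0) {
        /* On failure, 'old' is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&sk->refcnt, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference taken by toy_get_unless_zero(). */
static void toy_put(struct toy_sock *sk)
{
    atomic_fetch_sub(&sk->refcnt, 1);
}

/*
 * Find a socket by port. Because a slot may be freed and reused for a
 * different socket while we are looking at it, a key match alone proves
 * nothing: pin the object first, then verify the key a second time.
 */
static struct toy_sock *toy_lookup(struct toy_sock *table, size_t n,
                                   uint16_t port)
{
    for (size_t i = 0; i < n; i++) {
        struct toy_sock *sk = &table[i];

        if (atomic_load(&sk->port) != port)
            continue;               /* first (unpinned) key check */
        if (!toy_get_unless_zero(sk))
            continue;               /* object is free or being freed */
        if (atomic_load(&sk->port) != port) {
            toy_put(sk);            /* slot was recycled under us */
            continue;
        }
        return sk;                  /* pinned and verified */
    }
    return NULL;
}
```

The caller owns one reference on the returned object and must release it
with toy_put(). Skipping the reference removes the only thing that makes
the second key check meaningful, which is why the lookup cannot simply
rely on rcu_read_lock when objects may be reused immediately.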

> (One UDP socket is about 1024 bytes in memory, call_rcu() grace period
> is throwing away 1024 bytes from cpu caches)
>
> Sure, in your case you know your udp sessions are not short lived,
> but many applications use UDP for DNS lookups, using few packets per
> socket.
>
The rationale for SLAB_DESTROY_BY_RCU might be different for UDP than
for TCP. For instance, in the DNS example, small connected UDP flows are
more of an issue on the client; the server (which is likely to have a
much greater load) should be using unconnected sockets.
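
To make the connected versus unconnected distinction concrete, here is a
minimal userspace sketch over loopback. The helper names
(udp_server_socket, udp_client_socket, udp_echo_once) are made up for
illustration; only standard BSD socket calls are used. The server keeps a
single unconnected socket and learns each peer from recvfrom(), while each
client creates its own short-lived connected socket.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Server style: one unconnected socket bound to 127.0.0.1, ephemeral port.
 * On success the bound address is written to *addr. */
static int udp_server_socket(struct sockaddr_in *addr)
{
    socklen_t len = sizeof(*addr);
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port = 0;  /* let the kernel pick a port */
    if (bind(fd, (struct sockaddr *)addr, sizeof(*addr)) < 0 ||
        getsockname(fd, (struct sockaddr *)addr, &len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Client style: a socket per flow, connect()ed to the server address. */
static int udp_client_socket(const struct sockaddr_in *server)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    if (connect(fd, (const struct sockaddr *)server, sizeof(*server)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Receive one datagram from any peer and echo it back to that peer. */
static ssize_t udp_echo_once(int server_fd)
{
    char buf[512];
    struct sockaddr_in peer;
    socklen_t plen = sizeof(peer);
    ssize_t n = recvfrom(server_fd, buf, sizeof(buf), 0,
                         (struct sockaddr *)&peer, &plen);

    if (n < 0)
        return -1;
    return sendto(server_fd, buf, (size_t)n, 0,
                  (struct sockaddr *)&peer, plen);
}
```

In this arrangement socket creation and teardown happen once per flow on
the client only; the server socket lives as long as the service does.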

In any case, I am still looking for a way to address this. As I said
in the commit log, this per-packet cost for UDP processing is far too
high, at least in the encapsulation path. I thought about extending
SO_REUSEPORT to provide CPU affinity, but that seems like overkill
with its own performance implications. Alternatively, we could have a
fast path for encapsulation using the UDP offload model, which bypasses
sockets completely, but that seems unpleasant.
