lists.openwall.net - Open Source and information security mailing list archives
Date:   Thu, 14 Dec 2017 09:29:46 +0100
From:   Paolo Abeni <pabeni@...hat.com>
To:     David Miller <davem@...emloft.net>
Cc:     netdev@...r.kernel.org, kraig@...gle.com, edumazet@...gle.com
Subject: Re: [RFC PATCH] reuseport: compute the ehash only if needed

Hi,

On Wed, 2017-12-13 at 15:08 -0500, David Miller wrote:
> From: Paolo Abeni <pabeni@...hat.com>
> Date: Tue, 12 Dec 2017 14:09:28 +0100
> 
> > When a reuseport socket group is using a BPF filter to distribute
> > the packets among the sockets, we don't need to compute any hash
> > value, but the current reuseport_select_sock() requires the
> > caller to compute such hash in advance.
> > 
> > This patch reworks reuseport_select_sock() to compute the hash value
> > only if needed - that is, when the BPF filter is missing or fails.
> > Since different hash functions have different argument types - ipv4
> > addresses vs ipv6 ones - to avoid over-complicating the interface,
> > reuseport_select_sock() is now a macro.
> > 
> > Additionally, the sk_reuseport test is moved inside
> > reuseport_select_sock(), to avoid some code duplication.
> > 
> > Overall this gives small but measurable performance improvement
> > under UDP flood while using SO_REUSEPORT + BPF.
> > 
> > Signed-off-by: Paolo Abeni <pabeni@...hat.com>
> 
> I don't doubt that this improves the case where the hash is elided, but
> I suspect it makes things slower otherwise.
> 
> You're doing two function calls for an operation that used to require
> just one in the bottom of the call chain.
> 
> You're also putting something onto the stack that the compiler can't
> possibly optimize into purely using cpu registers to hold.

Thank you for the feedback.

I was unable to measure any performance regression for the hash-based
demultiplexing, and I think the number of function calls is unchanged
in that scenario (the vanilla kernel calls ehash() and
reuseport_select_sock(); the patched one calls __reuseport_get_info()
and ehash()).

You are right about the additional stack usage introduced by this
patch.

Overall I see we need something better than this.

Thanks,

Paolo
