lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1481097428.5535.12.camel@redhat.com>
Date:   Wed, 07 Dec 2016 08:57:08 +0100
From:   Paolo Abeni <pabeni@...hat.com>
To:     Eric Dumazet <eric.dumazet@...il.com>
Cc:     David Miller <davem@...emloft.net>,
        netdev <netdev@...r.kernel.org>,
        Willem de Bruijn <willemb@...gle.com>,
        Tom Herbert <tom@...bertland.com>
Subject: Re: [PATCH net-next] net: sock_rps_record_flow() is for connected
 sockets

On Tue, 2016-12-06 at 22:47 -0800, Eric Dumazet wrote:
> On Tue, 2016-12-06 at 19:32 -0800, Eric Dumazet wrote:
> > A follow up patch will provide a static_key (Jump Label) since most
> > hosts do not even use RFS.
> 
> Speaking of static_key, it appears we now have GRO on UDP, and this
> consumes a considerable amount of cpu cycles.
> 
> Turning off GRO allows me to get +20 % more packets on my single UDP
> socket. (1.2 Mpps instead of 1.0 Mpps)

I see also an improvement for single flow tests disabling GRO, but on a
smaller scale (~5% if I recall correctly).

> Surely udp_gro_receive() should be bypassed if no UDP socket has
> registered a udp_sk(sk)->gro_receive handler 
> 
> And/or delay the inet_add_offload(&udpv{4|6}_offload, IPPROTO_UDP); to
> the first UDP sockets setting udp_sk(sk)->gro_receive handler,
> ie udp_encap_enable() and udpv6_encap_enable()

I had some patches adding explicit static keys for udp_gro_receive, but
they were ugly and I did not get that much gain (I measured ~1-2%
skipping udp_gro_receive only). I can try to refresh them anyway.

We have some experimental patches to implement GRO for plain UDP
connected sockets, using frag_list to preserve the individual skb len,
and deliver the packet to user space individually. With that I got
~3mpps with a single queue/user space sink - before the recent udp
improvements. I would like to present these patches on netdev soon (no
sooner than next week, anyway).

Cheers,

Paolo

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ