[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1481097428.5535.12.camel@redhat.com>
Date: Wed, 07 Dec 2016 08:57:08 +0100
From: Paolo Abeni <pabeni@...hat.com>
To: Eric Dumazet <eric.dumazet@...il.com>
Cc: David Miller <davem@...emloft.net>,
netdev <netdev@...r.kernel.org>,
Willem de Bruijn <willemb@...gle.com>,
Tom Herbert <tom@...bertland.com>
Subject: Re: [PATCH net-next] net: sock_rps_record_flow() is for connected
sockets
On Tue, 2016-12-06 at 22:47 -0800, Eric Dumazet wrote:
> On Tue, 2016-12-06 at 19:32 -0800, Eric Dumazet wrote:
> > A follow up patch will provide a static_key (Jump Label) since most
> > hosts do not even use RFS.
>
> Speaking of static_key, it appears we now have GRO on UDP, and this
> consumes a considerable amount of cpu cycles.
>
> Turning off GRO allows me to get +20 % more packets on my single UDP
> socket. (1.2 Mpps instead of 1.0 Mpps)
I see also an improvement for single flow tests disabling GRO, but on a
smaller scale (~5% if I recall correctly).
> Surely udp_gro_receive() should be bypassed if no UDP socket has
> registered a udp_sk(sk)->gro_receive handler
>
> And/or delay the inet_add_offload(&udpv{4|6}_offload, IPPROTO_UDP); to
> the first UDP sockets setting udp_sk(sk)->gro_receive handler,
> ie udp_encap_enable() and udpv6_encap_enable()
I had some patches adding explicit static keys for udp_gro_receive, but
they were ugly and I did not get that much gain (I measured ~1-2%
skipping udp_gro_receive only). I can try to refresh them anyway.
We have some experimental patches to implement GRO for plain UDP
connected sockets, using frag_list to preserve the individual skb len,
and deliver the packet to user space individually. With that I got
~3mpps with a single queue/user space sink - before the recent udp
improvements. I would like to present these patches on netdev soon (no
sooner than next week, anyway).
Cheers,
Paolo
Powered by blists - more mailing lists