netdev - Re: Fwd: UDP/IPv6 performance issue

Message-ID: <20131210171248.GA23216@order.stressinduktion.org>
Date:	Tue, 10 Dec 2013 18:12:48 +0100
From:	Hannes Frederic Sowa <hannes@...essinduktion.org>
To:	ajay seshadri <seshajay@...il.com>
Cc:	netdev <netdev@...r.kernel.org>
Subject: Re: Fwd: UDP/IPv6 performance issue

Hello!

On Tue, Dec 10, 2013 at 11:19:29AM -0500, ajay seshadri wrote:
> I have been testing network performance using my application and other
> third party tools like netperf on my systems that have 10G NIC Cards.
> It's a simple back to back setup with no switches in between.
> 
> I see about 15 to 20% performance degradation for UDP/IPv6 as compared
> to UDP/IPv4 for packets of size 1500.
> 
> On performing "perf top" analysis for ipv6 traffic, I identified the
> following functions as some hot functions:
> fib6_force_start_gc()

The IPv6 routing code is not as well optimized as the IPv4 code, but it is
strange to see fib6_force_start_gc() that high in perf top.

I guess you are sending the frames to a distinct destination each time? Each
send creates a cached entry in the fib, and as soon as the maximum of 4096
entries is reached, a gc is forced. This limit is tunable via
/proc/sys/net/ipv6/route/max_size.
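
A minimal sketch of reading that limit, assuming a Linux host that exposes
the sysctl directory named above. The helper name read_route_sysctl is
hypothetical, and gc_thresh is a second sysctl from the same directory that
is included only for comparison:

```python
import os

# Directory holding the IPv6 route sysctls (path taken from the text above).
ROUTE_SYSCTL_DIR = "/proc/sys/net/ipv6/route"


def read_route_sysctl(name, base=ROUTE_SYSCTL_DIR):
    """Return one IPv6 route sysctl as an int, or None if it is unreadable.

    Hypothetical helper; `base` is a parameter so that the function can be
    pointed at a different directory when testing.
    """
    try:
        with open(os.path.join(base, name)) as f:
            return int(f.read().split()[0])
    except (OSError, ValueError, IndexError):
        return None


if __name__ == "__main__":
    # Prints None for a value on systems without these files.
    for name in ("max_size", "gc_thresh"):
        print(name, "=", read_route_sysctl(name))
```

Raising the limit means writing a new value to the same file as root, or
using the equivalent net.ipv6.route.max_size sysctl key.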

> csum_partial_copy_generic()
> udp_v6_flush_pending_frames()
> dst_mtu()
> 
> csum_partial_copy_generic() shows up because my card doesn't support
> checksum offloading for ipv6 packets. In fact turning off rx / tx
> checksum offloading for ipv4 showed the same function in the "perf
> top" profile, but did not cause any performance degradation.
> 
> Now I am CPU bound on packets of size 1500 and I am not using GSO (for
> both IPv4 and IPv6). I tried twiddling with the route cache garbage
> collection timer values and tried to set the socket options to disable
> pmtu discovery and set the mtu for the socket, but it did not make any
> difference.

A cached entry will be inserted nonetheless. If you are not hitting the
max_size limit on route entries, I guess there could be a bug that triggers
needless gc invocations.
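
One way to check whether the limit is actually being approached is to
compare the counters in /proc/net/rt6_stats with max_size while the test is
running. The sketch below makes three assumptions: that the file holds
seven hexadecimal fields in the order listed in FIELD_NAMES (please verify
this against your kernel's net/ipv6/route.c), that the sixth field is the
dst entry count the limit applies to, and that a 90% margin is a reasonable
warning threshold. The function names are hypothetical.

```python
# Assumed field order of /proc/net/rt6_stats; verify against your kernel.
FIELD_NAMES = (
    "fib_nodes",
    "fib_route_nodes",
    "fib_rt_alloc",
    "fib_rt_entries",
    "fib_rt_cache",
    "dst_entries",
    "fib_discarded_routes",
)


def parse_rt6_stats(line):
    """Parse one line of seven hex counters into a dict of ints."""
    tokens = line.split()
    if len(tokens) != len(FIELD_NAMES):
        raise ValueError("expected %d fields, got %d"
                         % (len(FIELD_NAMES), len(tokens)))
    return dict(zip(FIELD_NAMES, (int(tok, 16) for tok in tokens)))


def near_limit(stats, max_size, margin=0.9):
    """True if the dst entry count is within `margin` of max_size."""
    return stats["dst_entries"] >= margin * max_size


if __name__ == "__main__":
    try:
        with open("/proc/net/rt6_stats") as f:
            stats = parse_rt6_stats(f.readline())
        with open("/proc/sys/net/ipv6/route/max_size") as f:
            max_size = int(f.read().split()[0])
        print(stats)
        print("near max_size:", near_limit(stats, max_size))
    except (OSError, ValueError, IndexError) as exc:
        print("route statistics unavailable:", exc)
```

If the entry count stays well below max_size while fib6_force_start_gc()
remains high in the profile, that would point towards the needless gc
invocations mentioned above rather than a limit that is set too low.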

> I am wondering if this is a known performance issue or can I fine tune
> the system to match UDP / IPv4 performance with UDP / IPv6? As I am
> CPU bound, the functions I identified are using up CPU cycles that i
> could probably save.

Could you send me your send pattern so that I can try to reproduce it?

Greetings,

  Hannes
