Message-ID: <5650fdb7-5b86-9c7d-c112-1ff5ee7812b7@iogearbox.net>
Date:   Tue, 29 Jan 2019 12:17:14 +0100
From:   Daniel Borkmann <daniel@...earbox.net>
To:     bjorn.topel@...il.com, intel-wired-lan@...ts.osuosl.org
Cc:     Björn Töpel <bjorn.topel@...el.com>,
        pmenzel@...gen.mpg.de, brouer@...hat.com,
        magnus.karlsson@...el.com, magnus.karlsson@...il.com,
        netdev@...r.kernel.org, alexei.starovoitov@...il.com,
        davem@...emloft.net
Subject: Re: [PATCH v2] i40e: replace switch-statement to speed-up
 retpoline-enabled builds

On 01/29/2019 10:57 AM, bjorn.topel@...il.com wrote:
> From: Björn Töpel <bjorn.topel@...el.com>
> 
> GCC will generate jump tables for switch-statements with more than 5
> case statements. An entry into the jump table is an indirect call,
> which means that for CONFIG_RETPOLINE builds, this is rather
> expensive.
> 
> This commit replaces the switch-statement that acts on the XDP program
> result with an if-clause.
> 
> The if-clause was also refactored into a common function that can be
> used by AF_XDP zero-copy and non-zero-copy code.
> 
> Performance prior to this patch:
> $ sudo ./xdp_rxq_info --dev enp134s0f0 --action XDP_DROP
> Running XDP on dev:enp134s0f0 (ifindex:7) action:XDP_DROP options:no_touch
> XDP stats       CPU     pps         issue-pps
> XDP-RX CPU      20      18983018    0
> XDP-RX CPU      total   18983018
> 
> RXQ stats       RXQ:CPU pps         issue-pps
> rx_queue_index   20:20  18983012    0
> rx_queue_index   20:sum 18983012
> 
> $ sudo ./xdpsock -i enp134s0f0 -q 20 -n 2 -z -r
>  sock0@...134s0f0:20 rxdrop
>                 pps         pkts        2.00
> rx              14,641,496  144,751,092
> tx              0           0
> 
> And after:
> $ sudo ./xdp_rxq_info --dev enp134s0f0 --action XDP_DROP
> Running XDP on dev:enp134s0f0 (ifindex:7) action:XDP_DROP options:no_touch
> XDP stats       CPU     pps         issue-pps
> XDP-RX CPU      20      24000986    0
> XDP-RX CPU      total   24000986
> 
> RXQ stats       RXQ:CPU pps         issue-pps
> rx_queue_index   20:20  24000985    0
> rx_queue_index   20:sum 24000985
> 
>   +26%
> 
> $ sudo ./xdpsock -i enp134s0f0 -q 20 -n 2 -z -r
>  sock0@...134s0f0:20 rxdrop
>                 pps         pkts        2.00
> rx              17,623,578  163,503,263
> tx              0           0
> 
>   +20%
> 
> Signed-off-by: Björn Töpel <bjorn.topel@...el.com>

Looks good. Given the performance improvements, I wonder whether it would
make sense in general to raise the threshold at which GCC starts generating
jump tables when CONFIG_RETPOLINE is enabled, as in:

diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 9c5a67d..33495a9 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -217,6 +217,8 @@ KBUILD_CFLAGS += -fno-asynchronous-unwind-tables
 # Avoid indirect branches in kernel to deal with Spectre
 ifdef CONFIG_RETPOLINE
   KBUILD_CFLAGS += $(RETPOLINE_CFLAGS)
+  # Avoid generating slow indirect jumps for small number of switch cases
+  KBUILD_CFLAGS += --param case-values-threshold=12
 endif

 archscripts: scripts_basic

That would likely bloat the kernel a bit, including in slow-path places where
it is not needed, but it would generically catch the majority of cases. I'll
run some experiments later today (in any case, that should not block this
patch).
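
For illustration only, here is a minimal userspace sketch of the kind of
transformation discussed in this thread: the same dispatch written once as a
switch and once as an if-chain, plus a check that the two agree. This is not
the i40e code. The enum values and function names (ACT_*, VERDICT_*,
verdict_switch, verdict_if, check_equivalence) are hypothetical stand-ins.
Whether a compiler emits a jump table for the switch variant depends on the
compiler version, target, and flags such as --param case-values-threshold.

```c
#include <assert.h>

/* Hypothetical stand-ins for XDP-style action codes (not the kernel enum). */
enum act { ACT_ABORTED, ACT_DROP, ACT_PASS, ACT_TX, ACT_REDIRECT };

/* Hypothetical driver-side verdicts derived from the action. */
enum verdict { VERDICT_PASS, VERDICT_TX, VERDICT_REDIR, VERDICT_CONSUMED };

/* Switch form: a compiler may lower this to a jump table, i.e. an
 * indirect branch, once the number of cases passes its threshold. */
enum verdict verdict_switch(unsigned int act)
{
	switch (act) {
	case ACT_PASS:
		return VERDICT_PASS;
	case ACT_TX:
		return VERDICT_TX;
	case ACT_REDIRECT:
		return VERDICT_REDIR;
	case ACT_ABORTED:
	case ACT_DROP:
	default:
		return VERDICT_CONSUMED;
	}
}

/* If-chain form: written as explicit compare-and-branch steps, so the
 * source itself does not ask for a table lookup. */
enum verdict verdict_if(unsigned int act)
{
	if (act == ACT_PASS)
		return VERDICT_PASS;
	if (act == ACT_TX)
		return VERDICT_TX;
	if (act == ACT_REDIRECT)
		return VERDICT_REDIR;
	/* Aborted, drop, and unknown actions all consume the buffer. */
	return VERDICT_CONSUMED;
}

/* Both forms must return the same verdict for every input, including
 * values outside the enum range. */
void check_equivalence(void)
{
	for (unsigned int act = 0; act < 16; act++)
		assert(verdict_switch(act) == verdict_if(act));
}
```

Compiling such a file to assembly with and without the parameter from the
diff above is one way to see whether a given GCC build switches between a
jump table and a compare tree for a particular switch statement.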

Cheers,
Daniel
