lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 22 Jan 2021 10:40:05 +0100
From:   Eric Dumazet <edumazet@...gle.com>
To:     Lukas Wunner <lukas@...ner.de>
Cc:     Pablo Neira Ayuso <pablo@...filter.org>,
        Jozsef Kadlecsik <kadlec@...filter.org>,
        Florian Westphal <fw@...len.de>,
        netfilter-devel@...r.kernel.org, coreteam@...filter.org,
        netdev <netdev@...r.kernel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Alexei Starovoitov <ast@...nel.org>,
        Thomas Graf <tgraf@...g.ch>,
        Laura Garcia Liebana <nevola@...il.com>,
        John Fastabend <john.fastabend@...il.com>
Subject: Re: [PATCH nf-next v4 1/5] net: sched: Micro-optimize egress handling

On Fri, Jan 22, 2021 at 9:55 AM Lukas Wunner <lukas@...ner.de> wrote:
>
> sch_handle_egress() returns either the skb or NULL to signal to its
> caller __dev_queue_xmit() whether a packet should continue to be
> processed.
>
> The skb is always non-NULL, otherwise __dev_queue_xmit() would hit a
> NULL pointer deref right at its top.
>
> But the compiler doesn't know that.  So if sch_handle_egress() signals
> success by returning the skb, the "if (!skb) goto out;" statement
> results in a gratuitous NULL pointer check in the Assembler output.
>
> Avoid by telling the compiler that __dev_queue_xmit() is never passed a
> NULL skb.  This also eliminates another gratuitous NULL pointer check in
> __dev_queue_xmit()
>   qdisc_pkt_len_init()
>     skb_header_pointer()
>       __skb_header_pointer()
>
> The speedup is barely measurable:
> Before: 1877 1875 1878 1874 1882 1873 Mb/sec
> After:  1877 1877 1880 1883 1888 1886 Mb/sec
>
> However we're about to add a netfilter egress hook to __dev_queue_xmit()
> and without the micro-optimization, it will result in a performance
> degradation which is indeed measurable:
> With netfilter hook:               1853 1852 1850 1848 1849 1851 Mb/sec
> With netfilter hook + micro-optim: 1874 1877 1881 1875 1876 1876 Mb/sec
>
> The performance degradation is caused by a JNE instruction ("if (skb)")
> being flipped to a JE instruction ("if (!skb)") once the netfilter hook
> is added.  The micro-optimization removes the test and jump instructions
> altogether.
>
> Measurements were performed on a Core i7-3615QM.  Reproducer:
> ip link add dev foo type dummy
> ip link set dev foo up
> tc qdisc add dev foo clsact
> tc filter add dev foo egress bpf da bytecode '1,6 0 0 0,'
> modprobe pktgen
> echo "add_device foo" > /proc/net/pktgen/kpktgend_3
> samples/pktgen/pktgen_bench_xmit_mode_queue_xmit.sh -i foo -n 400000000 -m "11:11:11:11:11:11" -d 1.1.1.1
>
> Signed-off-by: Lukas Wunner <lukas@...ner.de>
> Cc: John Fastabend <john.fastabend@...il.com>
> Cc: Daniel Borkmann <daniel@...earbox.net>
> Cc: Alexei Starovoitov <ast@...nel.org>
> Cc: Eric Dumazet <edumazet@...gle.com>
> Cc: Thomas Graf <tgraf@...g.ch>
> ---
>  net/core/dev.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 7afbb642e203..4c16b9932823 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4072,6 +4072,7 @@ struct netdev_queue *netdev_core_pick_tx(struct net_device *dev,
>   *      the BH enable code must have IRQs enabled so that it will not deadlock.
>   *          --BLG
>   */
> +__attribute__((nonnull(1)))
>  static int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
>  {
>         struct net_device *dev = skb->dev;
> --
> 2.29.2
>

Interesting !

It is a bit sad the compilers do not automatically get this knowledge
from the very first instruction :

 struct net_device *dev = skb->dev;

I see this also makes qdisc_pkt_len_init() slightly faster because
this removes the if (!skb) test from __skb_header_pointer()

I guess we also could add this patch to benefit all
skb_header_pointer() callers :

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 5f60c9e907c9d8eae1e85ae0329838383e3325df..db8774c50cc6ab99deaecdb1dfc31dc5306b9a37
100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3627,6 +3627,7 @@ __skb_header_pointer(const struct sk_buff *skb,
int offset,
        return buffer;
 }

+__attribute__((nonnull(1)))
 static inline void * __must_check
 skb_header_pointer(const struct sk_buff *skb, int offset, int len,
void *buffer)
 {

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ