Message-ID: <5550E3C9.1030306@plumgrid.com>
Date:	Mon, 11 May 2015 10:15:53 -0700
From:	Alexei Starovoitov <ast@...mgrid.com>
To:	Daniel Borkmann <daniel@...earbox.net>,
	Pablo Neira Ayuso <pablo@...filter.org>
CC:	netdev@...r.kernel.org, davem@...emloft.net, jhs@...atatu.com
Subject: Re: [PATCH 2/2 net-next] net: move qdisc ingress filtering code where
 it belongs

On 5/11/15 5:58 AM, Daniel Borkmann wrote:
> diff --git a/net/core/dev.c b/net/core/dev.c
>
> -static inline struct sk_buff *handle_ing(struct sk_buff *skb,
> +static __always_inline struct sk_buff *handle_ing(struct sk_buff *skb,
>                        struct packet_type **pt_prev,
>                        int *ret, struct net_device *orig_dev)
>   {
>       struct netdev_queue *rxq = rcu_dereference(skb->dev->ingress_queue);
> +    int i = 0;
> +
> +    printk("XXX %d\n", i++);
> +    printk("XXX %d\n", i++);
   .. lots of printk...

That's an interesting test! I tried it out as well:

current baseline:
37711847pps 18101Mb/sec (18101686560bps) errors: 10000000
37776912pps 18132Mb/sec (18132917760bps) errors: 10000000
37700180pps 18096Mb/sec (18096086400bps) errors: 10000000
37730169pps 18110Mb/sec (18110481120bps) errors: 10000000

with massive printk bloating in _inlined_ handle_ing:
37744223pps 18117Mb/sec (18117227040bps) errors: 10000000
37718786pps 18105Mb/sec (18105017280bps) errors: 10000000
37742087pps 18116Mb/sec (18116201760bps) errors: 10000000
37727777pps 18109Mb/sec (18109332960bps) errors: 10000000

No performance difference, as expected, which matches what Daniel is seeing.

Then I tried marking handle_ing() 'noinline':
36818072pps 17672Mb/sec (17672674560bps) errors: 10000000
36828761pps 17677Mb/sec (17677805280bps) errors: 10000000
36840106pps 17683Mb/sec (17683250880bps) errors: 10000000
36885403pps 17704Mb/sec (17704993440bps) errors: 10000000

This drop was totally unexpected, since the static_key is supposed to
protect handle_ing().
So I started digging into the assembler output before and after.
It turned out that with handle_ing inlined, GCC can see what is
happening with the pt_prev and ret pointers and keeps both in
registers, so the asm looks like:
movl    $1, %r15d       #, ret
xorl    %r12d, %r12d    # pt_prev

When handle_ing is not inlined, both live on the stack and the asm of
netif_receive_skb has:
movl    $1, -68(%rbp)   #, ret
movq    $0, -64(%rbp)   #, pt_prev
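
For illustration only, here is a minimal userspace sketch of the same
effect. It is not the kernel code: struct pkt, struct handler,
count_len, step_* and receive_* are made-up names. The assumption is
that once the address of a local is passed to an out-of-line function,
the compiler has to give that local a stack slot, whereas with the
callee inlined it is free to keep it in a register. What actually gets
generated depends on the compiler version and flags.

```c
#include <stddef.h>

/* Hypothetical stand-ins for sk_buff and packet_type. */
struct pkt { int len; };
struct handler { int (*func)(struct pkt *p); };

/* Trivial handler used to exercise the code paths. */
int count_len(struct pkt *p)
{
	return p->len;
}

/*
 * Out-of-line variant: the caller must pass real addresses, so its
 * 'ret' and 'prev' locals need stack slots.
 */
static __attribute__((noinline)) struct pkt *
step_noinline(struct pkt *p, struct handler **prev, int *ret)
{
	if (*prev) {
		*ret = (*prev)->func(p);
		*prev = NULL;
	}
	return p;
}

/*
 * Same body, forced inline: the compiler sees every access to
 * *prev and *ret and may keep the caller's locals in registers.
 */
static inline __attribute__((always_inline)) struct pkt *
step_inline(struct pkt *p, struct handler **prev, int *ret)
{
	if (*prev) {
		*ret = (*prev)->func(p);
		*prev = NULL;
	}
	return p;
}

int receive_noinline(struct pkt *p, struct handler *h)
{
	int ret = 1;			/* address escapes below */
	struct handler *prev = h;	/* address escapes below */

	p = step_noinline(p, &prev, &ret);
	if (prev)
		ret = prev->func(p);
	return ret;
}

int receive_inline(struct pkt *p, struct handler *h)
{
	int ret = 1;
	struct handler *prev = h;

	p = step_inline(p, &prev, &ret);
	if (prev)
		ret = prev->func(p);
	return ret;
}
```

Both receive_* functions return the same values; only the generated
code differs. Compiling with something like 'gcc -O2 -S' and comparing
the two should show where ret and prev end up.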

To test it further I tried:
+static noinline struct sk_buff *handle_ing_finish(struct sk_buff *skb,
+                                                 struct tcf_proto *cl)
...
static inline struct sk_buff *handle_ing(struct sk_buff *skb,
+                                        struct packet_type **pt_prev,
+                                        int *ret, struct net_device *orig_dev)
+{
+       struct tcf_proto *cl = rcu_dereference_bh(skb->dev->ingress_cl_list);
+
+       if (!cl)
+               return skb;
+       if (*pt_prev) {
+               *ret = deliver_skb(skb, *pt_prev, orig_dev);
+               *pt_prev = NULL;
+       }
+       return handle_ing_finish(skb, cl);
+}

So the tc ingress part is not inlined, but the deliver_skb bits are.
With that, the performance went back to normal:
37701570pps 18096Mb/sec (18096753600bps) errors: 10000000
37752444pps 18121Mb/sec (18121173120bps) errors: 10000000
37719331pps 18105Mb/sec (18105278880bps) errors: 10000000

Unfortunately this last experiment hurts the ingress+u32 case,
which dropped from 25.2 Mpps to 24.5 Mpps.

I will keep digging into it. Stay tuned.
