Date:   Mon, 14 Aug 2017 18:19:57 +0200
From:   Jesper Dangaard Brouer <brouer@...hat.com>
To:     Paweł Staszewski <pstaszewski@...are.pl>
Cc:     Linux Kernel Network Developers <netdev@...r.kernel.org>,
        brouer@...hat.com, Alexander Duyck <alexander.duyck@...il.com>
Subject: Re: Kernel 4.13.0-rc4-next-20170811 - IP Routing / Forwarding
 performance vs Core/RSS number / HT on


On Sun, 13 Aug 2017 18:58:58 +0200 Paweł Staszewski <pstaszewski@...are.pl> wrote:

> To show the difference, below is a comparison of vlan/no-vlan traffic:
> 
> 10Mpps forwarded traffic with no-vlan vs 6.9Mpps with vlan

I'm trying to reproduce this in my testlab (with ixgbe).  I do see a
performance reduction of about 10-19% when I forward out a VLAN
interface.  This is larger than I expected, but still lower than the
30-40% slowdown you reported.

[...]

> >>> perf top:
> >>>
> >>>    PerfTop:   77835 irqs/sec  kernel:99.7%  
> >>> ---------------------------------------------
> >>>
> >>>       16.32%  [kernel]       [k] skb_dst_force
> >>>       16.30%  [kernel]       [k] dst_release
> >>>       15.11%  [kernel]       [k] rt_cache_valid
> >>>       12.62%  [kernel]       [k] ipv4_mtu  
> >> It seems a little strange that these 4 functions are on the top  

I don't see these in my test.

> >>  
> >>>        5.60%  [kernel]       [k] do_raw_spin_lock  
> >> Who is calling/taking this lock? (Use perf call-graph recording).  
> > can be hard to paste it here:)
> > attached file

The attachment was very big.  Please don't attach such big files on
mailing lists.  Next time please share them via e.g. pastebin.  The
output was a capture from your terminal, which made it more difficult
to read.  Hint: You can use 'perf report --stdio' and redirect the
output to a file instead.

The output (extracted below) didn't show who called 'do_raw_spin_lock',
BUT it showed another interesting thing.  The kernel code in
__dev_queue_xmit() might create a route dst-cache problem for itself(?),
as it will first call skb_dst_force() and then skb_dst_drop() when the
packet is transmitted on a VLAN.

 static int __dev_queue_xmit(struct sk_buff *skb, void *accel_priv)
 {
 [...]
	/* If device/qdisc don't need skb->dst, release it right now while
	 * its hot in this cpu cache.
	 */
	if (dev->priv_flags & IFF_XMIT_DST_RELEASE)
		skb_dst_drop(skb);
	else
		skb_dst_force(skb);


-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

Extracted part of attached perf output:

 --5.37%--ip_rcv_finish
   |          
   |--4.02%--ip_forward
   |   |          
   |    --3.92%--ip_forward_finish
   |       |          
   |        --3.91%--ip_output
   |          |          
   |           --3.90%--ip_finish_output
   |              |          
   |               --3.88%--ip_finish_output2
   |                  |          
   |                   --2.77%--neigh_connected_output
   |                     |          
   |                      --2.74%--dev_queue_xmit
   |                         |          
   |                          --2.73%--__dev_queue_xmit
   |                             |          
   |                             |--1.66%--dev_hard_start_xmit
   |                             |   |          
   |                             |    --1.64%--vlan_dev_hard_start_xmit
   |                             |       |          
   |                             |        --1.63%--dev_queue_xmit
   |                             |           |          
   |                             |            --1.62%--__dev_queue_xmit
   |                             |               |          
   |                             |               |--0.99%--skb_dst_drop.isra.77
   |                             |               |   |          
   |                             |               |   --0.99%--dst_release
   |                             |               |          
   |                             |                --0.55%--sch_direct_xmit
   |                             |          
   |                              --0.99%--skb_dst_force
   |          
    --1.29%--ip_route_input_noref
        |          
         --1.29%--ip_route_input_rcu
             |          
              --1.05%--rt_cache_valid
