netdev - Re: [net-next PATCH] net: reduce cycles spend on ICMP replies that gets rate limited


Message-ID: <20170107113133.227f3c29@redhat.com>
Date:   Sat, 7 Jan 2017 11:31:33 +0100
From:   Jesper Dangaard Brouer <brouer@...hat.com>
To:     David Miller <davem@...emloft.net>
Cc:     eric.dumazet@...il.com, netdev@...r.kernel.org, brouer@...hat.com,
        xiyou.wangcong@...il.com
Subject: Re: [net-next PATCH] net: reduce cycles spend on ICMP replies that
 gets rate limited

On Fri, 06 Jan 2017 22:10:42 -0500 (EST)
David Miller <davem@...emloft.net> wrote:

> BTW Eric, you asked about kmalloc() allocation, you were CC:'d in the
> patch which did this :-)
> 
> commit 9a99d4a50cb8ce516adf0f2436138d4c8e6e4535
> Author: Cong Wang <amwang@...hat.com>
> Date:   Sun Jun 2 15:00:52 2013 +0000
> 
>     icmp: avoid allocating large struct on stack
>     
>     struct icmp_bxm is a large struct, reduce stack usage
>     by allocating it on heap.
>     
>     Cc: Eric Dumazet <eric.dumazet@...il.com>
>     Cc: Joe Perches <joe@...ches.com>
>     Cc: David S. Miller <davem@...emloft.net>
>     Signed-off-by: Cong Wang <amwang@...hat.com>
>     Signed-off-by: David S. Miller <davem@...emloft.net>

I did a quick revert of that commit and tested again.  The kmalloc()
is not the major bottleneck, but we do save something.  The major
bottleneck is still the call to __ip_route_output_key_hash (invoked by
icmp_route_lookup).

Single-flow improvement from 1719182 pps to 1783368 pps:
 - gain of 64186 pps
 - (1/1783368-1/1719182)*10^9 = -20.93 nanosec per packet
   * at approx 4GHz: 20.93*4 = 83.72 cycles
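
For anyone wanting to redo the arithmetic above with other numbers, here is
a minimal sketch in Python.  The function names are made up for
illustration, the 4 GHz clock is the same approximation as above, and the
saving is reported as a positive number (the formula above gives the same
magnitude with a negative sign).

```python
# Sketch: convert a packets-per-second change into per-packet time and cycles.
# Function names are hypothetical; the 4.0 GHz clock rate is an approximation.

def per_packet_saving_ns(pps_before: float, pps_after: float) -> float:
    """Nanoseconds saved per packet when throughput rises from before to after."""
    return (1.0 / pps_before - 1.0 / pps_after) * 1e9

def ns_to_cycles(ns: float, ghz: float) -> float:
    """Convert nanoseconds to CPU cycles at the given clock rate in GHz."""
    return ns * ghz

saving_ns = per_packet_saving_ns(1719182, 1783368)
saving_cycles = ns_to_cycles(saving_ns, 4.0)
print(f"saving: {saving_ns:.2f} ns/packet, approx {saving_cycles:.1f} cycles")
```

Small differences in the last digit against the figures above come from
rounding versus truncation.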

The optimal SLUB fast-path on this machine costs 54 cycles(tsc), 13.557 ns,
so the saving is actually higher than expected.  But it is still low
compared to avoiding the icmp_route_lookup.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer