netdev - Re: [PATCH net-next 00/10] korina cleanups/optimizations

Date:   Sun, 22 Oct 2017 23:57:55 +0300
From:   Roman Yeryomin <roman@...em.lv>
To:     David Miller <davem@...emloft.net>
Cc:     f.fainelli@...il.com, netdev@...r.kernel.org
Subject: Re: [PATCH net-next 00/10] korina cleanups/optimizations

On 2017-10-16 00:05, David Miller wrote:
> From: Roman Yeryomin <roman@...em.lv>
> Date: Sun, 15 Oct 2017 19:46:02 +0300
> 
>> On 2017-10-15 19:38, Florian Fainelli wrote:
>>> On October 15, 2017 9:22:26 AM PDT, Roman Yeryomin <roman@...em.lv>
>>> wrote:
>>>> TX optimizations have led to ~15% performance increase (35->40Mbps)
>>>> in local tx usecase (tested with iperf v3.2).
>>> Could you avoid empty commit messages and write a paragraph or two for
>>> each commit that explains what and why are you changing? The changes
>>> look fine but they lack any explanation.
>> 
>> I thought that short descriptions are already self explanatory and
>> just didn't know what to write more.
> 
> "Optimize TX handlers."
> 
> In what way?  Why?  How are things improved?  Is it measurable?
> etc.

OK, got the idea.
However, I think I would need some help with measuring the performance
difference reliably.
On this CPU, iperf3 tx takes most of the time (around 80-90%), so even
well-optimized changes will be hard to see with iperf3 alone.
I've tried using the pktgen module. Although it shows much better numbers
than iperf3 (~95Mbps vs. 40Mbps), the results don't look very
stable/reliable: pps may easily differ by 10-15% between runs.
As for perf, I have limited experience with it, but if I understand
correctly, this CPU supports neither the cycles nor the instructions
counters. So I'm not sure perf would be useful here.

  Performance counter stats for 'system wide':

       10387.717082      cpu-clock (msec)          #    1.000 CPUs utilized
               2941      context-switches          #    0.283 K/sec
                  0      cpu-migrations            #    0.000 K/sec
                 60      page-faults               #    0.006 K/sec
    <not supported>      cycles
    <not supported>      instructions
    <not supported>      branches
    <not supported>      branch-misses

       10.388087500 seconds time elapsed
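
One possible way to see through the 10-15% run-to-run pktgen variation
mentioned above is to repeat each measurement several times and compare
the means against the spread. The following is only a minimal sketch:
the pps values are placeholders, not measurements from this hardware,
and the helper names summarize() and welch_t() are made up for
illustration.

```python
import statistics


def summarize(samples):
    """Return (mean, sample stdev, coefficient of variation in percent)."""
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)
    return mean, stdev, 100.0 * stdev / mean


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances).

    Positive when the mean of b is larger than the mean of a.
    """
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    return (statistics.mean(b) - statistics.mean(a)) / (va + vb) ** 0.5


if __name__ == "__main__":
    # Placeholder pps numbers for illustration only (hypothetical).
    before = [10000, 11200, 9800, 10900, 10300]
    after = [11000, 12100, 10700, 11900, 11400]

    for name, runs in (("before", before), ("after", after)):
        mean, stdev, cv = summarize(runs)
        print(f"{name}: mean={mean:.0f} pps stdev={stdev:.0f} cv={cv:.1f}%")

    print(f"Welch t = {welch_t(before, after):.2f}")
```

With only a handful of runs per side, a |t| well above 2 would suggest
the difference is larger than the noise; the exact threshold depends on
the number of runs, so more runs give a more trustworthy answer.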


What are the suggestions?


Regards,
Roman