netdev - Re: [PATCH net-next 00/10] korina cleanups/optimizations

Message-ID: <8b8853450781a3db3332e3416ce73c6b@advem.lv>
Date:   Wed, 25 Oct 2017 09:19:16 +0300
From:   Roman Yeryomin <roman@...em.lv>
To:     David Miller <davem@...emloft.net>
Cc:     f.fainelli@...il.com, netdev@...r.kernel.org
Subject: Re: [PATCH net-next 00/10] korina cleanups/optimizations

On 2017-10-22 23:57, Roman Yeryomin wrote:
> On 2017-10-16 00:05, David Miller wrote:
>> From: Roman Yeryomin <roman@...em.lv>
>> Date: Sun, 15 Oct 2017 19:46:02 +0300
>> 
>>> On 2017-10-15 19:38, Florian Fainelli wrote:
>>>> On October 15, 2017 9:22:26 AM PDT, Roman Yeryomin <roman@...em.lv>
>>>> wrote:
>>>>> TX optimizations have led to ~15% performance increase (35->40Mbps)
>>>>> in local tx usecase (tested with iperf v3.2).
>>>> Could you avoid empty commit messages and write a paragraph or two for
>>>> each commit that explains what and why are you changing? The changes
>>>> look fine but they lack any explanation.
>>> 
>>> I thought that short descriptions are already self explanatory and
>>> just didn't know what to write more.
>> 
>> "Optimize TX handlers."
>> 
>> In what way?  Why?  How are things improved?  Is it measurable?
>> etc.
> 
> OK, got the idea.
> However I think I would need some help with measuring performance
> difference reliably.
> On this CPU iperf3 tx takes most of the time (like 80-90%), thus even
> well optimized changes will be hard to see with iperf3 alone.
> I've tried using the pktgen module. Although it shows much better numbers
> than iperf3 (~95Mbps vs. 40Mbps), the results don't look very
> stable/reliable: pps can easily differ by 10-15% between different
> runs.
> As for perf, I have limited experience with it, but if I understand
> correctly, this CPU supports neither cycle nor instruction counters. So
> I'm not sure perf would be useful here.
> 
>  Performance counter stats for 'system wide':
> 
>       10387.717082      cpu-clock (msec)          #    1.000 CPUs utilized
>               2941      context-switches          #    0.283 K/sec
>                  0      cpu-migrations            #    0.000 K/sec
>                 60      page-faults               #    0.006 K/sec
>    <not supported>      cycles
>    <not supported>      instructions
>    <not supported>      branches
>    <not supported>      branch-misses
> 
>       10.388087500 seconds time elapsed
> 
> 
> What are the suggestions?

Any ideas?
Or I can just comment on the patch(es) which gave an apparent performance
improvement (as seen with iperf3) and mark the others as cleanup.

Regards,
Roman
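
One way to cope with the 10-15% run-to-run variation described above might be
to repeat each pktgen measurement several times, before and after a patch, and
compare the difference in means against the spread of the runs. This is a
minimal sketch, assuming the per-run throughput figures are collected by hand
on a host where Python is available. The function names and the sample values
below are invented for illustration and are not measurements from this
hardware.

```python
import math
import statistics


def relative_change(before, after):
    """Fractional change from `before` to `after` (0.10 means +10%)."""
    return (after - before) / before


def coefficient_of_variation(samples):
    """Sample standard deviation divided by the mean: run-to-run noise."""
    return statistics.stdev(samples) / statistics.mean(samples)


def welch_t(before, after):
    """Welch's t statistic for two independent sets of runs.

    A large absolute value suggests the difference in means is bigger than
    what the run-to-run noise alone would explain.
    """
    var_before = statistics.variance(before)
    var_after = statistics.variance(after)
    std_err = math.sqrt(var_before / len(before) + var_after / len(after))
    return (statistics.mean(after) - statistics.mean(before)) / std_err


if __name__ == "__main__":
    # Hypothetical throughput samples in Mbps, one value per run.
    unpatched = [94.0, 96.0, 95.0, 93.0, 97.0]
    patched = [99.0, 101.0, 100.0, 98.0, 102.0]

    gain = relative_change(statistics.mean(unpatched),
                           statistics.mean(patched))
    print(f"mean gain:   {gain:.1%}")
    print(f"noise (CoV): {coefficient_of_variation(unpatched):.1%}")
    print(f"Welch t:     {welch_t(unpatched, patched):.2f}")
```

With only a handful of runs the t statistic is a rough guide rather than a
proof, but it gives a commit message something more concrete than a single
before/after number.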