Message-ID: <a36d9afa-6939-2695-5f31-bc8777b5f27c@synopsys.com>
Date: Thu, 17 May 2018 14:24:42 +0100
From: Jose Abreu <Jose.Abreu@...opsys.com>
To: Florian Fainelli <f.fainelli@...il.com>,
David Miller <davem@...emloft.net>, <Jose.Abreu@...opsys.com>
CC: <netdev@...r.kernel.org>, <Joao.Pinto@...opsys.com>,
<Vitor.Soares@...opsys.com>, <peppe.cavallaro@...com>,
<alexandre.torgue@...com>
Subject: Re: [PATCH v2 net-next 00/12] net: stmmac: Clean-up and tune-up
Hi David, Florian,
Results of slowing down the CPU follow below.
On 16-05-2018 20:01, Florian Fainelli wrote:
> On 05/16/2018 11:56 AM, David Miller wrote:
>> From: Jose Abreu <Jose.Abreu@...opsys.com>
>> Date: Wed, 16 May 2018 13:50:42 +0100
>>
>>> David raised some rightful concerns about the use of indirect callbacks in
>>> the code. I did iperf tests with and without patches 3-12 and the performance
>>> remained equal. I guess that at 1Gb/s, and because my setup has a powerful
>>> processor, these patches don't affect the performance.
>> Does your cpu need Spectre v1 and v2 workarounds which cause indirect calls to
>> be extremely expensive?
> Given how widespread stmmac is within the ARM CPU's ecosystem, the
> answer is more than likely yes.
>
> To get a better feeling of whether your indirect branches introduce a
> difference, either don't run the CPU at full speed (e.g: use cpufreq to
> slow it down), and/or profile the number of cycles and the instruction-cache
> hit/miss ratio for the functions called in the hot path.
It turns out my CPU has every single vulnerability detected so far :D
---
# cat /sys/devices/system/cpu/vulnerabilities/meltdown
Mitigation: PTI
# cat /sys/devices/system/cpu/vulnerabilities/spectre_v1
Mitigation: __user pointer sanitization
# cat /sys/devices/system/cpu/vulnerabilities/spectre_v2
Vulnerable: Minimal generic ASM retpoline
---
I'm not sure if the workaround is active for spectre_v2 though,
because it just says "Vulnerable" ...
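(As a side note, dumping the whole directory in one go gives the same
information, something like:)
---
# grep . /sys/devices/system/cpu/vulnerabilities/*
---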
Now, I'm using an 8-core Intel CPU running @ 3.4 GHz:
---
# cat /proc/cpuinfo | grep -i mhz
cpu MHz : 3988.358
cpu MHz : 3991.775
cpu MHz : 3995.003
cpu MHz : 3996.003
cpu MHz : 3995.113
cpu MHz : 3996.512
cpu MHz : 3954.454
cpu MHz : 3937.402
---
So, following Florian's advice, I turned off 7 cores and changed the
CPU frequency to the minimum allowed (800MHz):
---
# cat /sys/bus/cpu/devices/cpu0/cpufreq/scaling_min_freq
800000
---
---
# for file in /sys/bus/cpu/devices/cpu*/cpufreq/scaling_governor;
do echo userspace > $file; done
# for file in /sys/bus/cpu/devices/cpu*/cpufreq/scaling_setspeed;
do echo 800000 > $file; done
# echo 0 > /sys/devices/system/cpu/cpu1/online
# echo 0 > /sys/devices/system/cpu/cpu2/online
# echo 0 > /sys/devices/system/cpu/cpu3/online
# echo 0 > /sys/devices/system/cpu/cpu4/online
# echo 0 > /sys/devices/system/cpu/cpu5/online
# echo 0 > /sys/devices/system/cpu/cpu6/online
# echo 0 > /sys/devices/system/cpu/cpu7/online
---
---
# cat /proc/cpuinfo | grep -i mhz
cpu MHz : 900.076
---
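(For completeness, undoing this afterwards is just the same knobs in
reverse: cores back online and the governor restored. "ondemand" below
is only an example, whatever governor was in use before applies:)
---
# for cpu in /sys/devices/system/cpu/cpu[1-7]/online;
do echo 1 > $cpu; done
# for file in /sys/bus/cpu/devices/cpu*/cpufreq/scaling_governor;
do echo ondemand > $file; done
---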
And these are the iperf results:
---
*With* patches 3-12, 8xCPU @ 3.4GHz:
  iperf = 0.0-60.0 sec  6.62 GBytes  948 Mbits/sec  0.045 ms  37/4838564 (0.00076%)
*With* patches 3-12, 1xCPU @ 800MHz:
  iperf = 0.0-60.0 sec  6.62 GBytes  947 Mbits/sec  0.000 ms  18/4833009 (0%)
*Without* patches 3-12, 8xCPU @ 3.4GHz:
  iperf = 0.0-60.0 sec  6.60 GBytes  945 Mbits/sec  0.049 ms  31/4819455 (0.00064%)
*Without* patches 3-12, 1xCPU @ 800MHz:
  iperf = 0.0-60.0 sec  6.62 GBytes  948 Mbits/sec  0.000 ms  0/4837257 (0%)
---
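(The columns above are the usual iperf2 UDP server report: interval,
transfer, bandwidth, jitter and lost/total datagrams. Roughly, a UDP
run like the following produces it, with the server on the receiving
end and the client pushing line rate for 60 seconds; the address and
offered load are just placeholders:)
---
# iperf -u -s
# iperf -u -c 192.168.0.10 -b 1000M -t 60
---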
Given that the difference between best and worst is < 1%, I think
we can conclude that patches 3-12 don't affect the overall
performance. I didn't profile the cache hit/miss ratio though; if
needed, something along the lines of the perf command below should
cover it.
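(Sketch only: counting cycles, instructions and cache/branch misses
on cpu0 for the duration of a 60 second iperf run. The event list is
a first guess, not a measurement that was actually taken:)
---
# perf stat -a -C 0 -e cycles,instructions,cache-references,cache-misses,branch-misses sleep 60
---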
Any comments? Unfortunately I don't have access to an ARM board
to test this yet ...
Thanks and Best Regards,
Jose Miguel Abreu