Message-ID: <a36d9afa-6939-2695-5f31-bc8777b5f27c@synopsys.com>
Date: Thu, 17 May 2018 14:24:42 +0100
From: Jose Abreu <Jose.Abreu@...opsys.com>
To: Florian Fainelli <f.fainelli@...il.com>,
David Miller <davem@...emloft.net>, <Jose.Abreu@...opsys.com>
CC: <netdev@...r.kernel.org>, <Joao.Pinto@...opsys.com>,
<Vitor.Soares@...opsys.com>, <peppe.cavallaro@...com>,
<alexandre.torgue@...com>
Subject: Re: [PATCH v2 net-next 00/12] net: stmmac: Clean-up and tune-up
Hi David, Florian,
Results of slowing down the CPU follow below.
On 16-05-2018 20:01, Florian Fainelli wrote:
> On 05/16/2018 11:56 AM, David Miller wrote:
>> From: Jose Abreu <Jose.Abreu@...opsys.com>
>> Date: Wed, 16 May 2018 13:50:42 +0100
>>
>>> David raised some rightful concerns about the use of indirect callbacks in
>>> the code. I did iperf tests with and without patches 3-12 and the performance
>>> remained equal. I guess that at 1Gb/s, and because my setup has a powerful
>>> processor, these patches don't affect the performance.
>> Does your cpu need Spectre v1 and v2 workarounds which cause indirect calls to
>> be extremely expensive?
> Given how widespread stmmac is within the ARM CPU's ecosystem, the
> answer is more than likely yes.
>
> To get a better feeling of whether your indirect branches introduce a
> difference, either don't run the CPU at full speed (e.g: use cpufreq to
> slow it down), and/or profile the number of cycles and the instruction-cache
> hit/miss ratio for the functions called in the hot path.
It turns out my CPU has every single vulnerability detected so far :D
---
# cat /sys/devices/system/cpu/vulnerabilities/meltdown
Mitigation: PTI
# cat /sys/devices/system/cpu/vulnerabilities/spectre_v1
Mitigation: __user pointer sanitization
# cat /sys/devices/system/cpu/vulnerabilities/spectre_v2
Vulnerable: Minimal generic ASM retpoline
---
I'm not sure if the workaround is active for spectre_v2 though,
because it just says "Vulnerable" ...
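(As a side note, dumping the whole directory in one go gives the same
information, something like:)
---
# grep . /sys/devices/system/cpu/vulnerabilities/*
---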
Now, I'm using an 8-core Intel CPU running @ 3.4 GHz:
---
# cat /proc/cpuinfo | grep -i mhz
cpu MHz : 3988.358
cpu MHz : 3991.775
cpu MHz : 3995.003
cpu MHz : 3996.003
cpu MHz : 3995.113
cpu MHz : 3996.512
cpu MHz : 3954.454
cpu MHz : 3937.402
---
So, following Florian's advice, I turned off 7 cores and changed the
CPU frequency to the minimum allowed (800MHz):
---
# cat /sys/bus/cpu/devices/cpu0/cpufreq/scaling_min_freq
800000
---
---
# for file in /sys/bus/cpu/devices/cpu*/cpufreq/scaling_governor;
do echo userspace > $file; done
# for file in /sys/bus/cpu/devices/cpu*/cpufreq/scaling_setspeed;
do echo 800000 > $file; done
# echo 0 > /sys/devices/system/cpu/cpu1/online
# echo 0 > /sys/devices/system/cpu/cpu2/online
# echo 0 > /sys/devices/system/cpu/cpu3/online
# echo 0 > /sys/devices/system/cpu/cpu4/online
# echo 0 > /sys/devices/system/cpu/cpu5/online
# echo 0 > /sys/devices/system/cpu/cpu6/online
# echo 0 > /sys/devices/system/cpu/cpu7/online
---
---
# cat /proc/cpuinfo | grep -i mhz
cpu MHz : 900.076
---
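(For completeness, undoing this afterwards is just the same knobs in
reverse: cores back online and the governor restored. "ondemand" below
is only an example, whatever governor was in use before applies:)
---
# for cpu in /sys/devices/system/cpu/cpu[1-7]/online;
do echo 1 > $cpu; done
# for file in /sys/bus/cpu/devices/cpu*/cpufreq/scaling_governor;
do echo ondemand > $file; done
---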
And these are the iperf results:
---
*With* patches 3-12, 8xCPU @ 3.4GHz:
  iperf = 0.0-60.0 sec  6.62 GBytes  948 Mbits/sec  0.045 ms  37/4838564 (0.00076%)
*With* patches 3-12, 1xCPU @ 800MHz:
  iperf = 0.0-60.0 sec  6.62 GBytes  947 Mbits/sec  0.000 ms  18/4833009 (0%)
*Without* patches 3-12, 8xCPU @ 3.4GHz:
  iperf = 0.0-60.0 sec  6.60 GBytes  945 Mbits/sec  0.049 ms  31/4819455 (0.00064%)
*Without* patches 3-12, 1xCPU @ 800MHz:
  iperf = 0.0-60.0 sec  6.62 GBytes  948 Mbits/sec  0.000 ms  0/4837257 (0%)
---
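(The columns above are the usual iperf2 UDP server report: interval,
transfer, bandwidth, jitter and lost/total datagrams. Roughly, a UDP
run like the following produces it, with the server on the receiving
end and the client pushing line rate for 60 seconds; the address and
offered load are just placeholders:)
---
# iperf -u -s
# iperf -u -c 192.168.0.10 -b 1000M -t 60
---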
Given that the difference between best and worst is < 1%, I think
we can conclude that patches 3-12 don't affect the overall
performance. I didn't profile the cache hit/miss ratio though; if
needed, something along the lines of the perf command below should
cover it.
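(Sketch only: counting cycles, instructions and cache/branch misses
on cpu0 for the duration of a 60 second iperf run. The event list is
a first guess, not a measurement that was actually taken:)
---
# perf stat -a -C 0 -e cycles,instructions,cache-references,cache-misses,branch-misses sleep 60
---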
Any comments? Unfortunately I don't have access to an ARM board
to test this yet ...
Thanks and Best Regards,
Jose Miguel Abreu