Message-ID: <fca35bce-b1ae-7137-4bf8-aa385c371be4@linux.ibm.com>
Date: Thu, 21 May 2020 16:39:03 +0530
From: Pratik Sampat <psampat@...ux.ibm.com>
To: Doug Smythies <dsmythies@...us.net>
Cc: linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org,
rafael.j.wysocki@...el.com, peterz@...radead.org,
daniel.lezcano@...aro.org, ego@...ux.vnet.ibm.com,
svaidy@...ux.ibm.com, pratik.sampat@...ibm.com,
pratik.r.sampat@...il.com
Subject: Re: [RFC 0/1] Alternate history mechanism for the TEO governor
Hello Doug,
Thanks a lot for running these benchmarks on an Intel box.
On 17/05/20 11:41 pm, Doug Smythies wrote:
> On 2020.05.11 Pratik Rajesh Sampat wrote:
>> First RFC posting: https://lkml.org/lkml/2020/2/22/27
> Summary:
>
> On that thread I wrote:
>
> > I have done a couple of other tests with this patch set,
> > but nothing to report yet, as the differences have been
> > minor so far.
>
> I tried your tests, or as close as I could find, and still
> do not notice much difference.
That is quite unfortunate. At least it doesn't seem to regress.
Nevertheless, as Rafael suggested, aging is crucial, and this patch doesn't age
weights. I do have a version with aging, but it showed a lot of run-to-run
variance, so I refrained from posting it.
I'm tweaking the logic for aging as well as the distribution of weights;
hopefully that will help.
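To make the term concrete, here is a minimal sketch of one common way to age per-state weights: exponential decay. It is purely illustrative and is not the logic of this patch or of the unposted aging version; the names `age_and_record` and `pick_state` and the `decay` and `bump` parameters are hypothetical.

```python
def age_and_record(weights, observed_state, decay=0.9, bump=1.0):
    """Decay every per-state weight, then credit the state just observed.

    Hypothetical illustration of aging; not taken from teo.c or the patch.
    """
    aged = [w * decay for w in weights]
    aged[observed_state] += bump
    return aged


def pick_state(weights):
    """Choose the idle state with the largest weight (lowest index wins ties)."""
    return max(range(len(weights)), key=lambda i: weights[i])


if __name__ == "__main__":
    # State 2 has a long history, but recent wakeups keep matching state 0.
    weights = [0.0, 0.0, 4.0]
    for _ in range(2):
        weights = age_and_record(weights, observed_state=0, decay=0.5)
    print(weights, pick_state(weights))
```

With a smaller `decay`, old history is forgotten faster, so the choice follows recent behaviour more closely at the cost of more jitter from run to run.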
> For detail, but likely little added value, read on:
>
> Kernel: 5.7-rc4:
> "teo": unmodified kernel.
> "wtteo": with this patch added.
> "menu": the menu idle governor, for comparison.
> CPU frequency scaling driver: intel-cpufreq
> CPU frequency scaling governor: schedutil
> CPU idle driver: intel_idle
>
> ...
>
>> Benchmarks:
>> Schbench
>> --------
>> Benchmarks scheduler wakeup latencies
>>
>> 1. latency 99th percentile - usec
> I found a Phoronix schbench test.
> It defaults to 99.9th percentile.
>
> schbench (usec, 99.9th Latency Percentile, less is better)(8 workers)
>
> threads  teo     wtteo   wtteo/teo  menu    menu/teo
> 2        14197   14194    99.98%    14467   101.90%
> 4        46733   46490    99.48%    46554    99.62%
> 6        57306   58291   101.72%    57754   100.78%
> 8        81408   80768    99.21%    81715   100.38%
> 16       157286  156570   99.54%    156621   99.58%
> 32       314573  310784   98.80%    315802  100.39%
>
> Powers and other idle statistics were similar. [1]
>
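For anyone re-deriving the percentage columns in the quoted schbench table: they are each governor's latency divided by the teo latency for the same thread count. A small sketch follows; the helper name `percent_of_baseline` is made up for illustration.

```python
def percent_of_baseline(value, baseline):
    """Return value as a percentage of baseline, rounded to two decimals."""
    return round(100.0 * value / baseline, 2)


if __name__ == "__main__":
    # 2-thread row of the quoted table (usec, 99.9th percentile latency).
    teo, wtteo, menu = 14197, 14194, 14467
    print(percent_of_baseline(wtteo, teo))  # wtteo relative to teo
    print(percent_of_baseline(menu, teo))   # menu relative to teo
```

Since latency is "less is better", a value below 100 means the governor did better than teo on that row.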
>> 2. Power - watts
>> Machine - IBM Power 9
>>
>> Latency and Power - Normalized
>> +---------+--------------+-----------------+---------------+
>> | Threads | TEO Baseline | Wt. TEO Latency | Wt. TEO Power |
>> +---------+--------------+-----------------+---------------+
>> | 2 | 100 | 101.3 | 85.29 |
>> +---------+--------------+-----------------+---------------+
>> | 4 | 100 | 105.06 | 113.63 |
>> +---------+--------------+-----------------+---------------+
>> | 8 | 100 | 92.32 | 90.36 |
>> +---------+--------------+-----------------+---------------+
>> | 16 | 100 | 99.1 | 92.43 |
>> +---------+--------------+-----------------+---------------+
>>
>> Accuracy
>>
>> Vanilla TEO Governor - Prediction distribution %
>> +---------+------+------+------+-------+-------+-------+---------+
>> | Threads | US 1 | US 2 | US 3 | US 4 | US 5 | US 6 | Correct |
>> +---------+------+------+------+-------+-------+-------+---------+
>> | 2 | 6.12 | 1.08 | 1.76 | 20.41 | 9.2 | 28.74 | 22.51 |
>> +---------+------+------+------+-------+-------+-------+---------+
>> | 4 | 8.54 | 1.56 | 1.25 | 20.24 | 10.75 | 25.17 | 22.67 |
>> +---------+------+------+------+-------+-------+-------+---------+
>> | 8 | 5.88 | 2.67 | 1.09 | 13.72 | 17.08 | 32.04 | 22.95 |
>> +---------+------+------+------+-------+-------+-------+---------+
>> | 16 | 6.29 | 2.43 | 0.86 | 13.21 | 15.33 | 26.52 | 29.34 |
>> +---------+------+------+------+-------+-------+-------+---------+
>> +---------+------+------+------+
>> | Threads | OS 1 | OS 2 | OS 3 |
>> +---------+------+------+------+
>> | 2 | 1.77 | 1.27 | 7.14 |
>> +---------+------+------+------+
>> | 4 | 1.8 | 1.31 | 6.71 |
>> +---------+------+------+------+
>> | 8 | 0.65 | 0.72 | 3.2 |
>> +---------+------+------+------+
>> | 16 | 0.63 | 1.71 | 3.68 |
>> +---------+------+------+------+
>>
>> Weighted TEO Governor - Prediction distribution %
>> +---------+------+------+------+-------+-------+-------+---------+
>> | Threads | US 1 | US 2 | US 3 | US 4 | US 5 | US 6 | Correct |
>> +---------+------+------+------+-------+-------+-------+---------+
>> | 2 | 7.26 | 2.07 | 0.02 | 15.85 | 13.29 | 36.26 | 22.13 |
>> +---------+------+------+------+-------+-------+-------+---------+
>> | 4 | 4.33 | 1.45 | 0.15 | 14.17 | 14.68 | 40.36 | 21.01 |
>> +---------+------+------+------+-------+-------+-------+---------+
>> | 8 | 4.73 | 2.46 | 0.12 | 12.48 | 14.68 | 32.38 | 28.9 |
>> +---------+------+------+------+-------+-------+-------+---------+
>> | 16 | 7.68 | 1.25 | 0.98 | 12.15 | 11.19 | 24.91 | 35.92 |
>> +---------+------+------+------+-------+-------+-------+---------+
>> +---------+------+------+------+
>> | Threads | OS 1 | OS 2 | OS 3 |
>> +---------+------+------+------+
>> | 2 | 0.39 | 0.42 | 2.31 |
>> +---------+------+------+------+
>> | 4 | 0.45 | 0.51 | 2.89 |
>> +---------+------+------+------+
>> | 8 | 0.53 | 0.66 | 3.06 |
>> +---------+------+------+------+
>> | 16 | 0.97 | 1.9 | 3.05 |
>> +---------+------+------+------+
>>
>> Sleeping Ebizzy
>> ---------------
>> Program to generate workloads resembling web server workloads.
>> The benchmark is customized to allow for a sleep interval -i
> I found a Phoronix ebizzy, but without the customization,
> which I suspect is important to demonstrate your potential
> improvement.
>
> Could you send me yours to try?
Sure thing, sleeping ebizzy is hosted here:
https://github.com/pratiksampat/sleeping-ebizzy
>
> ebizzy (records per second, more is better)
>
> teo     wtteo   wtteo/teo  menu    menu/teo
> 132344  132228  99.91%     130926  98.93%
>
> Powers and other idle statistics were similar. [2]
>
>> 1. Number of records
>> 2. Power - watts
>> Machine - IBM Power 9
>>
>> Parameters:
>> 1. -m -> Always use mmap instead of malloc
>> 2. -M -> Never use mmap
>> 3. -S <seconds> -> Number of seconds to run
>> 4. -i <interval> -> Sleep interval
> What are the units of this interval?
> They must be microseconds, as that is the only thing that makes sense.
Yes, it is in microseconds.
> I have tried to simulate the resulting actual workflow
> myself, but didn't get results like yours. (I may have done it poorly.)
> My test does not produce performance data, as it just has to do its work
> before the next time to do a chunk of work.
> The test is:
>
> forever
> do 100 times
> very short sleep
> enddo
> sleep for 10 milliseconds
> endforever
Yes, in logic this is very similar to what the benchmark emulates.
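A rough, runnable rendering of the quoted loop is sketched below. The outer loop is bounded by a hypothetical `cycles` argument so that it terminates, and the 10 microsecond value for the "very short sleep" is an assumption, since the quoted description does not give one. The function name and parameters are illustrative only.

```python
import time


def bursty_workload(cycles, short_sleeps=100, short_sleep_s=0.00001,
                    long_sleep_s=0.010):
    """Run `cycles` rounds of many very short sleeps, then one long sleep.

    Returns the total number of sleeps issued (one wakeup per sleep).
    """
    wakeups = 0
    for _ in range(cycles):
        for _ in range(short_sleeps):
            time.sleep(short_sleep_s)  # assumed duration, not from the thread
            wakeups += 1
        time.sleep(long_sleep_s)       # the 10 millisecond sleep
        wakeups += 1
    return wakeups


if __name__ == "__main__":
    print(bursty_workload(cycles=3))
```

In practice the actual length of each short sleep is dominated by timer and scheduling overhead rather than by the requested value.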
> The overheads result in enough activity.
> Powers and other idle statistics were similar. [3]
>
>> Number of records and power normalized
>> +-------------------+---------------+------------------+-----------------+
>> | Parameters | TEO baseline | Wt TEO records | Wt. TEO Power |
>> +-------------------+---------------+------------------+-----------------+
>> | -S 60 -i 10000 | 100 | 106.56 | 93.95 |
>> +-------------------+---------------+------------------+-----------------+
>> | -m -S 60 -i 10000 | 100 | 100.62 | 82.14 |
>> +-------------------+---------------+------------------+-----------------+
>> | -M -S 60 -i 10000 | 100 | 104.97 | 95.19 |
>> +-------------------+---------------+------------------+-----------------+
>>
>> Accuracy
>>
>> Vanilla TEO Governor - Prediction distribution %
>> +-------------------+-------+------+------+-------+------+-------+
>> | Parameters | US 1 | US 2 | US 3 | US 4 | US 5 | US 6 |
>> +-------------------+-------+------+------+-------+------+-------+
>> | -S 60 -i 10000 | 45.46 | 0.52 | 1.5 | 15.34 | 2.44 | 8.61 |
>> +-------------------+-------+------+------+-------+------+-------+
>> | -m -S 60 -i 10000 | 4.22 | 2.08 | 0.71 | 90.01 | 0 | 0.01 |
>> +-------------------+-------+------+------+-------+------+-------+
>> | -M -S 60 -i 10000 | 15.78 | 1.42 | 2.4 | 22.39 | 1.68 | 11.25 |
>> +-------------------+-------+------+------+-------+------+-------+
>> +-------------------+---------+------+------+------+------+
>> | Parameters | Correct | OS 1 | OS 2 | OS 3 | OS 4 |
>> +-------------------+---------+------+------+------+------+
>> | -S 60 -i 10000 | 17.03 | 1.73 | 1.1 | 6.27 | 0 |
>> +-------------------+---------+------+------+------+------+
>> | -m -S 60 -i 10000 | 2.44 | 0.18 | 0.13 | 0.22 | 0 |
>> +-------------------+---------+------+------+------+------+
>> | -M -S 60 -i 10000 | 31.65 | 3.45 | 1.8 | 8.18 | 0 |
>> +-------------------+---------+------+------+------+------+
>>
>> Weighted TEO Governor - Prediction distribution %
>> +-------------------+-------+------+------+-------+------+-------+
>> | Parameters | US 1 | US 2 | US 3 | US 4 | US 5 | US 6 |
>> +-------------------+-------+------+------+-------+------+-------+
>> | -S 60 -i 10000 | 8.25 | 0.87 | 0.98 | 19.23 | 4.05 | 26.35 |
>> +-------------------+-------+------+------+-------+------+-------+
>> | -m -S 60 -i 10000 | 7.69 | 4.35 | 0.93 | 82.74 | 0.01 | 0.01 |
>> +-------------------+-------+------+------+-------+------+-------+
>> | -M -S 60 -i 10000 | 3.73 | 3.29 | 0.73 | 13.33 | 7.38 | 18.61 |
>> +-------------------+-------+------+------+-------+------+-------+
>> +-------------------+---------+------+------+------+------+
>> | Parameters | Correct | OS 1 | OS 2 | OS 3 | OS 4 |
>> +-------------------+---------+------+------+------+------+
>> | -S 60 -i 10000 | 32.86 | 3.27 | 2.05 | 2.09 | 0 |
>> +-------------------+---------+------+------+------+------+
>> | -m -S 60 -i 10000 | 3.4 | 0.29 | 0.28 | 0.3 | 0 |
>> +-------------------+---------+------+------+------+------+
>> | -M -S 60 -i 10000 | 48.19 | 1.8 | 0.93 | 1.97 | 0.04 |
>> +-------------------+---------+------+------+------+------+
> For accuracy numbers, it would help to know the sample size
> and the importance.
>
> For this 60 second test, I wonder if the number of times
> each idle state was entered and exited was large enough to
> draw any conclusion. I often find for tests that some states are
> only used a few times in 1 minute, and so don't really care about the accuracy.
The sample size does go up to the low tens of thousands, but I don't really
know the physical significance of such a number.
So I get what you're saying, and maybe I need to benchmark for a longer
duration, as your experience suggests.
> Anyway, for my attempts at this test, I had to extend to a 5 minute sample
> time to get adequate numbers in all idle states for the accuracy statistics.
> (which showed no difference, by the way (for those not looking at the graphs).)
>
> For my test all three governors, teo, wtteo, and menu, were
> using idle state 0 about 7 to 8 thousand times per 5 minutes,
> and 100% of time the assessment was the state was too shallow.
> However, I don't really care because it is only 0.003% of the time,
> and if idle state 0 is disabled (teo-0disable on [3]; it is enabled
> again at minute 35), the power doesn't change.
>
> All that being said, your power/accuracy results do seem correlated.
>
This, I believe, is a good affirmation to have. I would be worried if
we predicted more correctly and somehow ended up doing worse, or vice versa.
>> Pgbench
>> -------
>> pgbench is a simple program for running benchmark tests on PostgreSQL.
>> It runs the same sequence of SQL commands over and over, possibly in
>> multiple concurrent database sessions, and then calculates the average
>> transaction rate (transactions per second).
> I did not try this test or anything similar.
> ...
>
>> Hackbench
>> ---------
>> Creates a specified number of pairs of schedulable entities
>> which communicate via either sockets or pipes and time how long it
>> takes for each pair to send data back and forth.
>>
> I found a Phoronix version, but it doesn't like
> your low loops counts, so I stayed with the default 50,000.
>
> I suspect your low loop count results in a workflow somewhat like
> your special ebizzy test. Anyway, maybe I should try your version
> and low loop counts.
>
> I did many tests, and get inconsistent results.
>
> You use these terms like "sockets" and "pipes", but
> the phoronix test uses "count" and "thread" or "process".
>
> I only used "process" for the simple reason that there was very
> very little use of idle at all with "thread", so there was no value
> in any test.
>
> hackbench test 1: all - process (seconds, less is better)
>
> test  count  teo      wtteo    wtteo/teo  menu     menu/teo
> 1     1      8.7      8.99     103.33%    9.071    104.26%
> 2     2      16.509   16.96    102.73%    17.159   103.94%
> 3     4      33.451   34.081   101.88%    34.101   101.94%
> 4     8      69.037   71.647   103.78%    69.914   101.27%
> 5     16     161.64   165.569  102.43%    165.015  102.09%
>
> Powers and other idle statistics were similar. [4]
>
> hackbench test 2: count 1 - process (seconds, less is better)
>            teo     wtteo   wtteo/teo  menu    menu/teo
> average    8.906   8.703   97.72%     9.032   101.41%
> max        9.263   8.856              9.228
> min        8.761   8.599              8.876
> Std. Dev.  0.83%   0.46%              0.80%
> runs       256     256                200
>
> Powers and other idle statistics were similar. [5]
> However, idle state 3 is worthy of a look.
>
> hackbench test 3: count 2 - process (seconds, less is better)
>            teo     wtteo   wtteo/teo  menu    menu/teo
> average    16.702  16.65   99.69%     16.796  100.56%
> max        16.853  16.966             17.058
> min        16.542  16.487             16.659
> Std. Dev.  0.41%   0.59%              0.56%
> runs       100     100                100
>
> Powers and other idle statistics were similar. [6]
> However, idle state 3 is worthy of a look.
>
>> Machine - IBM Power 9
>>
>> Scale of measurement:
>> 1. Time (s)
>> 2. Power (watts)
>> Time is normalized
>>
>> +---------+----------+----------------------+-------------------+
>> | Loops | TEO Time | Wt. TEO Time Sockets | Wt. TEO Time Pipe |
>> +---------+----------+----------------------+-------------------+
>> | 100 | 100 | 95.23 | 87.09 |
>> +---------+----------+----------------------+-------------------+
>> | 1000 | 100 | 105.81 | 98.67 |
>> +---------+----------+----------------------+-------------------+
>> | 10000 | 100 | 99.33 | 92.73 |
>> +---------+----------+----------------------+-------------------+
>> | 100000 | 100 | 98.88 | 101.99 |
>> +---------+----------+----------------------+-------------------+
>> | 1000000 | 100 | 100.04 | 100.2 |
>> +---------+----------+----------------------+-------------------+
>>
>> Power :Socket: Consistent between 135-140 watts for both TEO and Wt. TEO
>> Pipe: Consistent between 125-130 watts for both TEO and Wt. TEO
>>
>> Pratik Rajesh Sampat (1):
>> Weighted approach to gather and use history in TEO governor
>>
>> drivers/cpuidle/governors/teo.c | 96 +++++++++++++++++++++++++++++++--
>> 1 file changed, 91 insertions(+), 5 deletions(-)
>>
>> --
>> 2.17.1
> I also tried Giovanni's and Mel's mmtests, (uses idle states 0 and 1 a lot)
> but couldn't extract the performance report. [7]
>
> Old sweep test, which doesn't produce performance data. [8]
> Old system idle test. [9]
>
> [1] http://www.smythies.com/~doug/linux/idle/wtteo/schbench/
> [2] http://www.smythies.com/~doug/linux/idle/wtteo/ebizzy/
> [3] http://www.smythies.com/~doug/linux/idle/wtteo/pn01/
> [4] http://www.smythies.com/~doug/linux/idle/wtteo/hackbench/
> [5] http://www.smythies.com/~doug/linux/idle/wtteo/hackbench2/
> [6] http://www.smythies.com/~doug/linux/idle/wtteo/hackbench3/
> [7] http://www.smythies.com/~doug/linux/idle/wtteo/mmtests-udp/
> [8] http://www.smythies.com/~doug/linux/idle/wtteo/sweep/
> [9] http://www.smythies.com/~doug/linux/idle/wtteo/idle/
>
> ... Doug
>
>
Thanks again for these comprehensive results.
~ Pratik