linux-kernel - Re: [PATCH 1/3] sched: remove select_idle


Date:   Wed, 2 May 2018 14:58:42 -0700
From:   Subhra Mazumdar <subhra.mazumdar@...cle.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     linux-kernel@...r.kernel.org, mingo@...hat.com,
        daniel.lezcano@...aro.org, steven.sistare@...cle.com,
        dhaval.giani@...cle.com, rohit.k.jain@...cle.com
Subject: Re: [PATCH 1/3] sched: remove select_idle_core() for scalability



On 05/01/2018 11:03 AM, Peter Zijlstra wrote:
> On Mon, Apr 30, 2018 at 04:38:42PM -0700, Subhra Mazumdar wrote:
>> I also noticed a possible bug later in the merge code. Shouldn't it be:
>>
>> if (busy < best_busy) {
>>          best_busy = busy;
>>          best_cpu = first_idle;
>> }
> Uhh, quite. I did say it was completely untested, but yes.. /me dons the
> brown paper bag.
I re-ran the tests after fixing that bug but still see similar regressions
on hackbench and similar improvements on Uperf. I didn't re-run the
Oracle DB tests, but my guess is that they will show a similar improvement.
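
For reference, the corrected step quoted above can be placed in context with
a small userspace sketch. Only the final if-block comes from this thread; the
surrounding loop, the function name pick_idle_cpu(), the idle[][] array and
SMT_WIDTH are hypothetical stand-ins, not the kernel code under review. The
sketch counts busy hardware threads per core and remembers the first idle
thread of the least busy core.

```c
#include <limits.h>
#include <stdbool.h>

#define SMT_WIDTH 2   /* assumed: two hardware threads per core */

/*
 * Hypothetical userspace model, not kernel code.
 * idle[core][t] says whether hardware thread t of that core is idle.
 * Returns a flat CPU id (core * SMT_WIDTH + t), or -1 if nothing is idle.
 */
int pick_idle_cpu(bool idle[][SMT_WIDTH], int nr_cores)
{
    int best_busy = INT_MAX;
    int best_cpu = -1;

    for (int core = 0; core < nr_cores; core++) {
        int first_idle = -1;
        int busy = 0;

        for (int t = 0; t < SMT_WIDTH; t++) {
            if (!idle[core][t])
                busy++;
            else if (first_idle < 0)
                first_idle = core * SMT_WIDTH + t;
        }

        if (first_idle < 0)
            continue;           /* every thread on this core is busy */
        if (busy == 0)
            return first_idle;  /* fully idle core: take it right away */

        /* The step from the thread: record this core's first idle thread. */
        if (busy < best_busy) {
            best_busy = busy;
            best_cpu = first_idle;
        }
    }
    return best_cpu;
}
```

Storing first_idle rather than the loop cursor matters because the cursor may
point at a busy thread by the time the comparison runs.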

merge:

Hackbench process on 2 socket, 44 core and 88 threads Intel x86 machine
(lower is better):
groups  baseline  %stdev  patch (% change)   %stdev
1       0.5742    21.13   0.5131 (10.64%)    4.11
2       0.5776    7.87    0.5387 (6.73%)     2.39
4       0.9578    1.12    1.0549 (-10.14%)   0.85
8       1.7018    1.35    1.8516 (-8.8%)     1.56
16      2.9955    1.36    3.2466 (-8.38%)    0.42
32      5.4354    0.59    5.7738 (-6.23%)    0.38

Uperf pingpong on 2 socket, 44 core and 88 threads Intel x86 machine with
message size = 8k (higher is better):
threads  baseline  %stdev  patch (% change)   %stdev
8        49.47     0.35    51.1 (3.29%)       0.13
16       95.28     0.77    98.45 (3.33%)      0.61
32       156.77    1.17    170.97 (9.06%)     5.62
48       193.24    0.22    245.89 (27.25%)    7.26
64       216.21    9.33    316.43 (46.35%)    0.37
128      379.62    10.29   337.85 (-11%)      3.68
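
The percentages in parentheses are consistent with relative change against
the baseline, signed so that a positive value means the patch is better in
both the "lower is better" and "higher is better" tables. A minimal sketch of
that arithmetic follows; the function name pct_improvement() is made up for
illustration.

```c
#include <stdbool.h>

/*
 * Relative change of `patch` against `baseline`, in percent, signed so
 * that a positive result means the patch is better.
 */
double pct_improvement(double baseline, double patch, bool lower_is_better)
{
    double delta = lower_is_better ? baseline - patch : patch - baseline;

    return 100.0 * delta / baseline;
}
```

For example, pct_improvement(0.5742, 0.5131, true) comes out at roughly 10.64
for the 1-group hackbench row, and pct_improvement(193.24, 245.89, false) at
roughly 27.25 for the 48-thread Uperf row.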

I tried using the next_cpu technique with the merge, but it didn't help. I am
open to suggestions.
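
As a rough illustration only: this message does not spell out next_cpu, so
the sketch below assumes it is a per-CPU cursor that remembers where the
previous idle search stopped, so that the next bounded search starts from
there instead of from the target CPU. The array next_cpu[], the function
scan_from_cursor(), NR_CPUS and the limit parameter are all hypothetical
userspace stand-ins.

```c
#include <stdbool.h>

#define NR_CPUS 8   /* assumed small CPU count for illustration */

/* Hypothetical stand-in for a per-CPU variable: one cursor per target CPU. */
static int next_cpu[NR_CPUS];

/*
 * Scan at most `limit` CPUs for an idle one, starting at the cursor saved
 * for `target` rather than at `target` itself, wrapping around at NR_CPUS.
 * The cursor is then moved past the last CPU examined, so the next search
 * for the same target covers a different window.
 */
int scan_from_cursor(const bool idle[NR_CPUS], int target, int limit)
{
    int start = next_cpu[target];
    int cpu = start;
    int found = -1;

    for (int i = 0; i < limit; i++) {
        cpu = (start + i) % NR_CPUS;
        if (idle[cpu]) {
            found = cpu;
            break;
        }
    }

    /* Resume after the last CPU examined on the next call. */
    next_cpu[target] = (cpu + 1) % NR_CPUS;
    return found;
}
```

The point of the rotating start is to spread successive searches over the
domain instead of repeatedly scanning the same CPUs next to the target.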

merge + next_cpu:

Hackbench process on 2 socket, 44 core and 88 threads Intel x86 machine
(lower is better):
groups  baseline  %stdev  patch (% change)   %stdev
1       0.5742    21.13   0.5107 (11.06%)    6.35
2       0.5776    7.87    0.5917 (-2.44%)    11.16
4       0.9578    1.12    1.0761 (-12.35%)   1.1
8       1.7018    1.35    1.8748 (-10.17%)   0.8
16      2.9955    1.36    3.2419 (-8.23%)    0.43
32      5.4354    0.59    5.6958 (-4.79%)    0.58

Uperf pingpong on 2 socket, 44 core and 88 threads Intel x86 machine with
message size = 8k (higher is better):
threads  baseline  %stdev  patch (% change)   %stdev
8        49.47     0.35    51.65 (4.41%)      0.26
16       95.28     0.77    99.8 (4.75%)       1.1
32       156.77    1.17    168.37 (7.4%)      0.6
48       193.24    0.22    228.8 (18.4%)      1.75
64       216.21    9.33    287.11 (32.79%)    10.82
128      379.62    10.29   346.22 (-8.8%)     4.7

Finally, Peter had earlier suggested transposing the cpu offset in
select_task_rq_fair. I had tried that as well, but it also regressed on
hackbench. I just wanted to mention it so we have closure on that.

transpose cpu offset in select_task_rq_fair:

Hackbench process on 2 socket, 44 core and 88 threads Intel x86 machine
(lower is better):
groups  baseline  %stdev  patch (% change)   %stdev
1       0.5742    21.13   0.5251 (8.55%)     2.57
2       0.5776    7.87    0.5471 (5.28%)     11
4       0.9578    1.12    1.0148 (-5.95%)    1.97
8       1.7018    1.35    1.798 (-5.65%)     0.97
16      2.9955    1.36    3.088 (-3.09%)     2.7
32      5.4354    0.59    5.2815 (2.8%)      1.26