Message-ID: <0014CA62-A632-495A-92B0-4B14C8CA193C@fb.com>
Date:   Mon, 26 Oct 2020 08:45:27 -0400
From:   "Chris Mason" <clm@...com>
To:     Vincent Guittot <vincent.guittot@...aro.org>
CC:     Peter Zijlstra <peterz@...radead.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Rik van Riel <riel@...riel.com>,
        linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] fix scheduler regression from "sched/fair: Rework
 load_balance()"

On 26 Oct 2020, at 4:39, Vincent Guittot wrote:

> Hi Chris
>
> On Sat, 24 Oct 2020 at 01:49, Chris Mason <clm@...com> wrote:
>>
>> Hi everyone,
>>
>> We’re validating a new kernel in the fleet, and compared with v5.2,
>
> Which version are you using ?
> several improvements have been added since v5.5 and the rework of 
> load_balance

We’re validating v5.6, but all of the numbers referenced in this patch 
are against v5.9.  I usually try to backport my way to victory on this 
kind of thing, but mainline seems to behave exactly the same as 
0b0695f2b34a with respect to this benchmark.

>
>> performance is ~2-3% lower for some of our workloads.  After some
>> digging, Johannes found that our involuntary context switch rate was 
>> ~2x
>> higher, and we were leaving a CPU idle a higher percentage of the 
>> time,
>> even though the workload was trying to saturate the system.
>>
>> We were able to reproduce the problem with schbench, and Johannes
>> bisected down to:
>>
>> commit 0b0695f2b34a4afa3f6e9aa1ff0e5336d8dad912
>> Author: Vincent Guittot <vincent.guittot@...aro.org>
>> Date:   Fri Oct 18 15:26:31 2019 +0200
>>
>>      sched/fair: Rework load_balance()
>>
>> Our working theory is the load balancing changes are leaving 
>> processes
>> behind busy CPUs instead of moving them onto idle ones.  I made a few
>> schbench modifications to make this easier to demonstrate:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/mason/schbench.git/
>>
>> My VM has 40 cpus (20 cores, 2 threads per core), and my schbench
>> command line is:
>
> What is the topology ? are they all part of the same LLC ?

We’ve seen the regression on both single-socket and dual-socket 
bare-metal Intel systems.  On the VM I reproduced with, I saw similar 
latencies with and without siblings configured into the topology.

-chris