Date:   Fri, 26 Apr 2019 10:15:04 -0400
From:   Phil Auld <pauld@...hat.com>
To:     Ingo Molnar <mingo@...nel.org>
Cc:     Mel Gorman <mgorman@...hsingularity.net>,
        Aubrey Li <aubrey.intel@...il.com>,
        Julien Desfossez <jdesfossez@...italocean.com>,
        Vineeth Remanan Pillai <vpillai@...italocean.com>,
        Nishanth Aravamudan <naravamudan@...italocean.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Paul Turner <pjt@...gle.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
        Subhra Mazumdar <subhra.mazumdar@...cle.com>,
        Frédéric Weisbecker <fweisbec@...il.com>,
        Kees Cook <keescook@...omium.org>,
        Greg Kerr <kerrnel@...gle.com>, Aaron Lu <aaron.lwe@...il.com>,
        Valentin Schneider <valentin.schneider@....com>,
        Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Jiri Kosina <jkosina@...e.cz>
Subject: Re: [RFC PATCH v2 00/17] Core scheduling v2

On Thu, Apr 25, 2019 at 08:53:43PM +0200 Ingo Molnar wrote:
> Interesting. This strongly suggests sub-optimal SMT-scheduling in the 
> non-saturated HT case, i.e. a scheduler balancing bug.
> 
> As long as loads are clearly below the physical cores count (which they 
> are in the early phases of your table) the scheduler should spread tasks 
> without overlapping two tasks on the same core.
> 
> Clearly it doesn't.
> 

That's especially true if there are cgroups with different numbers of
tasks in them involved.

Here's an example showing the average number of tasks on each of the 4 NUMA
nodes (20 CPUs per node) during a test run. There are 78 threads total: 76
for lu and 2 stress CPU hogs, so fewer than the 80 CPUs on the box. The GROUP
test has the two stress hogs and lu in distinct cgroups; the NORMAL test has
them all in one cgroup. This is from 5.0-rc3+, but the version doesn't matter;
it's reproducible on any kernel. SMT is on, but that also doesn't matter here.
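
Roughly, the setup looks like this (just a sketch, not the actual test
harness; it assumes a cgroup v1 "cpu" controller mounted at
/sys/fs/cgroup/cpu, and the group names and pids are placeholders):

#!/usr/bin/env python3
# Illustrative sketch of the GROUP vs NORMAL cgroup layout, not the real
# harness.  Needs root; assumes cgroup v1 with the cpu controller at
# /sys/fs/cgroup/cpu.  Group names and pid lists are made up.
import os

CG_ROOT = "/sys/fs/cgroup/cpu"

def add_to_cgroup(name, pids):
    """Create a cpu cgroup and move the given pids into it."""
    path = os.path.join(CG_ROOT, name)
    os.makedirs(path, exist_ok=True)
    for pid in pids:
        # one pid per write, as the cgroup v1 "tasks" interface expects
        with open(os.path.join(path, "tasks"), "w") as f:
            f.write(str(pid))

lu_pids     = [1234]   # placeholder: pids of the 76 lu.C.x threads
stress_pids = [5678]   # placeholder: pids of the 2 stress cpu hogs

# GROUP case: lu and the stress hogs live in distinct cgroups.
add_to_cgroup("lu",     lu_pids)
add_to_cgroup("stress", stress_pids)

# NORMAL case: everything in one cgroup.
# add_to_cgroup("all", lu_pids + stress_pids)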

The first two lines show where the stress jobs ran and the last two show where
the 76 threads of lu ran.

GROUP_1.stress.ps.numa.hist      Average    1.00   1.00
NORMAL_1.stress.ps.numa.hist     Average    0.00   1.10   0.90

lu.C.x_76_GROUP_1.ps.numa.hist   Average    10.97  11.78  26.28  26.97
lu.C.x_76_NORMAL_1.ps.numa.hist  Average    19.70  18.70  17.80  19.80
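
In case it helps anyone reproduce this kind of per-node accounting, here's a
rough sketch of how such numbers can be gathered (not the script that
produced the hist files above, just an illustration; it maps each thread to
the NUMA node of the CPU it last ran on, field 39 of /proc/<pid>/stat per
proc(5), and the pid below is a placeholder):

#!/usr/bin/env python3
# Rough illustration of per-NUMA-node thread accounting; not the actual
# *.ps.numa.hist tooling.  Counts, for one process, how many of its threads
# last ran on each NUMA node.
import glob
from collections import Counter

def cpu_to_node():
    """Build a cpu -> NUMA node map from /sys/devices/system/node/."""
    mapping = {}
    for node_dir in glob.glob("/sys/devices/system/node/node[0-9]*"):
        node = int(node_dir.rsplit("node", 1)[1])
        for cpu_dir in glob.glob(node_dir + "/cpu[0-9]*"):
            mapping[int(cpu_dir.rsplit("cpu", 1)[1])] = node
    return mapping

def threads_per_node(pid):
    """Count how many threads of <pid> last ran on each NUMA node."""
    node_of = cpu_to_node()
    counts = Counter()
    for stat in glob.glob(f"/proc/{pid}/task/*/stat"):
        with open(stat) as f:
            # split after the ")" that ends the comm field; the
            # "processor" value (field 39 overall) is then index 36
            fields = f.read().rsplit(")", 1)[1].split()
        counts[node_of[int(fields[36])]] += 1
    return counts

# Sampled periodically and averaged, this gives lines like the ones above.
# print(threads_per_node(12345))    # 12345: placeholder pid of lu.C.x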

The NORMAL test is evenly balanced across the 20 CPUs per NUMA node. There
is between a 4x and 10x performance hit to the lu benchmark between GROUP
and NORMAL in any of these test runs. In this particular case it was 10x:

============76_GROUP========Mop/s===================================
min     q1      median  q3      max
3776.51 3776.51 3776.51 3776.51 3776.51
============76_GROUP========time====================================
min     q1      median  q3      max
539.92  539.92  539.92  539.92  539.92
============76_NORMAL========Mop/s===================================
min     q1      median  q3      max
39386   39386   39386   39386   39386
============76_NORMAL========time====================================
min     q1      median  q3      max
51.77   51.77   51.77   51.77   51.77
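
The ~10x figure is just arithmetic on the medians above, and throughput and
runtime agree:

# Slowdown implied by the medians above, by throughput and by runtime.
print(39386 / 3776.51)   # Mop/s ratio, NORMAL over GROUP   -> ~10.4x
print(539.92 / 51.77)    # runtime ratio, GROUP over NORMAL -> ~10.4x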


This is a bit off topic, but since balancing bugs were mentioned and I've been
trying to track this down for a while (and learning the scheduler code in
the process), I figured I'd just throw it out there :)


Cheers,
Phil

-- 
