Message-ID: <c8419d9b-2b31-2190-3058-3625bdbcb13d@meta.com>
Date:   Thu, 22 Jun 2023 11:57:48 -0400
From:   Chris Mason <clm@...a.com>
To:     Aaron Lu <aaron.lu@...el.com>, David Vernet <void@...ifault.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        linux-kernel@...r.kernel.org, mingo@...hat.com,
        juri.lelli@...hat.com, vincent.guittot@...aro.org,
        rostedt@...dmis.org, dietmar.eggemann@....com, bsegall@...gle.com,
        mgorman@...e.de, bristot@...hat.com, vschneid@...hat.com,
        joshdon@...gle.com, roman.gushchin@...ux.dev, tj@...nel.org,
        kernel-team@...a.com
Subject: Re: [RFC PATCH 3/3] sched: Implement shared wakequeue in CFS

On 6/21/23 2:03 AM, Aaron Lu wrote:
> On Wed, Jun 21, 2023 at 12:43:52AM -0500, David Vernet wrote:
>> On Wed, Jun 21, 2023 at 12:54:16PM +0800, Aaron Lu wrote:
>>> On Tue, Jun 20, 2023 at 09:43:00PM -0500, David Vernet wrote:
>>>> On Wed, Jun 21, 2023 at 10:35:34AM +0800, Aaron Lu wrote:
>>>>> On Tue, Jun 20, 2023 at 12:36:26PM -0500, David Vernet wrote:

[ ... ]

>>>> I'm not sure what we're hoping to gain by continuing to run various
>>>> netperf workloads with your specific parameters?
>>>
>>> I don't quite follow you.
>>>
>>> I thought we were in the process of figuring out why, for the same
>>> workload (netperf/default_mode/nr_client=nr_cpu) on two similar
>>> machines (both are Skylake), you saw no contention while I saw some,
>>> so I tried to be exact about how I run the workload.
>>
>> I just reran the workload on a 26 core / 52 thread Cooper Lake using
>> your exact command below and still don't observe any contention
>> whatsoever on the swqueue lock:
> 
> Well, it's a puzzle to me.
> 
> But as you said below, I guess I'll just move on.

Thanks for bringing this up, Aaron.  The discussion moved on to different
ways to fix the netperf-triggered contention, but I wanted to toss this
out as an easy way to see the same problem:

# swqueue disabled:
# ./schbench -L -m 52 -p 512 -r 10 -t 1
Wakeup Latencies percentiles (usec) runtime 10 (s) (14674354 total samples)
          20.0th: 8          (4508866 samples)
          50.0th: 11         (2879648 samples)
          90.0th: 35         (5865268 samples)
        * 99.0th: 70         (1282166 samples)
          99.9th: 110        (124040 samples)
          min=1, max=9312
avg worker transfer: 28211.91 ops/sec 13.78MB/s

During the swqueue=0 run, the system was ~30% idle.

# swqueue enabled:
# ./schbench -L -m 52 -p 512 -r 10 -t 1
Wakeup Latencies percentiles (usec) runtime 10 (s) (6448414 total samples)
          20.0th: 30         (1383423 samples)
          50.0th: 39         (1986827 samples)
          90.0th: 63         (2446965 samples)
        * 99.0th: 108        (567275 samples)
          99.9th: 158        (57487 samples)
          min=1, max=15018
avg worker transfer: 12395.27 ops/sec 6.05MB/s

During the swqueue=1 run, the CPU was at ~97% system time, all stuck
on spinlock contention in the scheduler.

This is a single-socket Cooper Lake with 26 cores/52 threads.

The workload is similar to the perf pipe test: 52 messenger threads, each
bouncing a message back and forth with its own private worker for a 10 second run.
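
For reference, here is a rough standalone sketch of that messenger/worker
ping-pong pattern (this is not schbench itself; the pair count, message
size and iteration count are only illustrative, and most error handling
is omitted).  It builds with something like gcc -O2 -pthread:

/* Sketch: each messenger thread owns a private worker thread and a pair
 * of pipes, and the two sides bounce a 512-byte message back and forth,
 * so almost every iteration is a task wakeup.
 */
#include <pthread.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

#define NR_PAIRS  52
#define MSG_SIZE  512		/* < PIPE_BUF, so pipe writes are atomic */
#define NR_ITERS  100000	/* round trips per pair */

struct pair {
	int to_worker[2];	/* messenger -> worker */
	int to_messenger[2];	/* worker -> messenger */
};

static void *worker_fn(void *arg)
{
	struct pair *p = arg;
	char buf[MSG_SIZE];

	/* Echo messages back until the messenger closes its write end. */
	while (read(p->to_worker[0], buf, MSG_SIZE) > 0)
		write(p->to_messenger[1], buf, MSG_SIZE);
	return NULL;
}

static void *messenger_fn(void *arg)
{
	struct pair *p = arg;
	pthread_t worker;
	char buf[MSG_SIZE] = { 0 };
	int i;

	pipe(p->to_worker);
	pipe(p->to_messenger);
	pthread_create(&worker, NULL, worker_fn, p);

	for (i = 0; i < NR_ITERS; i++) {
		write(p->to_worker[1], buf, MSG_SIZE);
		read(p->to_messenger[0], buf, MSG_SIZE);
	}

	close(p->to_worker[1]);		/* worker sees EOF and exits */
	pthread_join(worker, NULL);
	return NULL;
}

int main(void)
{
	static struct pair pairs[NR_PAIRS];
	pthread_t messengers[NR_PAIRS];
	struct timespec start, end;
	double secs;
	int i;

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (i = 0; i < NR_PAIRS; i++)
		pthread_create(&messengers[i], NULL, messenger_fn, &pairs[i]);
	for (i = 0; i < NR_PAIRS; i++)
		pthread_join(messengers[i], NULL);
	clock_gettime(CLOCK_MONOTONIC, &end);

	secs = (end.tv_sec - start.tv_sec) +
	       (end.tv_nsec - start.tv_nsec) / 1e9;
	printf("%.2f round trips/sec per pair\n", (double)NR_ITERS / secs);
	return 0;
}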

Adding more messenger threads (-m 128) increases the swqueue=0 throughput
to about 19MB/s and drags the swqueue=1 throughput down to 2MB/s.

-chris
