Date:   Wed, 16 Sep 2020 16:53:09 -0400
From:   chris hyser <>
To:     "Li, Aubrey" <>,
        Julien Desfossez <>,
        Peter Zijlstra <>,
        Vineeth Pillai <>,
        Joel Fernandes <>,
        Tim Chen <>,
        Aaron Lu <>,
        Aubrey Li <>,
        Dhaval Giani <>,
        Nishanth Aravamudan <>,
        Phil Auld <>,
        Valentin Schneider <>,
        Mel Gorman <>,
        Pawan Gupta <>,
        Paolo Bonzini <>, Chen Yu <>,
        Christian Brauner <>,
        Agata Gruza <>,
        Antonio Gomez Iglesias <>,
        Aaron Lu <>,
        "Ning, Hongyu" <>
Subject: Re: [RFC PATCH v7 11/23] sched/fair: core wide cfs task priority

On 9/16/20 10:24 AM, chris hyser wrote:
> On 9/16/20 8:57 AM, Li, Aubrey wrote:
>>> Here are the uperf results of the various patchsets. Note that disabling SMT is better for these tests; that
>>> presumably reflects the overall overhead of core scheduling, which went from bad to really bad. The primary focus of
>>> this email is to start to understand what happened within core sched itself.
>>> patchset          smt=on/cs=off  smt=off    smt=on/cs=on
>>> --------------------------------------------------------
>>> v5-v5.6.y      :    1.78Gb/s     1.57Gb/s     1.07Gb/s
>>> pre-v6-v5.6.y  :    1.75Gb/s     1.55Gb/s    822.16Mb/s
>>> v6-5.7         :    1.87Gb/s     1.56Gb/s    561.6Mb/s
>>> v6-5.7-hotplug :    1.75Gb/s     1.58Gb/s    438.21Mb/s
>>> v7             :    1.80Gb/s     1.61Gb/s    440.44Mb/s
>> I haven't had a chance to play with v7, but I got something different.
>>    branch        smt=on/cs=on
>> coresched/v5-v5.6.y    1.09Gb/s
>> coresched/v6-v5.7.y    1.05Gb/s
>> I attached my kernel config in case you want to make a comparison, or you
>> can send yours and I'll try to see if I can replicate your result.
> I will give this config a try. One of the reports forwarded to me about the drop in uperf performance was, I believe,
> an email from you mentioning a ~50% perf drop between v5 and v6? I was actually setting out to duplicate your results. :-)

The first thing I did was verify that I built and tested the right bits; presumably I did, as I get the same numbers. I'm
still trying to tweak your config to get a root disk in my setup. Oh, one thing I missed in reading your first response: I
had 24 cores/48 CPUs. I think you had half that, though my guess is that this should actually have made the numbers even
worse. :-)

The following was forwarded to me originally sent on Aug 3, by you I believe:

> We found uperf (in a cgroup) throughput drops by ~50% with core scheduling.
> The problem is, uperf triggered a lot of softirqs and offloaded softirq
> service to the *ksoftirqd* thread.
> - By default, the ksoftirqd thread can run with uperf on the same core; we saw
>   100% CPU utilization.
> - With coresched enabled, ksoftirqd's core cookie differs from uperf's, so
>   they can't run concurrently on the same core; we saw ~15% forced idle.
> I guess this kind of performance drop can be replicated by other similar
> workloads (with a lot of softirq activity).
> Currently the core scheduler picks cookie-matched tasks for all SMT siblings; does
> it make sense to add a policy that allows cookie-compatible tasks to run together?
> For example, if a task is trusted (set by admin), it can work with kernel threads.
> The difference from core scheduling disabled is that we still have user-to-user
> isolation.
> Thanks,
> -Aubrey
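
To make that constraint concrete, here is a minimal toy sketch of the pick rule as I understand it (illustrative only:
the struct and the trusted flag below are made up for this email, not the v7 code; IIRC in these series the cookie comes
from the cgroup cpu.tag):

#include <stdbool.h>

/* Toy model, not kernel code: SMT siblings may only co-run tasks whose
 * core cookies match, otherwise the sibling is forced idle. */
struct toy_task {
        unsigned long core_cookie;      /* 0 = untagged, e.g. ksoftirqd */
        bool trusted;                   /* hypothetical admin-set flag */
};

static bool cookie_compatible(const struct toy_task *a,
                              const struct toy_task *b)
{
        if (a->core_cookie == b->core_cookie)
                return true;    /* matching tag: co-running allowed today */
        if (a->trusted || b->trusted)
                return true;    /* proposed policy: a trusted task may
                                 * share a core with a kernel thread */
        return false;           /* mismatch: the sibling is forced idle */
}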

Would you please elaborate on what this test was? In trying to duplicate it, I just kept adding uperf threads to my
setup until I started to see performance losses similar to what is reported above (and in a second report about v7). Also,
I wasn't looking for absolute numbers per se, just differences significant enough to track down where the performance
loss is coming from.
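
For reference, a crude stand-in for that kind of softirq-heavy load (not uperf itself; the thread count and packet size
below are arbitrary knobs) is N threads blasting small UDP packets at loopback, which keeps NET_RX/NET_TX softirq and
hence ksoftirqd busy:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <pthread.h>
#include <sys/socket.h>
#include <unistd.h>

#define NTHREADS 8              /* bump this to scale the load */
#define PKT_SIZE 90             /* small writes, uperf-profile-ish */

static void *blast(void *arg)
{
        char buf[PKT_SIZE] = { 0 };
        struct sockaddr_in dst = { 0 };
        int fd = socket(AF_INET, SOCK_DGRAM, 0);

        (void)arg;
        if (fd < 0)
                return NULL;
        dst.sin_family = AF_INET;
        dst.sin_port = htons(9);        /* discard port */
        dst.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
        for (;;)                        /* spin until main() exits */
                sendto(fd, buf, sizeof(buf), 0,
                       (struct sockaddr *)&dst, sizeof(dst));
        return NULL;
}

int main(void)
{
        pthread_t tid[NTHREADS];
        int i;

        for (i = 0; i < NTHREADS; i++)
                pthread_create(&tid[i], NULL, blast, NULL);
        sleep(60);      /* fixed window; exiting main() kills the threads */
        return 0;
}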

