Message-ID: <0c81e6f1-b017-89fb-35a8-65c9b3f96a1c@linux.intel.com>
Date: Mon, 14 Dec 2020 15:53:14 +0800
From: "Li, Aubrey" <aubrey.li@...ux.intel.com>
To: Mel Gorman <mgorman@...hsingularity.net>
Cc: mingo@...hat.com, peterz@...radead.org, juri.lelli@...hat.com,
vincent.guittot@...aro.org, valentin.schneider@....com,
qais.yousef@....com, dietmar.eggemann@....com, rostedt@...dmis.org,
bsegall@...gle.com, tim.c.chen@...ux.intel.com,
linux-kernel@...r.kernel.org, Mel Gorman <mgorman@...e.de>,
Jiang Biao <benbjiang@...il.com>
Subject: Re: [RFC PATCH v7] sched/fair: select idle cpu from idle cpumask for
task wakeup
On 2020/12/10 19:34, Mel Gorman wrote:
> On Thu, Dec 10, 2020 at 04:23:47PM +0800, Li, Aubrey wrote:
>>> I ran this patch with tbench on top of the schedstat patches that
>>> track SIS efficiency. The tracking adds overhead, so it's not a perfect
>>> performance comparison, but the expectation would be that the patch
>>> reduces the number of runqueues that are scanned.
>>
>> Thanks for the measurement! I don't play with tbench, so I may need a
>> while to digest the data.
>>
>
> The key point is that it appears the idle mask was mostly equivalent to
> the full domain mask, at least for this test.
>
>>>
>>> tbench4
>>>                        5.10.0-rc6           5.10.0-rc6
>>>                    schedstat-v1r1        idlemask-v7r1
>>> Hmean  1        504.76 (   0.00%)      500.14 *  -0.91%*
>>> Hmean  2       1001.22 (   0.00%)      970.37 *  -3.08%*
>>> Hmean  4       1930.56 (   0.00%)     1880.96 *  -2.57%*
>>> Hmean  8       3688.05 (   0.00%)     3537.72 *  -4.08%*
>>> Hmean  16      6352.71 (   0.00%)     6439.53 *   1.37%*
>>> Hmean  32     10066.37 (   0.00%)    10124.65 *   0.58%*
>>> Hmean  64     12846.32 (   0.00%)    11627.27 *  -9.49%*
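For reference, the idea of the patch is that select_idle_cpu() scans only
the CPUs currently marked idle in the LLC's shared idle cpumask, instead
of the whole sched_domain span. A minimal sketch (simplified from the RFC;
the nr/avg_cost scan throttling is omitted, and sds_idle_cpus() is the
helper the RFC adds):

	static int select_idle_cpu(struct task_struct *p,
				   struct sched_domain *sd, int target)
	{
		struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
		int cpu;

		/* Scan only idle CPUs, not the full sched_domain_span(sd). */
		cpumask_and(cpus, sds_idle_cpus(sd->shared), p->cpus_ptr);

		for_each_cpu_wrap(cpu, cpus, target) {
			if (available_idle_cpu(cpu))
				return cpu;
		}

		return -1;
	}

If the idle cpumask stays close to the full domain mask, as you observed,
the scan cost will of course be about the same.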
I focused on this case and ran it 5 times; here is the data on my side.
Each run is 600s of tbench with 153 threads (80% of the 192 hardware
threads), driven roughly as below (tbench/tbench_srv from the dbench
suite; the exact options may differ):
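	tbench_srv &
	tbench -t 600 153 127.0.0.1

Throughput: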
Hmean 153         v5.9.12          v5.9.12
             schedstat-v1      idlemask-v8 (with schedstat)
Round 1       15717.3           15608.1
Round 2       14856.9           15642.5
Round 3       14856.7           15782.1
Round 4       15408.9           15912.9
Round 5       15436.6           15927.7
From the tbench throughput data (bigger is better), it looks like idlemask
wins. And here is the SIS_scanned data:
Hmean 153         v5.9.12          v5.9.12
             schedstat-v1      idlemask-v8 (with schedstat)
Round 1    22562490432       21894932302
Round 2    21288529957       21693722629
Round 3    20657521771       21268308377
Round 4    21868486414       22289128955
Round 5    21859614988       22214740417
From the SIS_scanned data (less is better), the default kernel looks
better. But combined with the throughput data, this can be explained:
the higher the throughput, the more wakeups, and hence the more SIS
scanning performed.
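A rough normalization (mean scans divided by mean throughput, computed
from the rounds above) supports this:

	schedstat-v1: 21.65e9 scans / 15255 MB/s ~= 1.42M scans per MB/s
	idlemask-v8:  21.87e9 scans / 15775 MB/s ~= 1.39M scans per MB/s

so per unit of throughput, idlemask actually scans slightly less.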
So at least there is no regression in this case.
Thanks,
-Aubrey