Message-ID: <0c81e6f1-b017-89fb-35a8-65c9b3f96a1c@linux.intel.com>
Date: Mon, 14 Dec 2020 15:53:14 +0800
From: "Li, Aubrey" <aubrey.li@...ux.intel.com>
To: Mel Gorman <mgorman@...hsingularity.net>
Cc: mingo@...hat.com, peterz@...radead.org, juri.lelli@...hat.com,
vincent.guittot@...aro.org, valentin.schneider@....com,
qais.yousef@....com, dietmar.eggemann@....com, rostedt@...dmis.org,
bsegall@...gle.com, tim.c.chen@...ux.intel.com,
linux-kernel@...r.kernel.org, Mel Gorman <mgorman@...e.de>,
Jiang Biao <benbjiang@...il.com>
Subject: Re: [RFC PATCH v7] sched/fair: select idle cpu from idle cpumask for
task wakeup
On 2020/12/10 19:34, Mel Gorman wrote:
> On Thu, Dec 10, 2020 at 04:23:47PM +0800, Li, Aubrey wrote:
>>> I ran this patch with tbench on top of the schedstat patches that
>>> track SIS efficiency. The tracking adds overhead, so it's not a perfect
>>> performance comparison, but the expectation would be that the patch
>>> reduces the number of runqueues that are scanned.
>>
>> Thanks for the measurement! I don't play with tbench, so I may need a
>> while to digest the data.
>>
>
> The key point is that it appears the idle mask was mostly equivalent to
> the full domain mask, at least for this test.
>
>>>
>>> tbench4
>>>                        5.10.0-rc6           5.10.0-rc6
>>>                    schedstat-v1r1        idlemask-v7r1
>>> Hmean  1        504.76 (   0.00%)      500.14 *  -0.91%*
>>> Hmean  2       1001.22 (   0.00%)      970.37 *  -3.08%*
>>> Hmean  4       1930.56 (   0.00%)     1880.96 *  -2.57%*
>>> Hmean  8       3688.05 (   0.00%)     3537.72 *  -4.08%*
>>> Hmean  16      6352.71 (   0.00%)     6439.53 *   1.37%*
>>> Hmean  32     10066.37 (   0.00%)    10124.65 *   0.58%*
>>> Hmean  64     12846.32 (   0.00%)    11627.27 *  -9.49%*
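For reference, the idea of the patch is that select_idle_cpu() scans only
the CPUs currently marked idle in the LLC's shared idle cpumask, instead
of the whole sched_domain span. A minimal sketch (simplified from the RFC;
the nr/avg_cost scan throttling is omitted, and sds_idle_cpus() is the
helper the RFC adds):

	static int select_idle_cpu(struct task_struct *p,
				   struct sched_domain *sd, int target)
	{
		struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
		int cpu;

		/* Scan only idle CPUs, not the full sched_domain_span(sd). */
		cpumask_and(cpus, sds_idle_cpus(sd->shared), p->cpus_ptr);

		for_each_cpu_wrap(cpu, cpus, target) {
			if (available_idle_cpu(cpu))
				return cpu;
		}

		return -1;
	}

If the idle cpumask stays close to the full domain mask, as you observed,
the scan cost will of course be about the same.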
I focused on this case and ran it 5 times; here is the data on my side.
Each run is 600s of tbench with 153 threads (80% of the 192 hardware
threads), driven roughly as below (tbench/tbench_srv from the dbench
suite; the exact options may differ):
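	tbench_srv &
	tbench -t 600 153 127.0.0.1

Throughput: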
Hmean 153         v5.9.12          v5.9.12
             schedstat-v1      idlemask-v8 (with schedstat)
Round 1       15717.3           15608.1
Round 2       14856.9           15642.5
Round 3       14856.7           15782.1
Round 4       15408.9           15912.9
Round 5       15436.6           15927.7
From the tbench throughput data (bigger is better), it looks like idlemask
wins. And here is the SIS_scanned data:
Hmean 153         v5.9.12          v5.9.12
             schedstat-v1      idlemask-v8 (with schedstat)
Round 1    22562490432       21894932302
Round 2    21288529957       21693722629
Round 3    20657521771       21268308377
Round 4    21868486414       22289128955
Round 5    21859614988       22214740417
From the SIS_scanned data (less is better), the default kernel looks
better. But combined with the throughput data, this can be explained:
the higher the throughput, the more wakeups, and hence the more SIS
scanning performed.
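A rough normalization (mean scans divided by mean throughput, computed
from the rounds above) supports this:

	schedstat-v1: 21.65e9 scans / 15255 MB/s ~= 1.42M scans per MB/s
	idlemask-v8:  21.87e9 scans / 15775 MB/s ~= 1.39M scans per MB/s

so per unit of throughput, idlemask actually scans slightly less.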
So at least there is no regression in this case.
Thanks,
-Aubrey