Message-ID: <32b98350-35e4-7475-2d19-9101f50ecc63@linux.intel.com>
Date: Wed, 19 May 2021 17:43:55 +0800
From: Aubrey Li <aubrey.li@...ux.intel.com>
To: Srikar Dronamraju <srikar@...ux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
LKML <linux-kernel@...r.kernel.org>,
Mel Gorman <mgorman@...hsingularity.net>,
Rik van Riel <riel@...riel.com>,
Thomas Gleixner <tglx@...utronix.de>,
Valentin Schneider <valentin.schneider@....com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Gautham R Shenoy <ego@...ux.vnet.ibm.com>,
Parth Shah <parth@...ux.ibm.com>
Subject: Re: [PATCH v2 6/8] sched/idle: Move busy_cpu accounting to idle callback

On 5/18/21 3:18 PM, Srikar Dronamraju wrote:
>>>> This is v3. It looks like hackbench gets better. And netperf still has
>>>> some notable changes under 2 x overcommit cases.
>>>>
>>>
>>> Thanks Aubrey for the results. netperf (2X) case does seem to regress.
>>> I was actually expecting the results to get better with overcommit.
>>> Can you confirm if this was just v3 or with v3 + set_next_idle_core
>>> disabled?
>>
>> Do you mean set_idle_cores(not set_next_idle_core) actually? Gautham's patch
>> changed "this" to "target" in set_idle_cores, and I removed it to apply
>> v3-2-8-sched-fair-Maintain-the-identity-of-idle-core.patch for tip/sched/core
>> commit-id 915a2bc3c6b7.
>
> Thats correct,
>
> In the 3rd patch, I had introduced set_next_idle_core
> which is suppose to set idle_cores in the LLC.
> What I suspected was is this one is causing issues in your 48 CPU LLC.
>
> I am expecting set_next_idle_core to be spending much time in your scenario.
> I was planning for something like the below on top of my patch.
> With this we dont look for an idle-core if we already know that we dont find one.
> But in the mean while I had asked if you could have dropped the call to
> set_next_idle_core.
>
> +	if (atomic_read(&sd->shared->nr_busy_cpus) * 2 >= per_cpu(sd_llc_size, target))
> +		goto out;

Does this have a side effect if the waker and wakee are coalesced on a portion
of the cores? Also, is the factor of 2 an SMT2 assumption?
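
To make the two questions concrete, here is a rough user-space model of the
check. It is only a sketch, not kernel code: the function names, the smt_width
parameter, and the contiguous numbering of SMT siblings are assumptions made
for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical user-space model of the quoted check, not kernel code.
 * The literal 2 from the hunk is replaced by an explicit smt_width
 * (threads per core), which is an assumption of this sketch.
 */
static bool skip_idle_core_search(int nr_busy_cpus, int smt_width, int llc_size)
{
	assert(smt_width > 0 && llc_size > 0);
	return nr_busy_cpus * smt_width >= llc_size;
}

/*
 * Ground truth for comparison: is there a core whose SMT siblings are
 * all idle? busy[i] is true if CPU i is busy. Siblings are assumed to be
 * numbered contiguously, so core c owns CPUs c * smt_width up to
 * c * smt_width + smt_width - 1.
 */
static bool has_idle_core(const bool *busy, int llc_size, int smt_width)
{
	assert(smt_width > 0 && llc_size % smt_width == 0);

	for (int base = 0; base < llc_size; base += smt_width) {
		bool core_idle = true;

		for (int t = 0; t < smt_width; t++) {
			if (busy[base + t])
				core_idle = false;
		}
		if (core_idle)
			return true;
	}
	return false;
}
```

With an 8-CPU SMT2 LLC where CPUs 0-3 are busy (two fully loaded cores),
skip_idle_core_search(4, 2, 8) returns true even though has_idle_core() still
finds two idle cores. That is the coalesced case I am asking about.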

I did a quick test of this. The netperf 2x regression does indeed improve,
but two hackbench mid-load cases get worse:

process-sockets  group-2  1.00 ( 5.32)  -18.40 ( 7.32)
threads-sockets  group-2  1.00 ( 5.44)  -20.44 ( 4.60)

Thanks,
-Aubrey