Message-ID: <32b98350-35e4-7475-2d19-9101f50ecc63@linux.intel.com>
Date:   Wed, 19 May 2021 17:43:55 +0800
From:   Aubrey Li <aubrey.li@...ux.intel.com>
To:     Srikar Dronamraju <srikar@...ux.vnet.ibm.com>
Cc:     Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Rik van Riel <riel@...riel.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Valentin Schneider <valentin.schneider@....com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Gautham R Shenoy <ego@...ux.vnet.ibm.com>,
        Parth Shah <parth@...ux.ibm.com>
Subject: Re: [PATCH v2 6/8] sched/idle: Move busy_cpu accounting to idle
 callback
On 5/18/21 3:18 PM, Srikar Dronamraju wrote:
>>>> This is v3. It looks like hackbench gets better. And netperf still has
>>>> some notable changes under 2 x overcommit cases.
>>>>
>>>
>>> Thanks Aubrey for the results. netperf (2X) case does seem to regress.
>>> I was actually expecting the results to get better with overcommit.
>>> Can you confirm if this was just v3 or with v3 + set_next_idle_core
>>> disabled?
>>
>> Do you mean set_idle_cores(not set_next_idle_core) actually? Gautham's patch
>> changed "this" to "target" in set_idle_cores, and I removed it to apply
>> v3-2-8-sched-fair-Maintain-the-identity-of-idle-core.patch for tip/sched/core
>> commit-id 915a2bc3c6b7.
> 
> That's correct.
> 
> In the 3rd patch, I had introduced set_next_idle_core,
> which is supposed to set idle_cores in the LLC.
> What I suspected is that this one is causing issues in your 48 CPU LLC.
> 
> I expect set_next_idle_core to be spending a lot of time in your scenario.
> I was planning something like the below on top of my patch.
> With this we don't look for an idle core if we already know that we won't find one.
> But in the meantime I had asked whether you could drop the call to
> set_next_idle_core.
> 
+	if (atomic_read(&sd->shared->nr_busy_cpus) * 2 >=  per_cpu(sd_llc_size, target))
+		goto out;
Does this have a side effect if the waker and wakee are coalesced on a portion of the cores?
Also, is the 2 an SMT2 assumption?
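To make the two questions concrete, here is a minimal user-space sketch (not kernel code). The helper names and the smt_width parameter are hypothetical; only the `nr_busy_cpus * 2 >= sd_llc_size` comparison comes from the snippet above.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical user-space model of the proposed early exit.
 * smt_width = 2 reproduces the hard-coded "* 2" in the snippet;
 * other values show what a generalized check might look like.
 */
static bool skip_idle_core_search(int nr_busy_cpus, int llc_size, int smt_width)
{
	assert(smt_width > 0);
	return nr_busy_cpus * smt_width >= llc_size;
}

/*
 * Fully idle cores left when the busy CPUs are packed onto as few
 * cores as possible (the "coalesced" case asked about above).
 */
static int idle_cores_if_packed(int nr_busy_cpus, int llc_size, int smt_width)
{
	int nr_cores = llc_size / smt_width;
	int busy_cores = (nr_busy_cpus + smt_width - 1) / smt_width;

	return nr_cores - busy_cores;
}
```

Under these assumptions, a 48 CPU SMT2 LLC with 24 busy CPUs takes the early exit, yet if those 24 CPUs are packed onto 12 cores, 12 cores are still fully idle. On an SMT4 LLC of the same size, 12 busy CPUs spread one per core leave no idle core, but the hard-coded 2 would not take the early exit.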
I did some quick testing on this. The netperf 2x regression does indeed improve,
but two hackbench mid-load cases get worse:
process-sockets 	group-2 	 1.00 (  5.32)	-18.40 (  7.32)
threads-sockets 	group-2 	 1.00 (  5.44)	-20.44 (  4.60)
Thanks,
-Aubrey