Message-ID: <4118c2e3-fd34-2ebe-3faa-1c6ac9cbbac2@linux.vnet.ibm.com>
Date:   Mon, 7 Aug 2023 15:06:02 +0530
From:   Shrikanth Hegde <sshegde@...ux.vnet.ibm.com>
To:     Tim Chen <tim.c.chen@...ux.intel.com>, peterz@...radead.org
Cc:     bristot@...hat.com, bsegall@...gle.com, dietmar.eggemann@....com,
        hdanton@...a.com, ionela.voinescu@....com, juri.lelli@...hat.com,
        len.brown@...el.com, linux-kernel@...r.kernel.org, mgorman@...e.de,
        naveen.n.rao@...ux.vnet.ibm.com, rafael.j.wysocki@...el.com,
        ravi.v.shankar@...el.com, ricardo.neri@...el.com,
        rostedt@...dmis.org, srikar@...ux.vnet.ibm.com,
        srinivas.pandruvada@...ux.intel.com, v-songbaohua@...o.com,
        vincent.guittot@...aro.org, vschneid@...hat.com, x86@...nel.org,
        yangyicong@...ilicon.com, yu.c.chen@...el.com
Subject: Re: [PATCH] sched/fair: Add SMT4 group_smt_balance handling



On 7/27/23 7:02 PM, Tim Chen wrote:
> On Wed, 2023-07-26 at 20:11 -0700, Tim Chen wrote:
>> On Mon, 2023-07-17 at 20:28 +0530, Shrikanth Hegde wrote:
>>> From: Tim Chen <tim.c.chen@...ux.intel.com>
>>>
>>> For SMT4, any group with more than 2 tasks will be marked as
>>> group_smt_balance. Retain the behaviour of group_has_spare by marking
>>> the busiest group as the one with the least number of idle_cpus.
>>>
>>> Also, handle the rounding effect of adding (ncores_local + ncores_busiest)
>>> when the local group is fully idle and the busiest group has more than 2
>>> tasks. The local group should try to pull at least 1 task in this case.
>>>
>>> Originally-by: Tim Chen <tim.c.chen@...ux.intel.com>
>>> Signed-off-by: Tim Chen <tim.c.chen@...ux.intel.com>
>>> Signed-off-by: Shrikanth Hegde <sshegde@...ux.vnet.ibm.com>
>>> ---
>>>  kernel/sched/fair.c | 13 ++++++++++++-
>>>  1 file changed, 12 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>> index 932e7b78894a..9502013abe33 100644
>>> --- a/kernel/sched/fair.c
>>> +++ b/kernel/sched/fair.c
>>> @@ -9532,7 +9532,7 @@ static inline long sibling_imbalance(struct lb_env *env,
>>>  	imbalance /= ncores_local + ncores_busiest;
>>>
>>>  	/* Take advantage of resource in an empty sched group */
>>> -	if (imbalance == 0 && local->sum_nr_running == 0 &&
>>> +	if (imbalance <= 1 && local->sum_nr_running == 0 &&
>>>  	    busiest->sum_nr_running > 1)
>>>  		imbalance = 2;
>>>
>>> @@ -9720,6 +9720,17 @@ static bool update_sd_pick_busiest(struct lb_env *env,
>>>  		break;
>>>
>>>  	case group_smt_balance:
>>> +		/* no idle cpus on both groups handled by group_fully_busy below */
>>> +		if (sgs->idle_cpus != 0 || busiest->idle_cpus != 0) {
>>> +			if (sgs->idle_cpus > busiest->idle_cpus)
>>> +				return false;
>>> +			if (sgs->idle_cpus < busiest->idle_cpus)
>>> +				return true;
>>> +			if (sgs->sum_nr_running <= busiest->sum_nr_running)
>>> +				return false;
>>> +		}
>>> +		break;
> 
> Shrikanth and Peter,
> 
> Sorry, I acked Shrikanth's fixup patch too quickly, without noticing that Shrikanth
> added a "break" in the patch above.  My original code did not have that break
> statement, as I intended the code to fall through to the "group_fully_busy" code
> path when there are no idle cpus in either group.  To keep the compiler happy and
> put in the correct logic, I have refreshed the patch below.
> 
> Thanks.
> 
> Tim
> 
> From: Tim Chen <tim.c.chen@...ux.intel.com>
> Date: Fri, 14 Jul 2023 16:09:30 -0700
> Subject: [PATCH] sched/fair: Add SMT4 group_smt_balance handling
> 
> For SMT4, any group with more than 2 tasks will be marked as
> group_smt_balance. Retain the behaviour of group_has_spare by marking
> the busiest group as the one with the least number of idle_cpus.
> 
> Also, handle the rounding effect of adding (ncores_local + ncores_busiest)
> when the local group is fully idle and the busiest group has more than 2
> tasks. The local group should try to pull at least 1 task in this case.
> 
> Signed-off-by: Tim Chen <tim.c.chen@...ux.intel.com>
> ---
>  kernel/sched/fair.c | 18 ++++++++++++++++--
>  1 file changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index a87988327f88..566686c5f2bd 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9563,7 +9563,7 @@ static inline long sibling_imbalance(struct lb_env *env,
>  	imbalance /= ncores_local + ncores_busiest;
>  
>  	/* Take advantage of resource in an empty sched group */
> -	if (imbalance == 0 && local->sum_nr_running == 0 &&
> +	if (imbalance <= 1 && local->sum_nr_running == 0 &&
>  	    busiest->sum_nr_running > 1)
>  		imbalance = 2;
>  
> @@ -9751,6 +9751,20 @@ static bool update_sd_pick_busiest(struct lb_env *env,
>  		break;
>  
>  	case group_smt_balance:
> +		/* no idle cpus on both groups handled by group_fully_busy below */
> +		if (sgs->idle_cpus != 0 || busiest->idle_cpus != 0) {
> +			if (sgs->idle_cpus > busiest->idle_cpus)
> +				return false;
> +			if (sgs->idle_cpus < busiest->idle_cpus)
> +				return true;
> +			if (sgs->sum_nr_running <= busiest->sum_nr_running)
> +				return false;
> +			else
> +				return true;
> +		}
> +		goto fully_busy;
> +		break;
> +
>  	case group_fully_busy:
>  		/*
>  		 * Select the fully busy group with highest avg_load. In
> @@ -9763,7 +9777,7 @@ static bool update_sd_pick_busiest(struct lb_env *env,
>  		 * select the 1st one, except if @sg is composed of SMT
>  		 * siblings.
>  		 */
> -
> +fully_busy:
>  		if (sgs->avg_load < busiest->avg_load)
>  			return false;
>  
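
For readers following along outside the diff context, here is a minimal stand-alone
sketch of the comparison the refreshed patch intends for the group_smt_balance case,
including the fall-through to the fully-busy average-load check. The struct and the
numbers in main() are hypothetical stand-ins for sg_lb_stats and real load statistics;
only the comparisons mirror the hunk above, not the actual kernel function.

#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, trimmed-down stand-in for struct sg_lb_stats. */
struct sg_stats {
	unsigned int idle_cpus;
	unsigned int sum_nr_running;
	unsigned long avg_load;
};

/*
 * Return true if @sgs should replace @busiest as the busiest group,
 * mirroring only the group_smt_balance comparisons and the fall-through
 * to the first group_fully_busy check from the patch above.
 */
static bool smt_balance_pick_busiest(const struct sg_stats *sgs,
				     const struct sg_stats *busiest)
{
	/*
	 * At least one group has idle cpus: prefer the group with fewer
	 * idle cpus, and break ties on the number of running tasks.
	 */
	if (sgs->idle_cpus != 0 || busiest->idle_cpus != 0) {
		if (sgs->idle_cpus > busiest->idle_cpus)
			return false;
		if (sgs->idle_cpus < busiest->idle_cpus)
			return true;
		return sgs->sum_nr_running > busiest->sum_nr_running;
	}

	/*
	 * No idle cpus in either group: fall through to the
	 * group_fully_busy rule and compare average load.
	 */
	return sgs->avg_load >= busiest->avg_load;
}

int main(void)
{
	/* SMT4 example: both cores run 3 tasks, only one still has an idle cpu. */
	struct sg_stats a = { .idle_cpus = 1, .sum_nr_running = 3, .avg_load = 700 };
	struct sg_stats b = { .idle_cpus = 0, .sum_nr_running = 3, .avg_load = 600 };

	/* Prints 0: b, with no idle cpus, stays the busiest group. */
	printf("%d\n", smt_balance_pick_busiest(&a, &b));
	return 0;
}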

Hi Tim, Peter,

The group_smt_balance (cluster scheduling) patches are in tip/sched/core, but I don't
see the above patch there yet. As it stands, this can cause a functional difference
on SMT4 systems (such as Power10).

Can we please have the above patch in tip/sched/core as well?

Acked-by: Shrikanth Hegde <sshegde@...ux.vnet.ibm.com>
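
As a footnote on the sibling_imbalance() hunk, a rough stand-alone model of why the
empty-group test moves from == 0 to <= 1: per the changelog, (ncores_local +
ncores_busiest) is added to the numerator before the division shown in the context
line, so a fully idle local group can end up with a computed imbalance of 1 instead
of 0. The helper below is hypothetical and only models that rounding term plus the
special case; the real kernel computation of the raw imbalance is not reproduced here.

#include <stdio.h>

/*
 * Hypothetical model of the empty-group special case in sibling_imbalance().
 * "raw" stands in for the unnormalized imbalance; the rounding term comes
 * from the changelog, the division and special case from the hunk above.
 */
static long imbalance_sketch(long raw, int ncores_local, int ncores_busiest,
			     unsigned int local_nr_running,
			     unsigned int busiest_nr_running)
{
	long imbalance = raw;

	/* Rounding term named in the changelog. */
	imbalance += ncores_local + ncores_busiest;
	imbalance /= ncores_local + ncores_busiest;

	/*
	 * With the term above added, the result is at least 1, so the old
	 * "imbalance == 0" test could never fire for an empty local group.
	 */
	if (imbalance <= 1 && local_nr_running == 0 && busiest_nr_running > 1)
		imbalance = 2;	/* pull at least one task into the idle group */

	return imbalance;
}

int main(void)
{
	/* One idle core vs. one core running 3 tasks. */
	printf("%ld\n", imbalance_sketch(0, 1, 1, 0, 3));	/* prints 2 */
	return 0;
}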
