linux-kernel - Re: [RESEND RFC PATCH V3] sched: Improve scalability of select_idle

Message-ID: <930364e4-bbfe-8c8f-d095-0dd4256a5104@oracle.com>
Date:   Mon, 5 Feb 2018 14:09:11 -0800
From:   Subhra Mazumdar <subhra.mazumdar@...cle.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Steven Sistare <steven.sistare@...cle.com>,
        linux-kernel@...r.kernel.org, mingo@...hat.com,
        dhaval.giani@...cle.com
Subject: Re: [RESEND RFC PATCH V3] sched: Improve scalability of
 select_idle_sibling using SMT balance



On 02/05/2018 04:19 AM, Peter Zijlstra wrote:
> On Fri, Feb 02, 2018 at 09:37:02AM -0800, Subhra Mazumdar wrote:
>> In the scheme of SMT balance, if the idle cpu search is done _not_ in the
>> last run core, then we need a random cpu to start from. If the idle cpu
>> search is done in the last run core we can start the search from last run
>> cpu. Since we need the random index for the first case I just did it for
>> both.
> That shouldn't be too hard to fix. I think we can simply transpose the
> CPU number. That is, something like:
>
>    cpu' = core'_id + (cpu - core_id)
>
> should work for most sane cases. We don't give any guarantees this will
> in fact work, but (almost) all actual CPU enumeration schemes I've seen
> this should work for.
>
> And if it doesn't work, we're no worse off than we are now.
>
> I just couldn't readily find a place where we need to do this for cores
> with the current code. But I think we have one place between LLCs where
> it can be done:
>
> ---
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 7b6535987500..eb8b8d0a026c 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6109,7 +6109,7 @@ static int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int t
>   	if (!static_branch_likely(&sched_smt_present))
>   		return -1;
>   
> -	for_each_cpu(cpu, cpu_smt_mask(target)) {
> +	for_each_cpu_wrap(cpu, cpu_smt_mask(target), target) {
>   		if (!cpumask_test_cpu(cpu, &p->cpus_allowed))
>   			continue;
>   		if (idle_cpu(cpu))
> @@ -6357,8 +6357,17 @@ select_task_rq_fair(struct task_struct *p, int prev_cpu, int sd_flag, int wake_f
>   		if (cpu == prev_cpu)
>   			goto pick_cpu;
>   
> -		if (wake_affine(affine_sd, p, prev_cpu, sync))
> -			new_cpu = cpu;
> +		if (wake_affine(affine_sd, p, prev_cpu, sync)) {
> +			/*
> +			 * Transpose prev_cpu's offset into this cpu's
> +			 * LLC domain to retain the 'random' search offset
> +			 * for for_each_cpu_wrap().
> +			 */
> +			new_cpu = per_cpu(sd_llc_id, cpu) +
> +				  (prev_cpu - per_cpu(sd_llc_id, prev_cpu));
> +			if (unlikely(!cpus_share_cache(new_cpu, cpu)))
> +				new_cpu = cpu;
> +		}
>   	}
>   
>   	if (sd && !(sd_flag & SD_BALANCE_FORK)) {
The pseudo-random number is also used to choose a random core to compare
with; how will transposing achieve that?

Thanks,
Subhra
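
For reference, the transposition in the quoted patch, cpu' = core'_id + (cpu - core_id) applied at LLC granularity, can be sketched in userspace as below. This is only an illustration under stated assumptions, not the kernel code: the 8-CPU topology table is made up, and llc_id(), share_cache() and transpose_cpu() are hypothetical stand-ins for per_cpu(sd_llc_id, cpu), cpus_share_cache() and the new_cpu computation in select_task_rq_fair(). It covers only the search-offset transposition and does not address the question about picking a random core to compare with.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 8

/*
 * Hypothetical topology (an assumption for this sketch): three LLC
 * domains of unequal size, each identified by its first CPU.
 *   LLC 0: CPUs 0-3, LLC 4: CPUs 4-5, LLC 6: CPUs 6-7
 */
static const int llc_first_cpu[NR_CPUS] = { 0, 0, 0, 0, 4, 4, 6, 6 };

/* Stand-in for per_cpu(sd_llc_id, cpu): first CPU of cpu's LLC domain. */
static int llc_id(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return llc_first_cpu[cpu];
}

/* Stand-in for cpus_share_cache(); an out-of-range CPU shares nothing. */
static bool share_cache(int a, int b)
{
	if (a < 0 || a >= NR_CPUS || b < 0 || b >= NR_CPUS)
		return false;
	return llc_id(a) == llc_id(b);
}

/*
 * Carry prev_cpu's offset within its own LLC over to cpu's LLC.
 * If the result does not land inside cpu's LLC, fall back to cpu,
 * which is what the code did before the patch.
 */
static int transpose_cpu(int cpu, int prev_cpu)
{
	int new_cpu = llc_id(cpu) + (prev_cpu - llc_id(prev_cpu));

	if (!share_cache(new_cpu, cpu))
		new_cpu = cpu;
	return new_cpu;
}
```

With this table, transpose_cpu(0, 5) gives 1 (offset 1 in LLC 4 becomes offset 1 in LLC 0), while transpose_cpu(4, 3) falls back to 4 because offset 3 does not exist in the two-CPU LLC. That fallback is the "no worse off than now" case from the quoted mail.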