linux-kernel - Re: hackbench vs select_idle_sibling; was: [tip:sched/core] sched/fair, cpumask: Export for_each_cpu

Message-ID: <20170517124635.GB3229@codeblueprint.co.uk>
Date:   Wed, 17 May 2017 13:46:35 +0100
From:   Matt Fleming <matt@...eblueprint.co.uk>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     mingo@...nel.org, tglx@...utronix.de, riel@...hat.com,
        hpa@...or.com, efault@....de, linux-kernel@...r.kernel.org,
        torvalds@...ux-foundation.org, lvenanci@...hat.com,
        xiaolong.ye@...el.com, kitsunyan@...ox.ru, clm@...com
Subject: Re: hackbench vs select_idle_sibling; was: [tip:sched/core]
 sched/fair, cpumask: Export for_each_cpu_wrap()

On Wed, 17 May, at 12:53:50PM, Peter Zijlstra wrote:
> On Mon, May 15, 2017 at 02:03:11AM -0700, tip-bot for Peter Zijlstra wrote:
> > sched/fair, cpumask: Export for_each_cpu_wrap()
> 
> > -static int cpumask_next_wrap(int n, const struct cpumask *mask, int start, int *wrapped)
> > -{
> 
> > -	next = find_next_bit(cpumask_bits(mask), nr_cpumask_bits, n+1);
> 
> > -}
> 
> OK, so this patch fixed an actual bug in the for_each_cpu_wrap()
> implementation. The above 'n+1' should be 'n', and the effect is that
> it'll skip over CPUs, potentially resulting in an iteration that only
> sees every other CPU (for a fully contiguous mask).
> 
> This in turn causes hackbench to further suffer from the regression
> introduced by commit:
> 
>   4c77b18cf8b7 ("sched/fair: Make select_idle_cpu() more aggressive")
> 
> So it's well past time to fix this.
> 
> Where the old scheme was a cliff-edge throttle on idle scanning, this
> introduces a more gradual approach. Instead of stopping the scan
> entirely, we limit how many CPUs we scan.
> 
> Initial benchmarks show that it mostly recovers hackbench while not
> hurting anything else, except Mason's schbench, but not as bad as the
> old thing.
> 
> It also appears to recover the tbench high-end, which also suffered like
> hackbench.
> 
> I'm also hoping it will fix/preserve kitsunyan's interactivity issue.
> 
> Please test..

Tests queued up at SUSE. I'll let you know the results as soon as
they're ready -- it'll be at least a couple of days.
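
For anyone following along, the off-by-one described in the quoted text (searching from 'n+1' where 'n' was meant) can be illustrated with a small standalone sketch. This is not the kernel code: next_set_bit() is a hypothetical stand-in for find_next_bit(), count_visited() and NR_CPUS are made up for the illustration, and wrap-around is left out. It only assumes a caller that has already advanced to the next candidate index before calling the search helper.

```c
#define NR_CPUS 8   /* hypothetical mask width for the illustration */

/*
 * Hypothetical stand-in for find_next_bit(): return the index of the
 * first set bit at or after 'offset', or 'nbits' if there is none.
 */
static int next_set_bit(unsigned int mask, int nbits, int offset)
{
	for (int i = offset; i < nbits; i++)
		if (mask & (1u << i))
			return i;
	return nbits;
}

/*
 * Walk 'mask' and count how many CPUs are visited. The walker already
 * advances to the next candidate index 'n' after each hit; 'extra'
 * models the additional offset applied inside the helper:
 *   extra == 1: search from n+1 (the behaviour described as buggy)
 *   extra == 0: search from n   (the described fix)
 */
static int count_visited(unsigned int mask, int extra)
{
	int visited = 0;
	int n = 0;

	for (;;) {
		int cpu = next_set_bit(mask, NR_CPUS, n + extra);

		if (cpu >= NR_CPUS)
			break;
		visited++;
		n = cpu + 1;	/* next candidate index */
	}
	return visited;
}
```

With a fully contiguous 8-CPU mask (0xff), count_visited(0xff, 1) visits only CPUs 1, 3, 5 and 7, while count_visited(0xff, 0) visits all eight, which matches the "every other CPU" effect described above.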