Date:   Fri, 15 Sep 2017 06:06:21 +0200
From:   Mike Galbraith <efault@....de>
To:     Rik van Riel <riel@...hat.com>, Joel Fernandes <joelaf@...gle.com>
Cc:     kernel test robot <xiaolong.ye@...el.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Josef Bacik <jbacik@...com>, Juri Lelli <Juri.Lelli@....com>,
        Brendan Jackman <brendan.jackman@....com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Matt Fleming <matt@...eblueprint.co.uk>,
        Ingo Molnar <mingo@...hat.com>, lkp@...org
Subject: Re: [lkp-robot] [sched/fair] 6d46bd3d97: netperf.Throughput_tps
 -11.3% regression

On Thu, 2017-09-14 at 11:56 -0400, Rik van Riel wrote:
> 
> On systems with SMT, it may make more sense for
> sync wakeups to look for idle threads of the same
> core, than to have the woken task end up on the 
> same thread, and wait for the current task to stop
> running.

Depends.

homer:/root # taskset -c 3 pipe-test
1.412185 usecs/loop -- avg 1.412185 1416.2 KHz
homer:/root # taskset -c 2,3 pipe-test
2.298820 usecs/loop -- avg 2.298820 870.0 KHz
homer:/root # taskset -c 3,7 pipe-test
1.899164 usecs/loop -- avg 1.899164 1053.1 KHz

For pipe-test, having ~zero overlap as well as ~zero footprint, that's
a good choice, but..

homer:/root # taskset -c 3 tbench.sh 1 10 2>&1|grep Throughput
Throughput 844.04 MB/sec  1 clients  1 procs  max_latency=0.042 ms
homer:/root # taskset -c 2,3 tbench.sh 1 10 2>&1|grep Throughput
Throughput 713.25 MB/sec  1 clients  1 procs  max_latency=0.324 ms
homer:/root # taskset -c 3,7 tbench.sh 1 10 2>&1|grep Throughput
Throughput 512.866 MB/sec  1 clients  1 procs  max_latency=0.454 ms

..for tbench, where my crusty ole Q6600 turns in a win by scheduling
the pair on separate L2-sharing cores, on the more modern SMT-equipped
i4790, targeting shared L2 is the worst choice.
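
To see which logical CPUs share a core on a given box, the kernel's sysfs
topology files can be read directly. The sketch below is illustrative only:
the helper names are made up for this example, and the post does not state
outright that CPUs 3 and 7 are SMT siblings on the i4790 machine (that is
inferred from "targeting shared L2" for the `-c 3,7` run).

```python
from pathlib import Path


def parse_cpulist(text):
    """Parse a kernel cpulist string such as "3,7" or "0-3,8" into a set of ints.

    Only plain numbers and "lo-hi" ranges are handled, which is what
    thread_siblings_list normally contains.
    """
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def smt_siblings(cpu, sysfs_root="/sys/devices/system/cpu"):
    """Return the set of CPUs sharing a core with `cpu`, or None if unavailable."""
    path = Path(sysfs_root) / f"cpu{cpu}" / "topology" / "thread_siblings_list"
    try:
        return parse_cpulist(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    # CPU numbers taken from the taskset invocations in the post.
    for cpu in (2, 3, 7):
        print(cpu, smt_siblings(cpu))
```

If CPU 3 reports a sibling set containing 7, then `taskset -c 3,7` confines
the pair to one core's two hardware threads, while `taskset -c 2,3` spreads
it across two cores.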

Bigger issue is that while microbenchmark behavior is consistent,
applications tend to process data and react to it (vs merely batting it
about like playful kittens, cute, but not all that productive), likely
mucking up any heuristic anyone invents with depressing regularity.
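
For readers who want to reproduce the flavor of the microbenchmark, here is a
minimal sketch of a pipe ping-pong loop. It assumes pipe-test is a two-task
benchmark bouncing a byte over a pair of pipes, which the post does not spell
out; the function name and loop count are made up for this example. Python's
interpreter overhead makes the absolute numbers much larger than the ones
quoted above, so only relative differences between CPU sets are meaningful.

```python
import os
import time


def pipe_pingpong(loops=20000):
    """Bounce one byte between parent and child; return usecs per round trip."""
    p2c_r, p2c_w = os.pipe()  # parent -> child
    c2p_r, c2p_w = os.pipe()  # child -> parent

    pid = os.fork()
    if pid == 0:
        # Child: echo each byte back, then exit without running parent code.
        os.close(p2c_w)
        os.close(c2p_r)
        try:
            for _ in range(loops):
                os.read(p2c_r, 1)
                os.write(c2p_w, b"x")
        finally:
            os._exit(0)

    # Parent: send a byte, wait for the echo, repeat.
    os.close(p2c_r)
    os.close(c2p_w)
    start = time.perf_counter()
    for _ in range(loops):
        os.write(p2c_w, b"x")
        os.read(c2p_r, 1)
    elapsed = time.perf_counter() - start

    os.close(p2c_w)
    os.close(c2p_r)
    os.waitpid(pid, 0)
    return elapsed / loops * 1e6


if __name__ == "__main__":
    print(f"{pipe_pingpong():.3f} usecs/loop")
```

Saved under a hypothetical name such as pingpong.py, it can be pinned the same
way as in the transcripts, e.g. `taskset -c 3 python3 pingpong.py` versus
`taskset -c 3,7 python3 pingpong.py`, since the child inherits the affinity
mask set by taskset.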

	-Mike
