Date:	Mon, 4 Apr 2016 10:23:02 +0200
From:	Jiri Olsa <jolsa@...hat.com>
To:	Ingo Molnar <mingo@...nel.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc:	James Hartsock <hartsjc@...hat.com>,
	Rik van Riel <riel@...hat.com>,
	Srivatsa Vaddagiri <vatsa@...ux.vnet.ibm.com>,
	Kirill Tkhai <ktkhai@...allels.com>,
	linux-kernel@...r.kernel.org
Subject: [RFC] sched: unused cpu in affine workload

hi,
we've noticed the following issue in one of our workloads.

I have a 24-CPU server with the following sched domains:
  domain 0: (pairs)
  domain 1: 0-5,12-17 (group1)  6-11,18-23 (group2)
  domain 2: 0-23 level NUMA

I run a CPU-hogging workload on the following CPUs:
  4,6,14,18,19,20,23

that is:
  4,14          CPUs from group1
  6,18,19,20,23 CPUs from group2
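
The split above can be reproduced with a small sketch that maps the
workload CPUs onto the domain 1 groups. The helper names parse_cpu_list
and split_by_group are hypothetical, made up for this illustration only:

```python
def parse_cpu_list(spec):
    """Expand a CPU list such as '0-5,12-17' into a sorted list of CPU ids."""
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def split_by_group(workload, groups):
    """Return, per group name, the workload CPUs that fall into that group."""
    workload_cpus = parse_cpu_list(workload)
    result = {}
    for name, spec in groups.items():
        members = set(parse_cpu_list(spec))
        result[name] = [cpu for cpu in workload_cpus if cpu in members]
    return result


# Domain 1 groups and workload CPUs as listed in this mail.
DOMAIN1 = {"group1": "0-5,12-17", "group2": "6-11,18-23"}
WORKLOAD = "4,6,14,18,19,20,23"

if __name__ == "__main__":
    for name, cpus in split_by_group(WORKLOAD, DOMAIN1).items():
        print(name, cpus)
```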

the workload process gets its affinity set via 'taskset -c ${CPUs} workload ...'
and forks a child for every CPU

very often we see CPUs 4 and 14 running 3 processes of the workload,
while CPUs 6,18,19,20,23 run just 4 processes, leaving one of the
CPUs from group2 idle

AFAICS from the code, the reason is that load balancing follows the
sched domain setup (topology) and does not take affinity setups like
this into account. The code in find_busiest_group, running on the idle
CPU from group2, will find group1 as the busiest group, but its average
load will be smaller than that of the local group, so no task is pulled.
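
A rough model of that comparison, as a sketch only and not the kernel
code: it assumes every task carries the same load, every CPU has the
same capacity, and a group's load is averaged over all 12 CPUs of the
group rather than only the CPUs allowed by the affinity mask. The names
avg_load and would_pull are hypothetical:

```python
def avg_load(nr_running, group_cpus):
    """Group load averaged over every CPU in the group (assumed unit load per task)."""
    return nr_running / group_cpus


def would_pull(local_avg, busiest_avg):
    """Simplified check: pull tasks only if the remote group is busier than the local one."""
    return busiest_avg > local_avg


# group1: 3 tasks squeezed onto CPUs 4 and 14, but the group spans 12 CPUs.
group1 = avg_load(3, 12)
# group2: 4 tasks on five allowed CPUs (one of them idle), group also spans 12 CPUs.
group2 = avg_load(4, 12)

if __name__ == "__main__":
    print("group1 avg load:", group1)
    print("group2 avg load:", group2)
    # Seen from the idle CPU in group2, group2 is the local group.
    print("idle CPU in group2 pulls from group1:", would_pull(group2, group1))
```

In this model group1 looks less loaded than group2 even though its two
usable CPUs are oversubscribed, which is why the idle CPU does not pull.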

It's obvious that the load balancer follows the sched domain topology.
However, is there some sched feature I'm missing that could help
with this? Or do we need to follow the sched domain topology when
we select CPUs for the workload to get even balancing?

thanks,
jirka
