lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1280749178.1923.114.camel@laptop>
Date:	Mon, 02 Aug 2010 13:39:38 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Nikhil Rao <ncrao@...gle.com>
Cc:	Ingo Molnar <mingo@...e.hu>, Mike Galbraith <efault@....de>,
	linux-kernel@...r.kernel.org,
	Venkatesh Pallipadi <venki@...gle.com>,
	Ken Chen <kenchen@...gle.com>, Paul Turner <pjt@...gle.com>
Subject: Re: [PATCH 0/6] [RFC] Large weight differential leads to
 inefficient load balancing

On Thu, 2010-07-29 at 22:19 -0700, Nikhil Rao wrote:
> If folks think this is a good direction and this idea has some merit, then I
> will be more than happy to continue working with the community to improve this
> patchset. If there is another way to fix this problem, then please do tell; I
> would be more than happy to help out.


No its terrible. I've already explained how to solve this on several
occasions (although my google skillz seem to fail me to find the latest
occasion a few months ago).

The thing is that from a fairness point of view 1 nice-0 (weight=1024)
on one CPU and 512 SCHED_IDLE (weight=2) tasks on another CPU and all
other CPUs idle is correct.

It just doesn't seem to be the thing that most people expect.

Special casing things like you've done is utterly the wrong thing to do.

This problem comes in two forms and its name is infeasible weight
distribution. The load-balancer tries to ensure W_k ~= W_l, k,l elem_of
CPUs, where W_k = \Sum_i w_i^k, where w_i^k is the i-th task on CPU k.

The two cases are statically infeasible, and dynamically infeasible. 

We say the task-set is statically infeasible if for a task set of n
tasks there is no way to statically distribute them on N <= n CPUs such
that each task gets equal service (assuming the scheduling on each CPU
is fair).

We say the task-set is dynamically infeasible if for the given scenario
there is no way to rotate the tasks to obtain equality.

Lets assume 2 CPUs.

Ex.1: 2 tasks of different weight.

Ex.2: 3 tasks of equal weight.

The first example is both statically and dynamically infeasible as there
is no way to occupy both CPUs such that each task gets the proportional
correct service.

The second example is statically infeasible, but dynamically feasible,
for if we rotate one task, such that we alternate between 2:1 and 1:2 in
equal measures, each task will receive its correct 2/3rd CPU service.

The current load-balancer isn't particularly skilled at either issue.

The proper solution is to 'fix' find_busiest_group() so that it will:
 - pick the heaviest cpu with more than 1 task on it
 - slowly over-balance things

The first thing will solve your issue.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ