Message-ID: <CAKfTPtBRkWY1hTBqd9cQKetFbWEVvJVo-wKUz9MN-ZfZLq4TuA@mail.gmail.com>
Date:	Tue, 3 Jun 2014 14:31:38 +0200
From:	Vincent Guittot <vincent.guittot@...aro.org>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Ingo Molnar <mingo@...nel.org>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	Russell King - ARM Linux <linux@....linux.org.uk>,
	LAK <linux-arm-kernel@...ts.infradead.org>,
	Preeti U Murthy <preeti@...ux.vnet.ibm.com>,
	Morten Rasmussen <Morten.Rasmussen@....com>,
	Mike Galbraith <efault@....de>,
	Nicolas Pitre <nicolas.pitre@...aro.org>,
	"linaro-kernel@...ts.linaro.org" <linaro-kernel@...ts.linaro.org>,
	Daniel Lezcano <daniel.lezcano@...aro.org>
Subject: Re: [PATCH v2 10/11] sched: move cfs task on a CPU with higher capacity

On 3 June 2014 13:15, Peter Zijlstra <peterz@...radead.org> wrote:
> On Mon, Jun 02, 2014 at 07:06:44PM +0200, Vincent Guittot wrote:
>> > Could you detail those conditions? FWIW those make excellent Changelog
>> > material.
>>
>> I have looked back into my tests and traces:
>>
>> In a 1st test, the capacity of the CPU was still above half the
>> default value (power=538), unlike what I remembered. So it's somewhat
>> "normal" to keep the task on CPU0, which also handles IRQs, because
>> sg_capacity still returns 1.
>
> OK, so I suspect that once we move to utilization based capacity stuff
> we'll do the migration IF the task indeed requires more CPU than can be
> provided by the reduced one, right?

The current version of the patchset only checks whether the capacity of
a CPU has been significantly reduced, in which case we should look for
another CPU. But we could effectively also compare the remaining
capacity with the task load.
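
As a rough sketch of that idea (untested, just to illustrate; power_of()
and task_h_load() are the existing helpers in kernel/sched/fair.c, and
power_of() already reflects the rt/irq pressure on the CPU):

/*
 * Untested sketch: the task still fits on the CPU if its load, with
 * the imbalance_pct margin applied, stays below the capacity that is
 * left after rt/irq pressure (power_of()).
 */
static int task_fits_remaining_power(struct task_struct *p, int cpu,
				     struct sched_domain *sd)
{
	return (task_h_load(p) * sd->imbalance_pct) <
	       (power_of(cpu) * 100);
}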

>
>> In a 2nd test, the main task runs (most of the time) on CPU0 whereas
>> the max power of the latter is only 623 and the cpu_power goes below
>> 512 (power=330) during the use case. So the sg_capacity of CPU0 is
>> null but the main task still stays on CPU0.
>> The use case (scp transfer) is made of a long running task (ssh) and a
>> periodic short task (scp). ssh runs on CPU0 and scp runs every 6ms on
>> CPU1. The newly idle load balance on CPU1 doesn't pull the long
>> running task although sg_capacity is null, because
>> sd->nr_balance_failed is never incremented and load_balance doesn't
>> trigger an active load_balance. When an idle balance occurs in the
>> middle of the newly idle balance, the ssh long task migrates to CPU1,
>> but as soon as it sleeps and wakes up, it goes back to CPU0 because
>> wake affine migrates it back (issue solved by patch 09).
>
> OK, so there's two problems here, right?
>  1) we don't migrate away from cpu0
>  2) if we do, we get pulled back.
>
> And patch 9 solves 2, so maybe enhance its changelog to mention this
> slightly more explicitly.
>
> Which leaves us with 1.. interesting problem. I'm just not sure
> endlessly kicking a low capacity cpu is the right fix for that.

What prevents us from migrating the task directly is that
nr_balance_failed is not incremented for newly idle balancing, and it's
the only condition for active migration (except for the asym packing
feature).

We could add an additional test in need_active_balance for the
newly_idle load balance. Something like:

if ((sd->flags & SD_SHARE_PKG_RESOURCES)
    && (env->src_rq->cpu_power_orig * 100) > (env->src_rq->cpu_power *
					      env->sd->imbalance_pct))
	return 1;
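
i.e. trigger an active balance when the current cpu_power of the source
CPU has dropped below its original one by more than the imbalance_pct
margin, but only within a domain that shares package resources.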

