Message-ID: <20170705112232.tkeqnwh2urlc6nbx@hirez.programming.kicks-ass.net>
Date: Wed, 5 Jul 2017 13:22:32 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Jeffrey Hugo <jhugo@...eaurora.org>
Cc: Ingo Molnar <mingo@...hat.com>, linux-kernel@...r.kernel.org,
Dietmar Eggemann <dietmar.eggemann@....com>,
Austin Christ <austinwc@...eaurora.org>,
Tyler Baicar <tbaicar@...eaurora.org>,
Timur Tabi <timur@...eaurora.org>
Subject: Re: [PATCH V5 1/2] sched/fair: Fix load_balance() affinity redo path

On Wed, Jun 07, 2017 at 01:18:57PM -0600, Jeffrey Hugo wrote:
> If load_balance() fails to migrate any tasks because all tasks were
> affined, load_balance() removes the source cpu from consideration and
> attempts to redo and balance among the new subset of cpus.
>
> There is a bug in this code path where the algorithm considers all active
> cpus in the system (minus the source that was just masked out). This is
> not valid for two reasons: some active cpus may not be in the current
> scheduling domain and one of the active cpus is dst_cpu. These cpus should
> not be considered, as we cannot pull load from them.
>
> Instead of failing out of load_balance(), we may end up redoing the search
> with no valid cpus and incorrectly concluding the domain is balanced.
> Additionally, if the group_imbalance flag was just set, it may also be
> incorrectly unset, thus the flag will not be seen by other cpus in future
> load_balance() runs as that algorithm intends.
>
> Fix the check by removing cpus not in the current domain and the dst_cpu
> from consideration, thus limiting the evaluation to valid remaining cpus
> from which load might be migrated.
>
> Co-authored-by: Austin Christ <austinwc@...eaurora.org>
> Co-authored-by: Dietmar Eggemann <dietmar.eggemann@....com>
> Signed-off-by: Jeffrey Hugo <jhugo@...eaurora.org>
> Tested-by: Tyler Baicar <tbaicar@...eaurora.org>
Yes, this looks good. Thanks!
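
For readers following the thread, the candidate-mask reasoning in the quoted changelog can be sketched in user-space C. This is only an illustrative model, not the kernel patch: cpu_mask_t, cpu_bit() and both redo_worthwhile_*() helpers are hypothetical names invented here, and a 64-bit integer stands in for the kernel's cpumask type. The first helper models the check the changelog describes as buggy (all active cpus minus the source); the second models the described fix (also drop cpus outside the domain and dst_cpu).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: one bit per cpu, up to 64 cpus. */
typedef uint64_t cpu_mask_t;

static cpu_mask_t cpu_bit(unsigned int cpu)
{
	assert(cpu < 64);
	return (cpu_mask_t)1 << cpu;
}

/*
 * Check as described for the buggy path: every active cpu except the
 * source that was just masked out counts as a candidate for the redo.
 */
static bool redo_worthwhile_all_active(cpu_mask_t active, unsigned int src_cpu)
{
	cpu_mask_t candidates = active & ~cpu_bit(src_cpu);

	return candidates != 0;
}

/*
 * Check as described for the fix: only active cpus inside the current
 * domain count, and neither the source nor dst_cpu is a candidate,
 * because load cannot be pulled from them.
 */
static bool redo_worthwhile_in_domain(cpu_mask_t active, cpu_mask_t domain_span,
				      unsigned int src_cpu, unsigned int dst_cpu)
{
	cpu_mask_t candidates = active & domain_span;

	candidates &= ~cpu_bit(src_cpu);
	candidates &= ~cpu_bit(dst_cpu);

	return candidates != 0;
}
```

With cpus 0-3 active, a domain spanning cpus 0-1, src cpu 1 and dst_cpu 0, the first helper still reports candidates (cpus 0, 2 and 3), while the second reports none, which corresponds to failing out of the redo instead of searching with no valid cpus.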