Message-ID: <5236BBF0.8030505@parallels.com>
Date: Mon, 16 Sep 2013 12:06:08 +0400
From: Vladimir Davydov <vdavydov@...allels.com>
To: Peter Zijlstra <peterz@...radead.org>
CC: Ingo Molnar <mingo@...nel.org>, Paul Turner <pjt@...gle.com>,
<linux-kernel@...r.kernel.org>, <devel@...nvz.org>
Subject: Re: [PATCH 1/2] sched: calculate_imbalance: Fix local->avg_load >
sds->avg_load case
On 09/16/2013 09:52 AM, Peter Zijlstra wrote:
> On Sun, Sep 15, 2013 at 05:49:13PM +0400, Vladimir Davydov wrote:
>> In busiest->group_imb case we can come to calculate_imbalance() with
>> local->avg_load >= busiest->avg_load >= sds->avg_load. This can result
>> in imbalance overflow, because it is calculated as follows
>>
>> 	env->imbalance = min(
>> 		max_pull * busiest->group_power,
>> 		(sds->avg_load - local->avg_load) * local->group_power
>> 	) / SCHED_POWER_SCALE;
>>
>> As a result we can end up constantly bouncing tasks from one cpu to
>> another if there are pinned tasks.
>>
>> Fix this by skipping the assignment and assuming imbalance=0 in case
>> local->avg_load > sds->avg_load.
>> --
>> The bug can be caught by running 2*N cpuhogs pinned to two logical cpus
>> belonging to different cores on an HT-enabled machine with N logical
>> cpus: just look at se.nr_migrations growth.
>>
>> Signed-off-by: Vladimir Davydov<vdavydov@...allels.com>
>> ---
>> kernel/sched/fair.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 9b3fe1c..507a8a9 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -4896,7 +4896,8 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s
>>  	 * max load less than avg load(as we skip the groups at or below
>>  	 * its cpu_power, while calculating max_load..)
>>  	 */
>> -	if (busiest->avg_load < sds->avg_load) {
>> +	if (busiest->avg_load <= sds->avg_load ||
>> +	    local->avg_load >= sds->avg_load) {
>>  		env->imbalance = 0;
>>  		return fix_small_imbalance(env, sds);
>>  	}
> Why the = part? Surely 'busiest->avg_load < sds->avg_load ||
> local->avg_load > sds->avg_load' avoids both underflows?
Of course it does, but in the '=' case env->imbalance would be computed
as 0 anyway, so why not take the shortcut?
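
To make the arithmetic concrete, here is a minimal user-space sketch of the
quoted formula. It is not the kernel code: the helper names
(imbalance_unguarded, imbalance_guarded, min_ul) are made up for
illustration, SCHED_POWER_SCALE is assumed to be 1024, and max_pull is
passed in as a plain parameter instead of being derived as in fair.c. All
values are unsigned long, so sds->avg_load - local->avg_load wraps around
when local->avg_load is the larger one.

```c
/* Assumed value for illustration; not taken from the patch itself. */
#define SCHED_POWER_SCALE 1024UL

/* Hypothetical helper: unsigned minimum, standing in for the kernel's min(). */
static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/*
 * Simplified model of the quoted formula with no guard in front of it.
 * If local_avg > sds_avg, the subtraction wraps to a huge value, so min()
 * picks the max_pull term and a nonzero imbalance is reported.
 */
static unsigned long imbalance_unguarded(unsigned long max_pull,
					 unsigned long busiest_power,
					 unsigned long sds_avg,
					 unsigned long local_avg,
					 unsigned long local_power)
{
	return min_ul(max_pull * busiest_power,
		      (sds_avg - local_avg) * local_power) / SCHED_POWER_SCALE;
}

/*
 * Same formula behind the check proposed in the patch: return 0 early when
 * busiest is at or below the domain average, or local is at or above it.
 */
static unsigned long imbalance_guarded(unsigned long max_pull,
				       unsigned long busiest_avg,
				       unsigned long busiest_power,
				       unsigned long sds_avg,
				       unsigned long local_avg,
				       unsigned long local_power)
{
	if (busiest_avg <= sds_avg || local_avg >= sds_avg)
		return 0;

	return imbalance_unguarded(max_pull, busiest_power, sds_avg,
				   local_avg, local_power);
}
```

With sds_avg = 1000, local_avg = 1200, max_pull = 100 and both powers set
to 1024, the unguarded version reports an imbalance of 100 even though the
local group is already above average, while the guarded version reports 0.
With local_avg equal to sds_avg the second term is exactly 0, so both
versions agree on 0, which is the shortcut in the '=' case.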