Message-ID: <BANLkTin4aQ7Gc+h0YCPo=Eokc2iz06WNbw@mail.gmail.com>
Date:	Fri, 8 Apr 2011 12:29:33 -0700
From:	Ken Chen <kenchen@...gle.com>
To:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc:	mingo@...e.hu, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] sched: fix sched-domain avg_load calculation.

On Fri, Apr 8, 2011 at 4:15 AM, Peter Zijlstra <a.p.zijlstra@...llo.nl> wrote:
> On Thu, 2011-04-07 at 17:23 -0700, Ken Chen wrote:
>> In function find_busiest_group(), the sched-domain avg_load isn't
>> calculated at all if there is a group imbalance within the domain.
>> This will cause erroneous imbalance calculation.  The reason is
>> that calculate_imbalance() sees sds->avg_load = 0 and it will dump
>> entire sds->max_load into imbalance variable, which is used later
>> on to migrate entire load from busiest CPU to the puller CPU. It
>> has two really bad effects:
>>
>> 1. stampede of task migration, and they won't be able to break out
>>    of the bad state because of positive feedback loop: large load
>>    delta -> heavier load migration -> larger imbalance and the cycle
>>    goes on.
>>
>> 2. severe imbalance in CPU queue depth.  This causes really long
>>    scheduling latency blips, which badly affect applications that
>>    have tight latency requirements.
>>
>> The fix is to have kernel calculate domain avg_load in both cases.
>> This will ensure that imbalance calculation is always sensible and
>> the target is usually half way between busiest and puller CPU.
>
> Indeed so, it looks like I broke that in 866ab43efd32. Out of curiosity,
> what kind of workload did you observe this on?

This was observed on an application that serves web search queries.
CPU queue depths were uneven across the system, which led to a long
query latency tail.  The tail latencies were both frequent and
stretched out in time.

With this fix, both server throughput and response latency improved.

- Ken
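
For readers following the thread, the effect described in the patch
changelog can be illustrated with a toy model.  This is only a sketch
and not the kernel's calculate_imbalance(): the names toy_avg_load()
and toy_imbalance() are hypothetical, cpu_power scaling is left out,
and the two groups are assumed to have equal power.  It also assumes
the imbalance is the smaller of "load above average on the busiest
side" and "room below average on the pulling side", computed in
unsigned arithmetic.

```c
/* Toy model only; names and simplifications are assumptions. */

/* Smaller of two unsigned long values. */
unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/* Hypothetical helper: average load of two equal-power groups. */
unsigned long toy_avg_load(unsigned long max_load, unsigned long this_load)
{
	return (max_load + this_load) / 2;
}

/*
 * Amount of load to move from the busiest side to the puller.
 * If avg_load was never computed (left at 0), max_pull becomes the
 * whole max_load, and the unsigned subtraction for room wraps around
 * to a huge value, so the min() no longer limits anything.
 */
unsigned long toy_imbalance(unsigned long max_load,
			    unsigned long this_load,
			    unsigned long avg_load)
{
	unsigned long max_pull = max_load - avg_load;
	unsigned long room = avg_load - this_load; /* wraps if avg_load < this_load */

	return min_ul(max_pull, room);
}
```

With max_load = 2048 and this_load = 512, passing avg_load = 0 makes
toy_imbalance() return the full 2048, while passing
toy_avg_load(2048, 512) = 1280 returns 768, which leaves both sides at
1280, halfway between the busiest and the puller.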