Message-ID: <50FE0575.6090005@intel.com>
Date: Tue, 22 Jan 2013 11:20:21 +0800
From: Alex Shi <alex.shi@...el.com>
To: Paul Turner <pjt@...gle.com>, Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Linus Torvalds <torvalds@...ux-foundation.org>
CC: Thomas Gleixner <tglx@...utronix.de>,
Andrew Morton <akpm@...ux-foundation.org>,
Arjan van de Ven <arjan@...ux.intel.com>,
Borislav Petkov <bp@...en8.de>, namhyung@...nel.org,
Mike Galbraith <efault@....de>,
Vincent Guittot <vincent.guittot@...aro.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
preeti@...ux.vnet.ibm.com,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v3 09/22] sched: compute runnable load avg in cpu_load
and cpu_avg_load_per_task
>>>>
>>>> I just looked into the aim9 benchmark. In this case it forks 2000 tasks;
>>>> once all the tasks are ready, aim9 gives a signal and all of them wake up
>>>> in a burst and run until all are finished.
>>>> Since each task finishes very quickly, a cpu left empty by the imbalance
>>>> may go to sleep until regular balancing gives it some new tasks. That
>>>> causes the performance drop and more idle time.
>>>
>>> Sounds like for AIM (and possibly for other really bursty loads), we
>>> might want to do some load-balancing at wakeup time by *just* looking
>>> at the number of running tasks, rather than at the load average. Hmm?
>>>
>>> The load average is fundamentally always going to run behind a bit,
>>> and while you want to use it for long-term balancing, in the short term
>>> you might want to do just an "if we have a huge amount of runnable
>>> processes, do a load balancing *now*". Where "huge amount" should
>>> probably be relative to the long-term load balancing (ie comparing the
>>> number of runnable processes on this CPU right *now* with the load
>>> average over the last second or so would show a clear spike, and a
>>> reason for quick action).
>>>
>>
>> Sorry for the late response!
>>
>> I just wrote a patch following your suggestion, but it shows no clear
>> improvement for this case. I also tried changing the burst-checking
>> interval, again with no clear help.
>>
>> If I totally give up runnable load in periodic balancing, the performance
>> recovers 60% of the loss.
>>
>> I will try to optimize wake-up balancing over the weekend.
>>
>
> (btw, the runnable avg needs 345 ms to accumulate to ~100%, and 32 ms to
> reach 50%)
>
> I have tried some tuning in both wake-up and regular balancing. Yes, when
> using the instant load weight (without the runnable avg engaged), both at
> wake-up and in regular balancing, the performance recovered.
>
> But with per-cpu nr_running tracking, it's hard to find an elegant way to
> detect the burst, whether at wake-up or in regular balancing.
> At wake-up, all cpus in the sd_llc domain are candidates, so checking
> just this_cpu is not enough.
> In regular balancing, this_cpu is the migration destination cpu, so
> checking for a burst on that cpu alone is not useful. Instead, we need to
> check the increase in task count across the whole domain.
>
> So, I guess there are 2 solutions for this issue.
> 1, For quick wake-up, we use the instant load (same as the current
> balancing) to balance; for regular balancing, we record both instant
> load and runnable load data for the whole domain, then decide which one
> to use according to the increase in task count in the domain once
> tracking of the whole domain is done.
>
> 2, We can keep the current instant load balancing as the performance
> balance policy, and use runnable load balancing in a power-friendly
> policy, since none of us found a performance benefit from runnable load
> balancing on the hackbench/kbuild/aim9/tbench/specjbb etc. benchmarks.
> I prefer the 2nd.
3, On the other hand, the aim9 test scenario is rare in real life
(preparing thousands of tasks and then waking them all up at the same
time), and the runnable load avg includes useful running-history info.
A 5~7% performance drop on aim9 alone is not unacceptable
(kbuild/hackbench/tbench/specjbb show no clear performance change),
so we could accept the drop and leave a reminder in the code. Any comments?