Message-ID: <516651C8.307@linux.vnet.ibm.com>
Date: Thu, 11 Apr 2013 14:01:44 +0800
From: Michael Wang <wangyun@...ux.vnet.ibm.com>
To: Peter Zijlstra <peterz@...radead.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>
CC: LKML <linux-kernel@...r.kernel.org>,
Ingo Molnar <mingo@...nel.org>, Mike Galbraith <efault@....de>,
Alex Shi <alex.shi@...el.com>,
Namhyung Kim <namhyung@...nel.org>,
Paul Turner <pjt@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
"Nikunj A. Dadhania" <nikunj@...ux.vnet.ibm.com>,
Ram Pai <linuxram@...ibm.com>
Subject: Re: [PATCH] sched: wake-affine throttle
On 04/10/2013 05:22 PM, Michael Wang wrote:
> Hi, Peter
>
> Thanks for your reply :)
>
> On 04/10/2013 04:51 PM, Peter Zijlstra wrote:
>> On Wed, 2013-04-10 at 11:30 +0800, Michael Wang wrote:
>>> | 15 GB | 32 | 35918 | | 37632 | +4.77% | 47923 | +33.42% | 52241 | +45.45%
>>
>> So I don't get this... is wake_affine() once every millisecond _that_
>> expensive?
>>
>> Seeing we get a 45%!! improvement out of once every 100ms that would
>> mean we're like spending 1/3rd of our time in wake_affine()? that's
>> preposterous. So what's happening?
>
> Not all of the regression was caused by overhead; adopting curr_cpu
> instead of prev_cpu for select_idle_sibling() is a more important reason
> for the pgbench regression.
>
> In other words, for pgbench we waste time in wake_affine() and make the
> wrong decision most of the time. The previous patch showed that
> wake_affine() does pull unrelated tasks together; that's good if the
> current cpu still holds cache-hot data for the wakee, but that's not the
> case for a workload like pgbench.

Please let me know if I failed to express my thoughts clearly.

I know it's hard to figure out why the throttle could bring so much
benefit, since the wake-affine stuff is a black box with too many
unmeasurable factors, but that is actually why we finally settled on this
throttle idea rather than an approach like wakeup-buddy, although both of
them help to stop the regression.

Fortunately, there is a benchmark that helps expose the regression, and
now we have a simple and efficient approach ready for action ;-)

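
To make the throttle idea concrete, here is a minimal user-space sketch of
a time-based throttle. It is not the code from the patch: the struct, the
field names and both helper functions are hypothetical, and time is a
simulated millisecond counter rather than anything the kernel provides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical throttle state for illustration; not a kernel structure. */
struct affine_throttle {
	uint64_t next_allowed_ms;	/* earliest time the next attempt may run */
	uint64_t interval_ms;		/* minimum gap between two attempts */
};

/*
 * Return true if an affine attempt may run at now_ms. When it may,
 * re-arm the throttle so further attempts are refused until
 * interval_ms has passed.
 */
bool affine_attempt_allowed(struct affine_throttle *t, uint64_t now_ms)
{
	assert(t != NULL);

	if (now_ms < t->next_allowed_ms)
		return false;

	t->next_allowed_ms = now_ms + t->interval_ms;
	return true;
}

/*
 * Simulate `wakeups` wakeups spaced step_ms apart, starting at time 0,
 * and count how many of them get past the throttle.
 */
unsigned int count_allowed(uint64_t interval_ms, unsigned int wakeups,
			   uint64_t step_ms)
{
	struct affine_throttle t = {
		.next_allowed_ms = 0,
		.interval_ms = interval_ms,
	};
	unsigned int allowed = 0;

	for (unsigned int i = 0; i < wakeups; i++)
		if (affine_attempt_allowed(&t, (uint64_t)i * step_ms))
			allowed++;

	return allowed;
}
```

With 1000 simulated wakeups spaced 1 ms apart, count_allowed(1, 1000, 1)
lets all 1000 attempts through, while count_allowed(100, 1000, 1) lets only
10 through. The sketch only shows how a longer interval cuts the number of
attempts; it says nothing about whether each attempt makes a good decision.
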
Regards,
Michael Wang
>
> The workload simply doesn't benefit from the decisions changed by
> wake-affine; the more active wake-affine is, the more it suffers. That's
> why 100ms shows better results than 1ms, but at some rate the benefit
> and loss of wake-affine will balance out.
>
> Regards,
> Michael Wang
>
>>
>>
>>
>