Message-ID: <CALZhoSTNg+QWY4VMH97oF8ReJx_nVY7Ux7iEM=66XvNmdbdH6w@mail.gmail.com>
Date: Mon, 17 Jun 2013 20:26:55 +0800
From: Lei Wen <adrian.wenl@...il.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Alex Shi <alex.shi@...el.com>, mingo@...hat.com,
tglx@...utronix.de, akpm@...ux-foundation.org, bp@...en8.de,
pjt@...gle.com, namhyung@...nel.org, efault@....de,
morten.rasmussen@....com, vincent.guittot@...aro.org,
preeti@...ux.vnet.ibm.com, viresh.kumar@...aro.org,
linux-kernel@...r.kernel.org, mgorman@...e.de, riel@...hat.com,
wangyun@...ux.vnet.ibm.com, Jason Low <jason.low2@...com>,
Changlong Xie <changlongx.xie@...el.com>, sgruszka@...hat.com,
fweisbec@...il.com
Subject: Re: [patch v8 3/9] sched: set initial value of runnable avg for new
forked task
Hi Peter,
On Mon, Jun 17, 2013 at 5:20 PM, Peter Zijlstra <peterz@...radead.org> wrote:
> On Fri, Jun 14, 2013 at 06:02:45PM +0800, Lei Wen wrote:
>> Hi Alex,
>>
>> On Fri, Jun 7, 2013 at 3:20 PM, Alex Shi <alex.shi@...el.com> wrote:
>> > We need to initialize se.avg.{decay_count, load_avg_contrib} for a
>> > newly forked task.
>> > Otherwise random values in these variables cause a mess when the new
>> > task is enqueued:
>> >     enqueue_task_fair
>> >         enqueue_entity
>> >             enqueue_entity_load_avg
>> >
>> > and make fork balancing imbalanced because load_avg_contrib is incorrect.
>> >
>> > Furthermore, Morten Rasmussen noticed some tasks were not launched at
>> > once after created. So Paul and Peter suggested giving the new task's
>> > runnable avg time a start value equal to sched_slice().
>>
>> I am confused by this comment: how would setting the runnable avg to the
>> slice change the behavior that "some tasks were not launched at once
>> after created"?
>>
>> IMHO, the newly forked task can only run if the current task has already
>> been marked need_resched and preempt_schedule or preempt_schedule_irq
>> is called.
>>
>> Since setting the avg to the slice does not affect this task's vruntime,
>> it cannot make the currently running task need_resched if it
>> could not before.
>
>
> So the 'problem' is that our running avg is a 'floating' average; ie. it
> decays with time. Now we have to guess about the future of our newly
> spawned task -- something that is nigh impossible seeing these CPU
> vendors keep refusing to implement the crystal ball instruction.
I am curious about this "crystal ball instruction". :)
Could it be real? I mean, what kind of hw mechanism could achieve such
magic? As far as I can see, silicon vendors could provide more
monitoring units, but I don't think hw has the power to precisely
predict sw behavior...
>
> So there's two asymptotic cases we want to deal well with; 1) the case
> where the newly spawned program will be 'nearly' idle for its lifetime;
> and 2) the case where it's cpu-bound.
>
> Since we have to guess, we'll go for the worst case and assume it's
> cpu-bound; now we don't want to make the avg so heavy that adjusting to
> the near-idle case takes forever. We want to be able to quickly adjust and
> lower our running avg.
>
> Now we also don't want to make our avg too light, such that it gets
> decremented just for the new task not having had a chance to run yet --
> even if when it would run, it would be more cpu-bound than not.
>
> So what we do is we make the initial avg of the same duration as that we
> guess it takes to run each task on the system at least once -- aka
> sched_slice().
>
> Of course we can defeat this with wakeup/fork bombs, but in the 'normal'
> case it should be good enough.
>
>
> Does that make sense?
Thanks for your detailed explanation. Very useful indeed! :)
BTW, I have no questions about the patch itself; I was just confused by
this line in the patch's comment:
"some tasks were not launched at once after created".
Thanks,
Lei
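
To make the trade-off Peter describes concrete, here is a small user-space
sketch in C. It is not the patch and not kernel code: the struct, the
function names (toy_init, toy_advance, toy_ratio), the unit-length periods
and the hard-coded decay constant are all assumptions made for this example.
The only idea taken from the thread is seeding the average as if the new
task had already been runnable for one whole slice, so it starts out looking
cpu-bound but can still decay quickly if it turns out to be idle.

```c
#include <assert.h>

/* Toy model (hypothetical, not kernel code): a decaying runnable average. */
struct toy_avg {
    double sum;    /* decayed time the task wanted the CPU */
    double period; /* decayed total time observed */
};

/* Assumed decay factor per period: 2^(-1/32), so 32 periods halve old history. */
#define TOY_DECAY 0.97857206

/* Seed a new task as if it had been runnable for `slice` periods already. */
static struct toy_avg toy_init(double slice)
{
    struct toy_avg a;

    assert(slice > 0.0);
    a.sum = slice;
    a.period = slice;
    return a;
}

/* Advance by n periods; runnable != 0 means the task wanted the CPU. */
static void toy_advance(struct toy_avg *a, int n, int runnable)
{
    int i;

    for (i = 0; i < n; i++) {
        a->sum = a->sum * TOY_DECAY + (runnable ? 1.0 : 0.0);
        a->period = a->period * TOY_DECAY + 1.0;
    }
}

/* Fraction of (decayed) time the task was runnable, between 0 and 1. */
static double toy_ratio(const struct toy_avg *a)
{
    return a->sum / a->period;
}
```

In this toy model a freshly seeded task always starts at a ratio of 1.0
whatever the seed, which is the "assume cpu-bound" guess. The seed only
decides how fast the guess can be corrected: after 8 idle periods a task
seeded with 4 periods drops to roughly 0.31, while one seeded with 400
periods is still near 0.98, i.e. too heavy to adjust quickly.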