Message-ID: <570E61FE.4060000@arm.com>
Date: Wed, 13 Apr 2016 16:13:02 +0100
From: Dietmar Eggemann <dietmar.eggemann@....com>
To: Yuyang Du <yuyang.du@...el.com>, peterz@...radead.org,
mingo@...nel.org, linux-kernel@...r.kernel.org
Cc: bsegall@...gle.com, pjt@...gle.com, morten.rasmussen@....com,
vincent.guittot@...aro.org, juri.lelli@....com
Subject: Re: [PATCH 2/4] sched/fair: Drop out incomplete current period when
sched averages accrue
On 10/04/16 23:36, Yuyang Du wrote:
> In __update_load_avg(), the current period is never complete. This
> basically leads to a slightly over-decayed average, say on average we
> have 50% current period, then we will lose 1.08% (= 1-0.5^(1/64)) of
> past avg. More importantly, the incomplete current period significantly
> complicates the avg computation, even a full period is only about 1ms.
>
> So we attempt to drop it. The outcome is that for any x.y periods to
> update, we will either lose the .y period or unduly gain (1-.y) period.
> How big is the impact? For a large x (say 20ms), you barely notice the
> difference, which is plus/minus 1% (=(before-after)/before). Moreover,
> the aggregated losses and gains in the long run should statistically
> even out.
>
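As a quick sanity check of the 1.08% figure quoted above, here is a minimal sketch. It assumes the usual PELT geometric decay with y^32 = 0.5; the helper name decay_loss() is made up for illustration and is not a kernel function.

```python
# Assumption: PELT decays geometrically with y**32 == 0.5, i.e. a
# contribution halves every 32 periods (one period being about 1ms).
HALF_LIFE_PERIODS = 32


def decay_loss(periods: float) -> float:
    """Fraction of the past average lost after `periods` periods."""
    return 1.0 - 0.5 ** (periods / HALF_LIFE_PERIODS)


if __name__ == "__main__":
    # Half a period on average: 1 - 0.5^(0.5/32) = 1 - 0.5^(1/64)
    print(f"half period: {decay_loss(0.5):.2%}")
    # One full period, for comparison
    print(f"full period: {decay_loss(1.0):.2%}")
```

For half a period this evaluates to roughly 1.08%, in line with the number in the changelog.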
For a periodic task, the signals really get much more unstable. Even for
a steady-state periodic task, the load/util signals show a meander
pattern which depends on whether we hit, for instance, a dequeue (decay +
accrue) or an enqueue (decay only) after the 1ms has elapsed.

IMHO, 1ms is too coarse a granularity for signals describing task and cpu
load/util given the current scheduler dynamics. We simply see too many
signal driving points (e.g. enqueue/dequeue) bailing out of
__update_load_avg().
Examples of 1 periodic task pinned to a cpu on an ARM64 system, HZ=250,
in steady state:

(1) task runtime = 100us, period = 200us

    pelt load/util signal

    1us: 488-491
    1ms: 483-534

    We get ~2 dequeues (load/util example: 493->504) and ~2 enqueues
    (load/util example: 496->483) in the meander pattern in the 1ms case.

(2) task runtime = 100us, period = 1000us

    pelt load/util signal

    1us: 103-105
    1ms: 84-145

    We get ~3-4 dequeues (load/util example: 104->124->134->140) and ~16-20
    enqueues (load/util example: 137->134->...->99->97) in the meander
    pattern in the 1ms case.
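For reference, an idealized model of the fine-grained case can be sketched as below. This is not the kernel code and does not model the 1ms behaviour of the patch. It assumes continuous geometric decay with a half-life of 32 periods of 1024us, a full-scale value of 1024, and a task that runs for exactly its runtime and sleeps for the rest of the period. The names decay() and steady_state_range() are hypothetical.

```python
HALF_LIFE_US = 32 * 1024  # assumption: y**32 == 0.5 with 1024us periods
SCALE = 1024              # assumption: full-scale value of the signal


def decay(us: float) -> float:
    """Decay factor applied to the signal after `us` microseconds."""
    return 0.5 ** (us / HALF_LIFE_US)


def steady_state_range(runtime_us: float, period_us: float, cycles: int = 20000):
    """Return (min, max) of the idealized signal once it has settled."""
    sig = lo = hi = 0.0
    for _ in range(cycles):
        d = decay(runtime_us)
        sig = sig * d + SCALE * (1.0 - d)      # running: decay + accrue
        hi = sig                               # peak at the end of the run
        sig *= decay(period_us - runtime_us)   # sleeping: decay only
        lo = sig                               # trough just before wakeup
    return lo, hi


if __name__ == "__main__":
    for runtime, period in ((100, 200), (100, 1000)):
        lo, hi = steady_state_range(runtime, period)
        print(f"runtime={runtime}us period={period}us: {lo:.1f}-{hi:.1f}")
```

In this model the signal settles close to 1024 * runtime/period with a ripple of only one or two units, which is the same order as the spread in the 1us ranges above. The absolute values on a real system will differ from this idealization.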
[...]