Message-ID: <CAKfTPtCTcrq1E1H8A3TL1xvALUrQ7ybPoERJ+C2O2+QXpVEZGQ@mail.gmail.com>
Date: Fri, 15 Nov 2019 11:18:00 +0100
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: linux-kernel <linux-kernel@...r.kernel.org>,
Ingo Molnar <mingo@...hat.com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Juri Lelli <juri.lelli@...hat.com>,
Steven Rostedt <rostedt@...dmis.org>,
Mel Gorman <mgorman@...e.de>,
Doug Smythies <dsmythies@...us.net>,
"open list:THERMAL" <linux-pm@...r.kernel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
Sargun Dhillon <sargun@...gun.me>, Tejun Heo <tj@...nel.org>,
Xie XiuQi <xiexiuqi@...wei.com>, xiezhipeng1@...wei.com,
Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>
Subject: Re: [PATCH v4] sched/freq: move call to cpufreq_update_util
On Fri, 15 Nov 2019 at 10:55, Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Thu, Nov 14, 2019 at 06:07:31PM +0100, Vincent Guittot wrote:
> > update_cfs_rq_load_avg() calls cfs_rq_util_change() every time PELT
> > decays, which might be inefficient when the cpufreq driver has a rate
> > limitation.
> >
> > When a task is attached to a CPU, we have the call path:
> >
> > update_load_avg()
> >   update_cfs_rq_load_avg()
> >     cfs_rq_util_change --> trigger frequency update
> >   attach_entity_load_avg()
> >     cfs_rq_util_change --> trigger frequency update
> >
> > The 1st frequency update will not take into account the utilization of
> > the newly attached task, and the 2nd one might be discarded because of
> > the rate limitation of the cpufreq driver.
>
> Doesn't this just show that a dumb rate limit in the driver is broken?
But the rate limit may come from HW constraints that force us to wait,
say, 4ms or even 10ms between frequency updates.
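To make that concrete, here is a minimal userspace sketch of such a rate
limit (plain C, not the governor's actual code; min_update_delay_ns and
last_update_ns are made-up names). With two back-to-back requests, as in
the attach path quoted above, only the first one is taken:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

static uint64_t last_update_ns;
static const uint64_t min_update_delay_ns = 4000000; /* e.g. a 4ms HW constraint */

static bool freq_update_allowed(uint64_t now_ns)
{
	/* Drop any request arriving before the HW delay has elapsed. */
	if (now_ns - last_update_ns < min_update_delay_ns)
		return false;
	last_update_ns = now_ns;
	return true;
}

int main(void)
{
	/* The 1st update (without the new task's utilization) is taken,
	 * the 2nd one (with it) is discarded. */
	printf("1st update taken: %d\n", freq_update_allowed(1000000000ULL));
	printf("2nd update taken: %d\n", freq_update_allowed(1000000100ULL));
	return 0;
}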
>
> > update_cfs_rq_load_avg() is only called by update_blocked_averages()
> > and update_load_avg(), so we can move the call to
> > cfs_rq_util_change()/cpufreq_update_util() into these 2 functions. It's
> > also worth noting that update_load_avg() already calls
> > cfs_rq_util_change() directly for the !SMP case.
> >
> > This change will also ensure that cpufreq_update_util() is called even
> > when there is no CFS rq left in the leaf_cfs_rq_list to update, but only
> > irq, rt or dl PELT signals.
>
> I don't think it does that; that is, iirc the return value of
> ___update_load_sum() is 1 every time a period lapses. So even if the avg
> is 0 and doesn't change, it'll still return 1 on every period.
>
> Which is what that dumb rate-limit thing wants of course. But I'm still
> thinking that it's stupid to do. If nothing changes, don't generate
> events.
When everything (irq, dl, rt, cfs) is 0, we don't generate events,
because update_blocked_averages() is no longer called once
rq->has_blocked_load is cleared.
With the current implementation, if cfs is 0 but irq, dl or rt is not,
we don't call cpufreq_update_util() because it is only called through
the cfs update path.
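To illustrate the difference, a toy model only (simplified "decayed"
flags instead of the real PELT code): today the event is driven by the
cfs update path alone, whereas moving the call lets any decayed signal
trigger it.

#include <stdbool.h>
#include <stdio.h>

struct rq_state {
	bool cfs_decayed, rt_decayed, dl_decayed, irq_decayed;
};

/* Current behaviour: the event is tied to the cfs update path only. */
static bool event_current(const struct rq_state *rq)
{
	return rq->cfs_decayed;
}

/* With the patch: any decayed signal can trigger the event. */
static bool event_patched(const struct rq_state *rq)
{
	return rq->cfs_decayed || rq->rt_decayed ||
	       rq->dl_decayed || rq->irq_decayed;
}

int main(void)
{
	/* cfs is already 0 but irq still has load to decay. */
	struct rq_state rq = { .irq_decayed = true };

	printf("current: %d, patched: %d\n",
	       event_current(&rq), event_patched(&rq));
	return 0;
}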
>
> If anything, update_blocked_averages() should look at
> @done/others_have_blocked() to emit events for rt,dl,irq.
others_have_blocked() can return true even though no decay happened
during this update, in which case we don't need to call
cpufreq_update_util().
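In other words (toy model again, under simplified assumptions; the real
___update_load_sum() is more involved): a signal can still carry blocked
load without having crossed a PELT period since its last update, and
only the period crossing is worth reporting to cpufreq.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define PELT_PERIOD_US 1024

struct toy_signal {
	uint64_t last_update_us;
	unsigned long avg;
};

static bool toy_update(struct toy_signal *s, uint64_t now_us)
{
	uint64_t periods = (now_us - s->last_update_us) / PELT_PERIOD_US;

	if (!periods)
		return false;		/* nothing decayed, no event needed */

	s->last_update_us += periods * PELT_PERIOD_US;
	s->avg >>= periods;		/* crude stand-in for the real decay */
	return true;			/* decayed: worth telling cpufreq */
}

int main(void)
{
	struct toy_signal irq = { .last_update_us = 0, .avg = 100 };
	bool decayed;

	/* Blocked load is present the whole time, but only the second
	 * call crosses a PELT period boundary. */
	decayed = toy_update(&irq, 500);
	printf("at 500us:  decayed=%d avg=%lu\n", decayed, irq.avg);
	decayed = toy_update(&irq, 1500);
	printf("at 1500us: decayed=%d avg=%lu\n", decayed, irq.avg);
	return 0;
}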
>
> So why are we making the scheduler code more ugly instead of fixing that
> driver?