[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200305152539.dcgmq5zupnarn4rp@e107158-lin.cambridge.arm.com>
Date: Thu, 5 Mar 2020 15:25:39 +0000
From: Qais Yousef <qais.yousef@....com>
To: Daniel Lezcano <daniel.lezcano@...aro.org>
Cc: Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
"open list:SCHEDULER" <linux-kernel@...r.kernel.org>,
"Rafael J. Wysocki" <rjw@...ysocki.net>
Subject: Re: [PATCH] sched: fair: Use the earliest break even
On 03/04/20 17:17, Daniel Lezcano wrote:
>
> Hi Qais,
>
> On 04/03/2020 16:01, Qais Yousef wrote:
> > Hi Daniel
> >
> > Adding Rafael to CC as I think he might be interested in this too.
> >
> > On 03/04/20 12:48, Daniel Lezcano wrote:
> >> In the idle CPU selection process occuring in the slow path via the
> >> find_idlest_group_cpu() function, we pick up in priority an idle CPU
> >> with the shallowest idle state otherwise we fall back to the least
> >> loaded CPU.
> >>
> >> In order to be more energy efficient but without impacting the
> >> performances, let's use another criteria: the break even deadline.
> >
> > What is the break even deadline?
> >
> >> At idle time, when we store the idle state the CPU is entering in, we
> >> compute the next deadline where the CPU could be woken up without
> >> spending more energy to sleep.
> >
> > I think that's its definition, but could do with more explanation.
> >
> > So the break even deadline is the time window during which we can abort the CPU
> > while entering its shallowest idle state?
>
> No, it is the moment in absolute time when the min residency is reached.
>
> From Documentation/devicetree/bindings/arm/idle-states.yaml
>
> "
> min-residency is defined for a given idle state as the minimum expected
> residency time for a state (inclusive of preparation and entry) after
> which choosing that state become the most energy efficient option. A
> good way to visualise this, is by taking the same graph above and
> comparing some states energy consumptions plots.
> "
>
> > So if we have 2 cpus entering the shallowest idle state, we pick the one that
> > has a faster abort? And the energy saving comes from the fact we avoided
> > unnecessary sleep-just-to-wakeup-immediately cycle?
>
> No, actually it is about to choose a CPU where it has a better chance to
> have reach its min residency. Basically, when the CPU enters an idle
> state, that has a cost (cache flush / refill, context saving/restore etc
> ...), so there is a peak of energy and the CPU has to save energy long
> enough to compensate this extra consumption.
>
> If the scheduler is constantly waking up an idle CPU before it slept
> long enough, we lose energy and performance.
>
> Example 1, the CPUs are in a state:
> - CPU0 (power down)
> - CPU1 (power down)
> - CPU2 (WFI)
> - CPU3 (power down)
>
> The routine choose CPU2 because it is the shallowest state.
>
> Example 2, the CPUs are in a state:
> - CPU0 (power down) (bedl = 1234)
> - CPU1 (power down) (bedl = 4321)
> - CPU2 (power down) (bedl = 9876)
> - CPU3 (power down) (bedl = 3421)
>
> * bedl = break even deadline
>
> The routine choose CPU1 because the bedl is the smallest.
Thanks for the explanation and the reference. The idle-state.yaml has a nice
diagram. I won't insist, but I think including the definition of break even
will help a lot to make the patch more understandable.
[...]
> > Shouldn't you retain the original if condition here? You omitted the 2nd part
> > of this check compared to the original condition
> >
> > (!idle || >>>idle->exit_latency == min_exit_latency<<<)
>
> It is done on purpose because of the condition right before.
I see it now. If denote the condition as B. If it was false then the OR will
reduce to !idle. But if it was True, then which path you take will depend only
on state of idle too, which you check in both paths.
[...]
>
> Nope, thanks for spotting it. It should be:
>
> void sched_idle_set_state(struct cpuidle_state *idle_state)
> {
> - idle_set_state(this_rq(), idle_state);
> + struct rq *rq = this_rq();
> + idle_set_state(rq, idle_state);
> +
> + if (likely(idle_state)) {
> + ktime_t kt = ktime_add_ns(ktime_get(),
> + idle_state->exit_latency_ns);
> + idle_set_break_even(rq, ktime_to_ns(kt));
> + }
> }
>
>
> > Don't you need to reset the break_even value if idle_state is NULL too?
>
> If the idle_state is NULL, the routine won't use the break_even.
+1
Thanks
--
Qais Yousef
Powered by blists - more mailing lists