Message-ID: <20230329080251.GK4253@hirez.programming.kicks-ass.net>
Date: Wed, 29 Mar 2023 10:02:51 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Josh Don <joshdon@...gle.com>
Cc: mingo@...nel.org, vincent.guittot@...aro.org,
linux-kernel@...r.kernel.org, juri.lelli@...hat.com,
dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
mgorman@...e.de, bristot@...hat.com, corbet@....net,
qyousef@...alina.io, chris.hyser@...cle.com,
patrick.bellasi@...bug.net, pjt@...gle.com, pavel@....cz,
qperret@...gle.com, tim.c.chen@...ux.intel.com, timj@....org,
kprateek.nayak@....com, yu.c.chen@...el.com,
youssefesmat@...omium.org, joel@...lfernandes.org, efault@....de
Subject: Re: [PATCH 08/17] sched/fair: Implement an EEVDF like policy
On Tue, Mar 28, 2023 at 06:26:51PM -0700, Josh Don wrote:
> Hi Peter,
>
> This is a really interesting proposal and in general I think the
> incorporation of latency/deadline is quite a nice enhancement. We've
> struggled for a while to get better latency bounds on performance
> sensitive threads in the face of antagonism from overcommit.
>
> >  void update_entity_lag(struct cfs_rq *cfs_rq, struct sched_entity *se)
> >  {
> > +	s64 lag, limit;
> > +
> >  	SCHED_WARN_ON(!se->on_rq);
> > -	se->vlag = avg_vruntime(cfs_rq) - se->vruntime;
> > +	lag = avg_vruntime(cfs_rq) - se->vruntime;
> > +
> > +	limit = calc_delta_fair(max_t(u64, 2*se->slice, TICK_NSEC), se);
> > +	se->vlag = clamp(lag, -limit, limit);
>
> This is for dequeue; presumably you'd want to update the vlag at
> enqueue in case the average has moved again due to enqueue/dequeue of
> other entities?
Ha, just adding the entry back will shift the average around, and it's
all a giant pain in the backside.
 place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
 {
 	u64 vruntime = avg_vruntime(cfs_rq);
+	s64 lag = 0;

+	/*
+	 * Due to how V is constructed as the weighted average of entities,
+	 * adding tasks with positive lag, or removing tasks with negative lag,
+	 * will move 'time' backwards; this can screw around with the lag of
+	 * other tasks.
+	 *
+	 * EEVDF: placement strategy #1 / #2
+	 */
+	if (sched_feat(PLACE_LAG) && cfs_rq->nr_running > 1) {
+		struct sched_entity *curr = cfs_rq->curr;
+		unsigned long load;

+		lag = se->vlag;

 		/*
+		 * If we want to place a task and preserve lag, we have to
+		 * consider the effect of the new entity on the weighted
+		 * average and compensate for this, otherwise lag can quickly
+		 * evaporate:
+		 *
+		 * l_i = V - v_i <=> v_i = V - l_i
+		 *
+		 * V = v_avg = W*v_avg / W
+		 *
+		 * V' = (W*v_avg + w_i*v_i) / (W + w_i)
+		 *    = (W*v_avg + w_i*(v_avg - l_i)) / (W + w_i)
+		 *    = v_avg - w_i*l_i/(W + w_i)
+		 *
+		 * l_i' = V' - v_i = v_avg - w_i*l_i/(W + w_i) - (v_avg - l_i)
+		 *      = l_i - w_i*l_i/(W + w_i)
+		 *      = l_i * W/(W + w_i)
+		 *
+		 * l_i = (W + w_i) * l_i' / W
 		 */
+		load = cfs_rq->avg_load;
+		if (curr && curr->on_rq)
+			load += curr->load.weight;
+
+		lag *= load + se->load.weight;
+		if (WARN_ON_ONCE(!load))
+			load = 1;
+		lag = div_s64(lag, load);
+
+		vruntime -= lag;
+	}
That ^ is the other side of it.
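To make that concrete, here's a quick userspace sanity check (not kernel
code; the weights, V and the lag value are made up) showing that the
inflated placement offset gives back the requested lag once the entity is
folded into the average:

	#include <stdio.h>
	#include <stdint.h>

	int main(void)
	{
		int64_t W   = 3072;	/* pre-enqueue total weight, e.g. 3 nice-0 tasks */
		int64_t w_i = 1024;	/* weight of the entity being placed */
		int64_t V   = 1000000;	/* current weighted average vruntime */
		int64_t vl  = 50000;	/* lag stored at dequeue, to be preserved */

		/* inflate: l_i = (W + w_i) * l_i' / W */
		int64_t l_i = (W + w_i) * vl / W;

		/* place the entity at v_i = V - l_i */
		int64_t v_i = V - l_i;

		/* weighted average after the entity joins */
		int64_t V_new = (W * V + w_i * v_i) / (W + w_i);

		/* effective lag comes out as ~vl, modulo integer rounding */
		printf("placed lag    = %lld\n", (long long)(V_new - v_i));
		printf("requested lag = %lld\n", (long long)vl);
		return 0;
	}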
But yes, once enqueued, additional join/leave operations can/will shift
V around and change the lag; there's nothing much to do about that.
The paper does it all a wee bit differently, but I think it ends up
being equivalent: they explicitly track V (and shift it around on
join/leave), while I track it implicitly through the average and then
need to play games like the above.
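FWIW, the implicit tracking boils down to keeping a running weighted sum
next to the total weight; a toy model (plain C, names invented here, and
without the offset/overflow and load-scaling tricks the real helpers need)
looks roughly like:

	#include <stdint.h>

	/* Toy model of tracking V = (\Sum w_j*v_j) / (\Sum w_j) incrementally. */
	struct toy_rq {
		int64_t wsum;	/* \Sum w_j * v_j */
		int64_t weight;	/* \Sum w_j       */
	};

	void toy_enqueue(struct toy_rq *rq, int64_t w, int64_t v)
	{
		rq->wsum   += w * v;
		rq->weight += w;
	}

	void toy_dequeue(struct toy_rq *rq, int64_t w, int64_t v)
	{
		rq->wsum   -= w * v;
		rq->weight -= w;
	}

	int64_t toy_avg_vruntime(const struct toy_rq *rq)
	{
		return rq->weight ? rq->wsum / rq->weight : 0;	/* V */
	}

Every toy_enqueue()/toy_dequeue() moves what toy_avg_vruntime() returns,
which is exactly the V shifting on join/leave described above.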