Message-ID: <CAKfTPtAkFBw5zt0+WK7dWBUE9OrbOOExG8ueUE6ogdCEQZhpXQ@mail.gmail.com>
Date: Fri, 31 Mar 2023 17:26:51 +0200
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: mingo@...nel.org, linux-kernel@...r.kernel.org,
juri.lelli@...hat.com, dietmar.eggemann@....com,
rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
bristot@...hat.com, corbet@....net, qyousef@...alina.io,
chris.hyser@...cle.com, patrick.bellasi@...bug.net, pjt@...gle.com,
pavel@....cz, qperret@...gle.com, tim.c.chen@...ux.intel.com,
joshdon@...gle.com, timj@....org, kprateek.nayak@....com,
yu.c.chen@...el.com, youssefesmat@...omium.org,
joel@...lfernandes.org, efault@....de
Subject: Re: [PATCH 14/17] sched/eevdf: Better handle mixed slice length
On Tue, 28 Mar 2023 at 13:06, Peter Zijlstra <peterz@...radead.org> wrote:
>
> In the case where (due to latency-nice) there are different request
> sizes in the tree, the smaller requests tend to be dominated by the
> larger. Also note how the EEVDF lag limits are based on r_max.
>
> Therefore, add a heuristic that, for the mixed request size case, moves
> smaller requests to placement strategy #2, which ensures they're
> immediately eligible and, due to their smaller (virtual) deadline,
> will cause preemption.
>
> NOTE: this relies on update_entity_lag() to impose lag limits above
> a single slice.
>
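(For anyone reading along: my understanding of the lag limit the NOTE refers
to is roughly the clamping below. This is a sketch from memory of what
update_entity_lag() looks like in this series, not a quote of it, so the
exact limit may differ.)

static void update_entity_lag(struct cfs_rq *cfs_rq, struct sched_entity *se)
{
	s64 vlag, limit;

	/* Lag relative to the (weighted) average vruntime of the queue. */
	vlag  = avg_vruntime(cfs_rq) - se->vruntime;

	/* Limit lag to something on the order of the task's own slice. */
	limit = calc_delta_fair(max_t(u64, 2 * se->slice, TICK_NSEC), se);

	se->vlag = clamp(vlag, -limit, limit);
}
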
> Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
> ---
> kernel/sched/fair.c | 14 ++++++++++++++
> kernel/sched/features.h | 1 +
> kernel/sched/sched.h | 1 +
> 3 files changed, 16 insertions(+)
>
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -616,6 +616,7 @@ avg_vruntime_add(struct cfs_rq *cfs_rq,
> s64 key = entity_key(cfs_rq, se);
>
> cfs_rq->avg_vruntime += key * weight;
> + cfs_rq->avg_slice += se->slice * weight;
> cfs_rq->avg_load += weight;
> }
>
> @@ -626,6 +627,7 @@ avg_vruntime_sub(struct cfs_rq *cfs_rq,
> s64 key = entity_key(cfs_rq, se);
>
> cfs_rq->avg_vruntime -= key * weight;
> + cfs_rq->avg_slice -= se->slice * weight;
> cfs_rq->avg_load -= weight;
> }
>
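A side note while reading the new accumulators: since avg_slice sums
se->slice * weight and avg_load sums the weights, avg_slice / avg_load is
the load-weighted mean slice, and the PLACE_FUDGE test further below looks
like the division-free form of comparing a task's slice against that mean.
A minimal sketch of my reading (the helper name is mine, it does not exist
in the patch):

/*
 * Hypothetical helper, only to spell out my reading of the condition:
 * "se has a shorter than (load-weighted) average slice", i.e.
 * se->slice < cfs_rq->avg_slice / cfs_rq->avg_load, cross-multiplied
 * to avoid the division.
 */
static inline bool entity_has_short_slice(struct cfs_rq *cfs_rq,
					  struct sched_entity *se)
{
	return (u64)se->slice * cfs_rq->avg_load < cfs_rq->avg_slice;
}
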
> @@ -4832,6 +4834,18 @@ place_entity(struct cfs_rq *cfs_rq, stru
> lag = se->vlag;
>
> /*
> + * For latency sensitive tasks; those that have a shorter than
> + * average slice and do not fully consume the slice, transition
> + * to EEVDF placement strategy #2.
> + */
> + if (sched_feat(PLACE_FUDGE) &&
> + cfs_rq->avg_slice > se->slice * cfs_rq->avg_load) {
> + lag += vslice;
> + if (lag > 0)
> + lag = 0;
By using different lag policies for tasks, doesn't this create
unfairness between tasks?

I wanted to stress this situation with a simple use case, but it seems
that even without changing the slice there is a fairness problem:

Task A always runs.
Task B loops on: running 1ms then sleeping 1ms.
Both tasks have default nice and latency nice priorities, so each
should get around 50% of the CPU time.

Fairness is OK with tip/sched/core, but with EEVDF Task B only gets
around 30%. I haven't identified the problem so far.
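
For reference, a minimal sketch of the use case (not the exact test I used;
both tasks pinned to CPU0, the CPU share is then read from top or
/proc/<pid>/schedstat):

#define _GNU_SOURCE
#include <sched.h>
#include <time.h>
#include <unistd.h>

/* Busy-loop for roughly @ns nanoseconds. */
static void spin_for_ns(long ns)
{
	struct timespec start, now;

	clock_gettime(CLOCK_MONOTONIC, &start);
	do {
		clock_gettime(CLOCK_MONOTONIC, &now);
	} while ((now.tv_sec - start.tv_sec) * 1000000000L +
		 (now.tv_nsec - start.tv_nsec) < ns);
}

int main(void)
{
	cpu_set_t set;

	/* Both tasks on the same CPU so they actually compete. */
	CPU_ZERO(&set);
	CPU_SET(0, &set);
	sched_setaffinity(0, sizeof(set), &set);

	if (fork() == 0) {
		/* Task B: run ~1ms, then sleep 1ms. */
		for (;;) {
			spin_for_ns(1000000);
			usleep(1000);
		}
	}

	/* Task A: always runs. */
	for (;;)
		spin_for_ns(1000000);

	return 0;
}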
> + }
> +
> + /*
> * If we want to place a task and preserve lag, we have to
> * consider the effect of the new entity on the weighted
> * average and compensate for this, otherwise lag can quickly
> --- a/kernel/sched/features.h
> +++ b/kernel/sched/features.h
> @@ -5,6 +5,7 @@
> * sleep+wake cycles. EEVDF placement strategy #1, #2 if disabled.
> */
> SCHED_FEAT(PLACE_LAG, true)
> +SCHED_FEAT(PLACE_FUDGE, true)
> SCHED_FEAT(PLACE_DEADLINE_INITIAL, true)
>
> /*
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -559,6 +559,7 @@ struct cfs_rq {
> unsigned int idle_h_nr_running; /* SCHED_IDLE */
>
> s64 avg_vruntime;
> + u64 avg_slice;
> u64 avg_load;
>
> u64 exec_clock;
>
>