Message-ID: <32f4a76d-103e-510f-de70-ba9dfe2356ce@arm.com>
Date: Mon, 14 Nov 2022 20:13:47 +0100
From: Dietmar Eggemann <dietmar.eggemann@....com>
To: Vincent Guittot <vincent.guittot@...aro.org>, mingo@...hat.com,
peterz@...radead.org, juri.lelli@...hat.com, rostedt@...dmis.org,
bsegall@...gle.com, mgorman@...e.de, bristot@...hat.com,
vschneid@...hat.com, linux-kernel@...r.kernel.org,
parth@...ux.ibm.com
Cc: qyousef@...alina.io, chris.hyser@...cle.com,
patrick.bellasi@...bug.net, David.Laight@...lab.com,
pjt@...gle.com, pavel@....cz, tj@...nel.org, qperret@...gle.com,
tim.c.chen@...ux.intel.com, joshdon@...gle.com, timj@....org,
kprateek.nayak@....com, yu.c.chen@...el.com,
youssefesmat@...omium.org, joel@...lfernandes.org
Subject: Re: [PATCH v8 1/9] sched/fair: fix unfairness at wakeup
On 10/11/2022 18:50, Vincent Guittot wrote:
> At wake up, the vruntime of a task is updated to not be more older than
> a sched_latency period behind the min_vruntime. This prevents long sleeping
> task to get unlimited credit at wakeup.
> Such waking task should preempt current one to use its CPU bandwidth but
> wakeup_gran() can be larger than sched_latency, filter out the
> wakeup preemption and as a results steals some CPU bandwidth to
> the waking task.
>
> Make sure that a task, which vruntime has been capped, will preempt current
> task and use its CPU bandwidth even if wakeup_gran() is in the same range
> as sched_latency.
Looks like gran can be much higher than sched_latency in extreme
cases?
>
> If the waking task failed to preempt current it could to wait up to
> sysctl_sched_min_granularity before preempting it during next tick.
>
> Strictly speaking, we should use cfs->min_vruntime instead of
> curr->vruntime but it doesn't worth the additional overhead and complexity
> as the vruntime of current should be close to min_vruntime if not equal.
^^^ Is this related to the `if (vdiff > gran) return 1` condition in
wakeup_preempt_entity()?
[...]
> @@ -7187,6 +7171,18 @@ wakeup_preempt_entity(struct sched_entity *curr, struct sched_entity *se)
> return -1;
>
> gran = wakeup_gran(se);
> +
> + /*
> + * At wake up, the vruntime of a task is capped to not be older than
> + * a sched_latency period compared to min_vruntime. This prevents long
> + * sleeping task to get unlimited credit at wakeup. Such waking up task
> + * has to preempt current in order to not lose its share of CPU
> + * bandwidth but wakeup_gran() can become higher than scheduling period
> + * for low priority task. Make sure that long sleeping task will get a
low priority task or taskgroup with low cpu.shares, right?
6 CPUs
sysctl_sched
.sysctl_sched_latency : 18.000000
.sysctl_sched_min_granularity : 2.250000
.sysctl_sched_idle_min_granularity : 0.750000
.sysctl_sched_wakeup_granularity : 3.000000
...
p1 & p2 affine to CPUX
    '/'
    / \
   p1  p2
p1 & p2 nice=0 - vdiff=9ms gran=3ms lat_max=6.75ms
p1 & p2 nice=4 - vdiff=9ms gran=7.26ms lat_max=6.75ms
p1 & p2 nice=19 - vdiff=9ms gran=204.79ms lat_max=6.75ms
      '/'
      / \
     A   B
    /     \
   p1     p2
A & B cpu.shares=1024 - vdiff=9ms gran=3ms lat_max=6.75ms
A & B cpu.shares=448 - vdiff=9ms gran=6.86ms lat_max=6.75ms
A & B cpu.shares=2 - vdiff=9ms gran=1536ms lat_max=6.75ms
> + * chance to preempt current.
> + */
> + gran = min_t(s64, gran, get_latency_max());
> +
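The gran values in the tables above can be reproduced with a small userspace sketch. It is not kernel code: the helper names are hypothetical, plain integer division stands in for the kernel's fixed-point inverse weight (so nice=19 comes out as 204.80ms rather than 204.79ms), the nice weights 1024/423/15 are taken from the kernel's nice-to-weight table, and cpu.shares is assumed to act directly as the group entity's weight. lat_max is simply passed in as the 6.75ms reported above.

```c
#include <assert.h>
#include <stdint.h>

/* Weight of a nice-0 entity (assumption: 1024, as in the kernel's weight table). */
#define NICE_0_WEIGHT 1024ULL

/*
 * Scale a wall-clock granularity (in ns) into virtual time for an entity
 * of the given weight: the lighter the entity, the larger the result.
 * Plain division is used here, so the last digits can differ slightly
 * from what the kernel computes.
 */
static uint64_t scaled_gran_ns(uint64_t gran_ns, uint64_t weight)
{
	assert(weight > 0);
	return gran_ns * NICE_0_WEIGHT / weight;
}

/* Clamp the scaled granularity to lat_max, mirroring the min_t() in the hunk. */
static uint64_t capped_gran_ns(uint64_t gran_ns, uint64_t weight,
			       uint64_t lat_max_ns)
{
	uint64_t gran = scaled_gran_ns(gran_ns, weight);

	return gran < lat_max_ns ? gran : lat_max_ns;
}

/* The vdiff > gran check: non-zero means the waking entity preempts. */
static int should_preempt(int64_t vdiff_ns, uint64_t gran_ns)
{
	return vdiff_ns > (int64_t)gran_ns;
}
```

With vdiff=9ms, should_preempt() fails for nice=19 when given the uncapped scaled_gran_ns() result (204.8ms) and succeeds when given capped_gran_ns() with lat_max=6.75ms, which is the case the min_t() is meant to cover.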
[...]
> @@ -2448,6 +2448,34 @@ extern unsigned int sysctl_numa_balancing_scan_period_max;
> extern unsigned int sysctl_numa_balancing_scan_size;
> #endif
>
> +static inline unsigned long get_sched_latency(bool idle)
^^
2 white-spaces
[...]
> +
> +static inline unsigned long get_latency_max(void)
^^
[...]