Message-ID: <CAKfTPtC5f7jfz+=rLQp_gjaEqGQ=9B-4aX-4urZP6CPVEf1LwA@mail.gmail.com>
Date: Tue, 15 Nov 2022 08:26:05 +0100
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Dietmar Eggemann <dietmar.eggemann@....com>
Cc: mingo@...hat.com, peterz@...radead.org, juri.lelli@...hat.com,
rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
bristot@...hat.com, vschneid@...hat.com,
linux-kernel@...r.kernel.org, parth@...ux.ibm.com,
qyousef@...alina.io, chris.hyser@...cle.com,
patrick.bellasi@...bug.net, David.Laight@...lab.com,
pjt@...gle.com, pavel@....cz, tj@...nel.org, qperret@...gle.com,
tim.c.chen@...ux.intel.com, joshdon@...gle.com, timj@....org,
kprateek.nayak@....com, yu.c.chen@...el.com,
youssefesmat@...omium.org, joel@...lfernandes.org
Subject: Re: [PATCH v8 1/9] sched/fair: fix unfairness at wakeup
On Mon, 14 Nov 2022 at 20:13, Dietmar Eggemann <dietmar.eggemann@....com> wrote:
>
> On 10/11/2022 18:50, Vincent Guittot wrote:
> > At wake up, the vruntime of a task is updated to not be more than
> > a sched_latency period behind the min_vruntime. This prevents a long-sleeping
> > task from getting unlimited credit at wakeup.
> > Such a waking task should preempt the current one to use its CPU bandwidth, but
> > wakeup_gran() can be larger than sched_latency, which filters out the
> > wakeup preemption and, as a result, steals some CPU bandwidth from
> > the waking task.
> >
> > Make sure that a task whose vruntime has been capped will preempt the current
> > task and use its CPU bandwidth even if wakeup_gran() is in the same range
> > as sched_latency.
>
> Looks like gran can be much higher than sched_latency in extreme
> cases?
It's not that extreme: all tasks with nice prio 5 and above will face
the problem.
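
As a rough illustration of why nice 5 is the threshold, here is a minimal
userspace sketch, not the kernel code. It assumes the granularity scales as
gran * NICE_0_LOAD / weight, and it uses the weights 1024, 423, 335 and 15
for nice 0, 4, 5 and 19 from the kernel's sched_prio_to_weight table. The
helper name approx_wakeup_gran_ns() is hypothetical, and the kernel's
fixed-point arithmetic can differ from this plain division in the last digit.

```c
#include <assert.h>
#include <stdint.h>

#define NICE_0_LOAD 1024ULL	/* weight of a nice-0 task */

/*
 * Hypothetical stand-in for the weight scaling done by wakeup_gran():
 * the lower the weight of the waking entity, the larger the granularity.
 */
static uint64_t approx_wakeup_gran_ns(uint64_t gran_ns, uint64_t weight)
{
	assert(weight > 0);	/* a zero weight would divide by zero */
	return gran_ns * NICE_0_LOAD / weight;
}
```

With sysctl_sched_wakeup_granularity at 3ms and a vdiff of 9ms, weight 423
(nice 4) stays below 9ms while weight 335 (nice 5) goes above it, so the
`vdiff > gran` test starts failing at nice 5.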
>
> >
> > If the waking task fails to preempt current, it may have to wait up to
> > sysctl_sched_min_granularity before preempting it during the next tick.
> >
> > Strictly speaking, we should use cfs->min_vruntime instead of
> > curr->vruntime, but it isn't worth the additional overhead and complexity
> > as the vruntime of current should be close to min_vruntime, if not equal.
>
> ^^^ Is this related to the `if (vdiff > gran) return 1` condition in
> wakeup_preempt_entity()?
yes
>
> [...]
>
> > @@ -7187,6 +7171,18 @@ wakeup_preempt_entity(struct sched_entity *curr, struct sched_entity *se)
> > return -1;
> >
> > gran = wakeup_gran(se);
> > +
> > + /*
> > + * At wake up, the vruntime of a task is capped to not be older than
> > + * a sched_latency period compared to min_vruntime. This prevents long
> > + * sleeping task to get unlimited credit at wakeup. Such waking up task
> > + * has to preempt current in order to not lose its share of CPU
> > + * bandwidth but wakeup_gran() can become higher than scheduling period
> > + * for low priority task. Make sure that long sleeping task will get a
>
> low priority task or taskgroup with low cpu.shares, right?
yes
>
> 6 CPUs
>
> sysctl_sched
> .sysctl_sched_latency : 18.000000
> .sysctl_sched_min_granularity : 2.250000
> .sysctl_sched_idle_min_granularity : 0.750000
> .sysctl_sched_wakeup_granularity : 3.000000
> ...
>
> p1 & p2 affine to CPUX
>
> '/'
> /\
> p1 p2
>
> p1 & p2 nice=0 - vdiff=9ms gran=3ms lat_max=6.75ms
> p1 & p2 nice=4 - vdiff=9ms gran=7.26ms lat_max=6.75ms
p1 & p2 nice=5 - vdiff=9ms gran=9.17ms lat_max=6.75ms
> p1 & p2 nice=19 - vdiff=9ms gran=204.79ms lat_max=6.75ms
>
>
> '/'
> /\
> A B
> / \
> p1 p2
>
> A & B cpu.shares=1024 - vdiff=9ms gran=3ms lat_max=6.75ms
> A & B cpu.shares=448 - vdiff=9ms gran=6.86ms lat_max=6.75ms
> A & B cpu.shares=2 - vdiff=9ms gran=1536ms lat_max=6.75ms
>
> > + * chance to preempt current.
> > + */
> > + gran = min_t(s64, gran, get_latency_max());
> > +
>
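
To make the effect of the clamp concrete, here is a small userspace model
of the decision, a sketch only. The function and parameter names are
hypothetical; the three return values follow the `return -1` and
`if (vdiff > gran) return 1` paths discussed above, with 0 assumed for
"no preemption". The `cap` flag just toggles the min_t() line so both
behaviours can be compared.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of wakeup_preempt_entity():
 *   vdiff_ns   = curr->vruntime - se->vruntime
 *   gran_ns    = weight-scaled wakeup granularity
 *   lat_max_ns = upper bound applied to gran when cap != 0
 * Returns -1 if se is not behind curr, 1 if se should preempt, 0 otherwise.
 */
static int model_wakeup_preempt(int64_t vdiff_ns, int64_t gran_ns,
				int64_t lat_max_ns, int cap)
{
	assert(lat_max_ns >= 0);	/* the bound is a duration */

	if (vdiff_ns <= 0)
		return -1;

	/* Same idea as gran = min_t(s64, gran, get_latency_max()) */
	if (cap && gran_ns > lat_max_ns)
		gran_ns = lat_max_ns;

	if (vdiff_ns > gran_ns)
		return 1;

	return 0;
}
```

Using the nice=5 numbers from the table (vdiff=9ms, gran=9.17ms,
lat_max=6.75ms), the model returns 0 without the cap and 1 with it, while
the nice=0 case (gran=3ms) preempts either way.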
> [...]
>
> > @@ -2448,6 +2448,34 @@ extern unsigned int sysctl_numa_balancing_scan_period_max;
> > extern unsigned int sysctl_numa_balancing_scan_size;
> > #endif
> >
> > +static inline unsigned long get_sched_latency(bool idle)
> ^^
> 2 white-spaces
ok
>
> [...]
>
> > +
> > +static inline unsigned long get_latency_max(void)
> ^^
ok
>
> [...]