Date: Thu, 6 Oct 2022 13:34:57 +0200
From: Dietmar Eggemann <dietmar.eggemann@....com>
To: Vincent Guittot <vincent.guittot@...aro.org>, mingo@...hat.com,
 peterz@...radead.org, juri.lelli@...hat.com, rostedt@...dmis.org,
 bsegall@...gle.com, mgorman@...e.de, bristot@...hat.com,
 vschneid@...hat.com, linux-kernel@...r.kernel.org
Cc: zhangqiao22@...wei.com
Subject: Re: [PATCH v3] sched/fair: limit sched slice duration

On 03/10/2022 14:21, Vincent Guittot wrote:
> In the presence of a lot of small-weight tasks like sched_idle tasks,
> normal or high-weight tasks can see their ideal runtime (sched_slice)
> increase to hundreds of ms, whereas it normally stays below
> sysctl_sched_latency.
>
> 2 normal tasks running on a CPU will have a max sched_slice of 12ms
> (half of the sched_period). This means that they will make progress
> every sysctl_sched_latency period.
>
> If we now add 1000 idle tasks on the CPU, the sched_period becomes
> 3006 ms and the ideal runtime of the normal tasks becomes 609 ms.
> It will even become 1500ms if the idle tasks belong to an idle cgroup.
> This means that the scheduler will only look for picking another
> waiting task after 609ms (respectively 1500ms) of running time. The
> idle tasks significantly change the way the 2 normal tasks interleave
> their running time slots, whereas they should have only a small impact.
>
> Such a long sched_slice can significantly delay the release of
> resources, as the tasks can wait hundreds of ms before their next
> running slot just because of idle tasks queued on the rq.
>
> Cap the ideal_runtime to sysctl_sched_latency to make sure that tasks
> regularly make progress and are not significantly impacted by
> idle/background tasks queued on the rq.
>
> Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>
> ---
>
> Change since v2:
> - Cap ideal_runtime from the beginning as suggested by Peter
>
>  kernel/sched/fair.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 5ffec4370602..c309d57efb2c 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4584,7 +4584,13 @@ check_preempt_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr)
>  	struct sched_entity *se;
>  	s64 delta;
>
> -	ideal_runtime = sched_slice(cfs_rq, curr);
> +	/*
> +	 * When many tasks blow up the sched_period, it is possible that
> +	 * sched_slice() reports unusually large results (when many tasks are
> +	 * very light for example). Therefore impose a maximum.
> +	 */
> +	ideal_runtime = min_t(u64, sched_slice(cfs_rq, curr), sysctl_sched_latency);
> +
>  	delta_exec = curr->sum_exec_runtime - curr->prev_sum_exec_runtime;
>  	if (delta_exec > ideal_runtime) {
>  		resched_curr(rq_of(cfs_rq));

Tested on a 6 CPU system (sysctl_sched_latency=18ms,
sysctl_sched_min_granularity=2.25ms).

I start to see `slice > period` when I run:

(a) > ~50 idle tasks in '/' for an arbitrary nice=0 task

(b) > ~50 nice=0 tasks in '/A' w/ cpu.shares = max for the se of '/A'

Essentially in moments in which cfs_rq->nr_running > sched_nr_latency
and the se_weight is relatively high compared to the cfs_rq_weight.

Tested-By: Dietmar Eggemann <dietmar.eggemann@....com>
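
As a cross-check of the numbers in the changelog, here is a small
userspace model of the __sched_period()/sched_slice() arithmetic (a
sketch only, not kernel code). It assumes the scaled defaults of an
8-CPU system (sysctl_sched_latency = 24ms, sysctl_sched_min_granularity
= 3ms, so sched_nr_latency = 8) and the weights NICE_0_LOAD = 1024 and
WEIGHT_IDLEPRIO = 3; with those inputs it reproduces the 12ms, 3006ms
and 609ms figures and shows the effect of the min_t() cap:

/*
 * Back-of-the-envelope model of __sched_period()/sched_slice(),
 * ignoring cgroup hierarchy. All values in nanoseconds.
 */
#include <stdio.h>

#define LATENCY_NS      24000000ULL              /* sysctl_sched_latency */
#define MIN_GRAN_NS      3000000ULL              /* sysctl_sched_min_granularity */
#define NR_LATENCY      (LATENCY_NS / MIN_GRAN_NS)  /* sched_nr_latency = 8 */
#define NICE_0_LOAD     1024ULL                  /* weight of a nice=0 task */
#define WEIGHT_IDLEPRIO 3ULL                     /* weight of a SCHED_IDLE task */

/* __sched_period(): stretch the period once there are too many tasks */
static unsigned long long period(unsigned long long nr_running)
{
	if (nr_running > NR_LATENCY)
		return nr_running * MIN_GRAN_NS;
	return LATENCY_NS;
}

/* sched_slice() for one nice=0 entity on a rq with the given total weight */
static unsigned long long slice(unsigned long long nr, unsigned long long rq_weight)
{
	return period(nr) * NICE_0_LOAD / rq_weight;
}

int main(void)
{
	/* 2 nice=0 tasks alone: period = 24 ms, slice = 12 ms each */
	printf("2 tasks:    slice = %llu ms\n",
	       slice(2, 2 * NICE_0_LOAD) / 1000000);

	/* add 1000 sched_idle tasks: period = 3006 ms, slice ~= 609 ms */
	unsigned long long nr = 2 + 1000;
	unsigned long long w  = 2 * NICE_0_LOAD + 1000 * WEIGHT_IDLEPRIO;
	unsigned long long s  = slice(nr, w);
	printf("+1000 idle: period = %llu ms, slice = %llu ms\n",
	       period(nr) / 1000000, s / 1000000);

	/* the patch: cap ideal_runtime at sysctl_sched_latency */
	unsigned long long capped = s < LATENCY_NS ? s : LATENCY_NS;
	printf("capped:     ideal_runtime = %llu ms\n", capped / 1000000);
	return 0;
}

Running it prints slice = 12 ms for the two-task case, period = 3006 ms
and slice = 609 ms once the 1000 idle tasks are added, and ideal_runtime
= 24 ms with the cap applied, matching the changelog.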