Date: Thu, 20 Aug 2020 14:45:59 +0200
From: Vincent Guittot <vincent.guittot@...aro.org>
To: benbjiang(蒋彪) <benbjiang@...cent.com>
Cc: Dietmar Eggemann <dietmar.eggemann@....com>,
    Jiang Biao <benbjiang@...il.com>,
    "mingo@...hat.com" <mingo@...hat.com>,
    "peterz@...radead.org" <peterz@...radead.org>,
    "juri.lelli@...hat.com" <juri.lelli@...hat.com>,
    "rostedt@...dmis.org" <rostedt@...dmis.org>,
    "bsegall@...gle.com" <bsegall@...gle.com>,
    "mgorman@...e.de" <mgorman@...e.de>,
    "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] sched/fair: reduce preemption with IDLE tasks runable(Internet mail)

On Thu, 20 Aug 2020 at 13:28, benbjiang(蒋彪) <benbjiang@...cent.com> wrote:
>
>
>
> > On Aug 20, 2020, at 3:35 PM, Vincent Guittot <vincent.guittot@...aro.org> wrote:
> >
> > On Thu, 20 Aug 2020 at 02:13, benbjiang(蒋彪) <benbjiang@...cent.com> wrote:
> >>
> >>
> >>
> >>> On Aug 19, 2020, at 10:55 PM, Vincent Guittot <vincent.guittot@...aro.org> wrote:
> >>>
> >>> On Wed, 19 Aug 2020 at 16:27, benbjiang(蒋彪) <benbjiang@...cent.com> wrote:
> >>>>
> >>>>
> >>>>
> >>>>> On Aug 19, 2020, at 7:55 PM, Dietmar Eggemann <dietmar.eggemann@....com> wrote:
> >>>>>
> >>>>> On 19/08/2020 13:05, Vincent Guittot wrote:
> >>>>>> On Wed, 19 Aug 2020 at 12:46, Dietmar Eggemann <dietmar.eggemann@....com> wrote:
> >>>>>>>
> >>>>>>> On 17/08/2020 14:05, benbjiang(蒋彪) wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> On Aug 17, 2020, at 4:57 PM, Dietmar Eggemann <dietmar.eggemann@....com> wrote:
> >>>>>>>>>
> >>>>>>>>> On 14/08/2020 01:55, benbjiang(蒋彪) wrote:
> >>>>>>>>>> Hi,
> >>>>>>>>>>
> >>>>>>>>>>> On Aug 13, 2020, at 2:39 AM, Dietmar Eggemann <dietmar.eggemann@....com> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> On 12/08/2020 05:19, benbjiang(蒋彪) wrote:
> >>>>>>>>>>>> Hi,
> >>>>>>>>>>>>
> >>>>>>>>>>>>> On Aug 11, 2020, at 11:54 PM, Dietmar Eggemann <dietmar.eggemann@....com> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On 11/08/2020 02:41, benbjiang(蒋彪) wrote:
> >>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Aug 10, 2020, at 9:24 PM, Dietmar Eggemann <dietmar.eggemann@....com> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On 06/08/2020 17:52, benbjiang(蒋彪) wrote:
> >>>>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Aug 6, 2020, at 9:29 PM, Dietmar Eggemann <dietmar.eggemann@....com> wrote:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On 03/08/2020 13:26, benbjiang(蒋彪) wrote:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> On Aug 3, 2020, at 4:16 PM, Dietmar Eggemann <dietmar.eggemann@....com> wrote:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> On 01/08/2020 04:32, Jiang Biao wrote:
> >>>>>>>>>>>>>>>>>>>> From: Jiang Biao <benbjiang@...cent.com>
> >>>>>>>
> >>>>>>> [...]
> >>>>>>>
> >>>>>>>>> Are you sure about this?
> >>>>>>>> Yes. :)
> >>>>>>>>>
> >>>>>>>>> The math is telling me for the:
> >>>>>>>>>
> >>>>>>>>> idle task: (3 / (1024 + 1024 + 3))^(-1) * 4ms = 2735ms
> >>>>>>>>>
> >>>>>>>>> normal task: (1024 / (1024 + 1024 + 3))^(-1) * 4ms = 8ms
> >>>>>>>>>
> >>>>>>>>> (4ms - 250 Hz)
> >>>>>>>> My tick is 1ms - 1000HZ, which seems reasonable for 600ms? :)
> >>>>>>>
> >>>>>>> OK, I see.
> >>>>>>>
> >>>>>>> But here the different sched slices (check_preempt_tick()->
> >>>>>>> sched_slice()) between normal tasks and the idle task play a role too.
> >>>>>>>
> >>>>>>> Normal tasks get ~3ms whereas the idle task gets <0.01ms.
> >>>>>>
> >>>>>> In fact that depends on the number of CPUs on the system:
> >>>>>> sysctl_sched_latency = 6ms * (1 + ilog(ncpus)). On an 8-core system,
> >>>>>> a normal task will run around 12ms in one shot and the idle task still
> >>>>>> one tick period
> >>>>>
> >>>>> True. This is on a single CPU.
> >>>> Agree. :)
> >>>>
> >>>>>
> >>>>>> Also, you can increase even more the period between 2 runs of idle
> >>>>>> task by using cgroups and min shares value : 2
> >>>>>
> >>>>> Ah yes, maybe this is what Jiang wants to do then? If his runtime does
> >>>>> not have other requirements preventing this.
> >>>> That could work for increasing the period between 2 runs. But could not
> >>>> reduce the single runtime of idle task I guess, which means normal task
> >>>> could have 1-tick schedule latency because of idle task.
> >>>
> >>> Yes. An idle task will preempt an always running task during 1 tick
> >>> every 680ms. But also you should keep in mind that a waking normal
> >>> task will preempt the idle task immediately which means that it will
> >>> not add scheduling latency to a normal task but "steal" 0.14% of
> >>> normal task throughput (1/680) at most
> >> That’s true. But in the VM case, when VMs are busy (MWAIT passthrough
> >> or running CPU-eating work), the 1-tick scheduling latency could be
> >> detected by cyclictest running in the VM.
> >>
> >> OTOH, we compensate vruntime in place_entity() to boost waking
> >> without distinguishing SCHED_IDLE tasks, do you think it’s necessary to
> >> do that? like
> >>
> >> --- a/kernel/sched/fair.c
> >> +++ b/kernel/sched/fair.c
> >> @@ -4115,7 +4115,7 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
> >>                 vruntime += sched_vslice(cfs_rq, se);
> >>
> >>         /* sleeps up to a single latency don't count. */
> >> -       if (!initial) {
> >> +       if (!initial && likely(!task_has_idle_policy(task_of(se)))) {
> >>                 unsigned long thresh = sysctl_sched_latency;
> >
> > Yeah, this is a good improvement.
> Thanks, I’ll send a patch for that. :)
>
> > Does it solve your problem ?
> >
> Not exactly. :) I wonder if we can make SCHED_IDLE more pure(harmless)?

We can't prevent it from running from time to time. The proxy execution
feature could be a step towards considering relaxing this constraint.

> Or introduce a switch(or flag) to control it, and make it available for cases like us.
>
> Thanks a lot.
> Regards,
> Jiang
>
> >>
> >>>
> >>>> OTOH, cgroups(shares) could introduce extra complexity. :)
> >>>>
> >>>> I wonder if there’s any possibility to make SCHED_IDLEs’ priorities absolutely
> >>>> lower than SCHED_NORMAL(OTHER), which means no weights/shares
> >>>> for them, and they run only when no other task’s runnable.
> >>>> I guess there may be priority inversion issue if we do that. But maybe we
> >>>
> >>> Exactly, that's why we must ensure a minimum running time for sched_idle task
> >>
> >> Still for VM case, different VMs have been much isolated from each other,
> >> priority inversion issue could be very rare, we’re trying to make offline tasks
> >> absolutely harmless to online tasks. :)
> >>
> >> Thanks a lot for your time.
> >> Regards,
> >> Jiang
> >>
> >>>
> >>>> could avoid it by load-balance more aggressively, or it(priority inversion)
> >>>> could be ignored in some special case.
>
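The figures quoted in the thread (weight 3 for a SCHED_IDLE task, weight 1024 for a nice-0 task, one tick of runtime per idle-task run, and sysctl_sched_latency = 6ms * (1 + ilog(ncpus))) can be checked with a small user-space sketch. This is only an illustration of the arithmetic as stated in the messages above, not kernel code: the helper names idle_period_ms(), idle_share(), ilog2_u() and sched_latency_ms() are hypothetical, and the model assumes always-running nice-0 tasks and an idle task that runs exactly one tick each time.

```c
#include <assert.h>

/* Load weights as used in the arithmetic quoted in the thread. */
#define WEIGHT_NICE0 1024u /* one nice-0 SCHED_NORMAL task */
#define WEIGHT_IDLE     3u /* one SCHED_IDLE task */

/*
 * Hypothetical helper: approximate interval in ms between two runs of one
 * SCHED_IDLE task competing with n_normal always-running nice-0 tasks,
 * assuming the idle task runs for exactly one tick each time.
 */
static double idle_period_ms(double tick_ms, unsigned int n_normal)
{
    double total = (double)n_normal * WEIGHT_NICE0 + WEIGHT_IDLE;

    assert(tick_ms > 0.0);
    return total / WEIGHT_IDLE * tick_ms;
}

/* Hypothetical helper: fraction of CPU time the idle task receives. */
static double idle_share(unsigned int n_normal)
{
    double total = (double)n_normal * WEIGHT_NICE0 + WEIGHT_IDLE;

    return WEIGHT_IDLE / total;
}

/* Integer base-2 logarithm, rounded down (n must be non-zero). */
static unsigned int ilog2_u(unsigned int n)
{
    unsigned int r = 0;

    assert(n > 0);
    while (n >>= 1)
        r++;
    return r;
}

/*
 * Scheduling latency in ms using the formula quoted in the thread:
 * 6ms * (1 + ilog(ncpus)). Any cap on ncpus is deliberately not modelled.
 */
static unsigned int sched_latency_ms(unsigned int ncpus)
{
    return 6u * (1u + ilog2_u(ncpus));
}
```

With two normal tasks, idle_period_ms(1.0, 2) evaluates to about 683.7 ms (the "every 680ms" figure for a 1000 Hz tick) and idle_period_ms(4.0, 2) to about 2734.7 ms (the 2735ms figure for 250 Hz). sched_latency_ms(8) evaluates to 24, which split between two normal tasks gives the roughly 12ms per task mentioned for an 8-core system.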