Date: Thu, 5 Nov 2020 17:07:43 -0500
From: Joel Fernandes <joel@...lfernandes.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Nishanth Aravamudan <naravamudan@...italocean.com>,
    Julien Desfossez <jdesfossez@...italocean.com>,
    Tim Chen <tim.c.chen@...ux.intel.com>,
    Vineeth Pillai <viremana@...ux.microsoft.com>,
    Aaron Lu <aaron.lwe@...il.com>, Aubrey Li <aubrey.intel@...il.com>,
    tglx@...utronix.de, linux-kernel@...r.kernel.org, mingo@...nel.org,
    torvalds@...ux-foundation.org, fweisbec@...il.com, keescook@...omium.org,
    kerrnel@...gle.com, Phil Auld <pauld@...hat.com>,
    Valentin Schneider <valentin.schneider@....com>,
    Mel Gorman <mgorman@...hsingularity.net>,
    Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
    Paolo Bonzini <pbonzini@...hat.com>, vineeth@...byteword.org,
    Chen Yu <yu.c.chen@...el.com>,
    Christian Brauner <christian.brauner@...ntu.com>,
    Agata Gruza <agata.gruza@...el.com>,
    Antonio Gomez Iglesias <antonio.gomez.iglesias@...el.com>,
    graf@...zon.com, konrad.wilk@...cle.com, dfaggioli@...e.com,
    pjt@...gle.com, rostedt@...dmis.org, derkling@...gle.com,
    benbjiang@...cent.com, Alexandre Chartre <alexandre.chartre@...cle.com>,
    James.Bottomley@...senpartnership.com, OWeisse@...ch.edu,
    Dhaval Giani <dhaval.giani@...cle.com>,
    Junaid Shahid <junaids@...gle.com>, jsbarnes@...gle.com,
    chris.hyser@...cle.com, Vineeth Remanan Pillai <vpillai@...italocean.com>,
    Aaron Lu <aaron.lu@...ux.alibaba.com>,
    Aubrey Li <aubrey.li@...ux.intel.com>, Tim Chen <tim.c.chen@...el.com>
Subject: Re: [PATCH v8 -tip 06/26] sched: Add core wide task selection and scheduling.

On Thu, Nov 05, 2020 at 01:50:19PM -0500, Joel Fernandes wrote:
> On Mon, Oct 26, 2020 at 10:31:31AM +0100, Peter Zijlstra wrote:
> > On Fri, Oct 23, 2020 at 05:31:18PM -0400, Joel Fernandes wrote:
> > > On Fri, Oct 23, 2020 at 09:26:54PM +0200, Peter Zijlstra wrote:
> > > >
> > > > How about this then?
> > >
> > > This does look better. It makes sense and I think it will work. I will
> > > look more into it and also test it.
> >
> > Hummm... Looking at it again, I wonder if I can make something like the
> > below work.
> >
> > (depends on the next patch that pulls core_forceidle into core-wide
> > state)
> >
> > That would retain the CFS-cgroup optimization as well, for as long as
> > there are no cookies around.
> >
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -4691,8 +4691,6 @@ pick_next_task(struct rq *rq, struct tas
> >  		return next;
> >  	}
> >
> > -	put_prev_task_balance(rq, prev, rf);
> > -
> >  	smt_mask = cpu_smt_mask(cpu);
> >
> >  	/*
> > @@ -4707,14 +4705,25 @@ pick_next_task(struct rq *rq, struct tas
> >  	 */
> >  	rq->core->core_task_seq++;
> >  	need_sync = !!rq->core->core_cookie;
> > -
> > -	/* reset state */
> > -reset:
> > -	rq->core->core_cookie = 0UL;
> >  	if (rq->core->core_forceidle) {
> >  		need_sync = true;
> >  		rq->core->core_forceidle = false;
> >  	}
> > +
> > +	if (!need_sync) {
> > +		next = __pick_next_task(rq, prev, rf);
>
> This could end up triggering pick_next_task_fair's newidle balancing;
>
> > +		if (!next->core_cookie) {
> > +			rq->core_pick = NULL;
> > +			return next;
> > +		}
>
> .. only to realize here that we have to put_prev the task picked by
> pick_next_task_fair() back because it has a cookie, while the effect of its
> newidle balancing cannot be reverted.
>
> Would that be a problem, as the newly pulled task might be incompatible and
> it would have been better to leave it alone?
>
> TBH, this is a drastic change and we've done a lot of testing with the
> current code and it's looking good.
> I'm a little scared of changing it right now and introducing a regression.
> Can we maybe do this after the existing patches are upstream?

After sleeping on it, I am trying something like the following. Thoughts?

Basically, I call pick_task() in advance. That's mostly all that's different
from your patch:

---8<-----------------------

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 0ce17aa72694..366e5ed84a63 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5000,28 +5000,34 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
 	put_prev_task_balance(rq, prev, rf);
 
 	smt_mask = cpu_smt_mask(cpu);
-
-	/*
-	 * core->core_task_seq, core->core_pick_seq, rq->core_sched_seq
-	 *
-	 * @task_seq guards the task state ({en,de}queues)
-	 * @pick_seq is the @task_seq we did a selection on
-	 * @sched_seq is the @pick_seq we scheduled
-	 *
-	 * However, preemptions can cause multiple picks on the same task set.
-	 * 'Fix' this by also increasing @task_seq for every pick.
-	 */
-	rq->core->core_task_seq++;
 	need_sync = !!rq->core->core_cookie;
 
 	/* reset state */
-reset:
 	rq->core->core_cookie = 0UL;
 	if (rq->core->core_forceidle) {
 		need_sync = true;
 		fi_before = true;
 		rq->core->core_forceidle = false;
 	}
+
+	/*
+	 * Optimize for common case where this CPU has no cookies
+	 * and there are no cookied tasks running on siblings.
+	 */
+	if (!need_sync) {
+		for_each_class(class) {
+			next = class->pick_task(rq);
+			if (next)
+				break;
+		}
+
+		if (!next->core_cookie) {
+			rq->core_pick = NULL;
+			goto done;
+		}
+		need_sync = true;
+	}
+
 	for_each_cpu(i, smt_mask) {
 		struct rq *rq_i = cpu_rq(i);
 
@@ -5039,6 +5045,18 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
 		}
 	}
 
+	/*
+	 * core->core_task_seq, core->core_pick_seq, rq->core_sched_seq
+	 *
+	 * @task_seq guards the task state ({en,de}queues)
+	 * @pick_seq is the @task_seq we did a selection on
+	 * @sched_seq is the @pick_seq we scheduled
+	 *
+	 * However, preemptions can cause multiple picks on the same task set.
+	 * 'Fix' this by also increasing @task_seq for every pick.
+	 */
+	rq->core->core_task_seq++;
+
 	/*
 	 * Try and select tasks for each sibling in decending sched_class
 	 * order.
@@ -5059,40 +5077,8 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
 		 * core.
 		 */
 		p = pick_task(rq_i, class, max);
-		if (!p) {
-			/*
-			 * If there weren't no cookies; we don't need to
-			 * bother with the other siblings.
-			 */
-			if (i == cpu && !need_sync)
-				goto next_class;
-
+		if (!p)
 			continue;
-		}
-
-		/*
-		 * Optimize the 'normal' case where there aren't any
-		 * cookies and we don't need to sync up.
-		 */
-		if (i == cpu && !need_sync) {
-			if (p->core_cookie) {
-				/*
-				 * This optimization is only valid as
-				 * long as there are no cookies
-				 * involved. We may have skipped
-				 * non-empty higher priority classes on
-				 * siblings, which are empty on this
-				 * CPU, so start over.
-				 */
-				need_sync = true;
-				goto reset;
-			}
-
-			next = p;
-			trace_printk("unconstrained pick: %s/%d %lx\n",
-				     next->comm, next->pid, next->core_cookie);
-			goto done;
-		}
 
 		if (!is_task_rq_idle(p))
 			occ++;
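
To make the shape of the fast path easier to follow outside the diff context, below is a minimal standalone sketch in plain userspace C. It is not kernel code: the types and helpers (struct core_state, pick_highest_class_task(), pick_core_wide()) are invented for illustration, and only the control flow mirrors the idea under discussion: when nothing on the core carries a cookie, do an ordinary highest-priority pick, and fall back to the synchronized core-wide selection only if that pick unexpectedly turns out to be cookied.

/*
 * Standalone sketch (userspace C, not kernel code). All names here are
 * hypothetical; only the control flow mirrors the discussed optimization.
 */
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct task {
        const char *comm;
        unsigned long core_cookie;      /* 0 means "no cookie" */
};

struct core_state {
        unsigned long core_cookie;      /* cookie currently enforced core-wide */
        bool core_forceidle;            /* a sibling was forced idle last pick */
};

/* Hypothetical stand-in for walking the sched classes in priority order. */
static struct task *pick_highest_class_task(struct task **runnable, size_t n)
{
        return n ? runnable[0] : NULL;  /* assume runnable[] is priority-sorted */
}

/* Hypothetical stand-in for the full synchronized core-wide selection. */
static struct task *pick_core_wide(struct task **runnable, size_t n)
{
        /* ... cookie matching across all SMT siblings would happen here ... */
        return n ? runnable[0] : NULL;
}

static struct task *pick_next(struct core_state *core,
                              struct task **runnable, size_t n)
{
        bool need_sync = core->core_cookie != 0;

        if (core->core_forceidle) {
                need_sync = true;
                core->core_forceidle = false;
        }

        if (!need_sync) {
                /* Fast path: no cookies core-wide, pick as usual. */
                struct task *next = pick_highest_class_task(runnable, n);

                if (next && !next->core_cookie)
                        return next;
                /* The pick carries a cookie: must synchronize after all. */
        }
        return pick_core_wide(runnable, n);
}

int main(void)
{
        struct task a = { "uncookied", 0 }, b = { "cookied", 0xbeef };
        struct task *rq1[] = { &a }, *rq2[] = { &b };
        struct core_state core = { 0, false };

        printf("%s\n", pick_next(&core, rq1, 1)->comm);  /* fast path */
        printf("%s\n", pick_next(&core, rq2, 1)->comm);  /* falls back */
        return 0;
}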