Date:   Fri, 23 Oct 2020 13:57:24 -0400
From:   Joel Fernandes <joel@...lfernandes.org>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Nishanth Aravamudan <naravamudan@...italocean.com>,
        Julien Desfossez <jdesfossez@...italocean.com>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Vineeth Pillai <viremana@...ux.microsoft.com>,
        Aaron Lu <aaron.lwe@...il.com>,
        Aubrey Li <aubrey.intel@...il.com>, tglx@...utronix.de,
        linux-kernel@...r.kernel.org, mingo@...nel.org,
        torvalds@...ux-foundation.org, fweisbec@...il.com,
        keescook@...omium.org, kerrnel@...gle.com,
        Phil Auld <pauld@...hat.com>,
        Valentin Schneider <valentin.schneider@....com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
        Paolo Bonzini <pbonzini@...hat.com>, vineeth@...byteword.org,
        Chen Yu <yu.c.chen@...el.com>,
        Christian Brauner <christian.brauner@...ntu.com>,
        Agata Gruza <agata.gruza@...el.com>,
        Antonio Gomez Iglesias <antonio.gomez.iglesias@...el.com>,
        graf@...zon.com, konrad.wilk@...cle.com, dfaggioli@...e.com,
        pjt@...gle.com, rostedt@...dmis.org, derkling@...gle.com,
        benbjiang@...cent.com,
        Alexandre Chartre <alexandre.chartre@...cle.com>,
        James.Bottomley@...senpartnership.com, OWeisse@...ch.edu,
        Dhaval Giani <dhaval.giani@...cle.com>,
        Junaid Shahid <junaids@...gle.com>, jsbarnes@...gle.com,
        chris.hyser@...cle.com,
        Vineeth Remanan Pillai <vpillai@...italocean.com>,
        Aaron Lu <aaron.lu@...ux.alibaba.com>,
        Aubrey Li <aubrey.li@...ux.intel.com>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Tim Chen <tim.c.chen@...el.com>
Subject: Re: [PATCH v8 -tip 06/26] sched: Add core wide task selection and
 scheduling.

On Fri, Oct 23, 2020 at 03:54:00PM +0200, Peter Zijlstra wrote:
> On Fri, Oct 23, 2020 at 03:51:29PM +0200, Peter Zijlstra wrote:
> > On Mon, Oct 19, 2020 at 09:43:16PM -0400, Joel Fernandes (Google) wrote:
> > > +			/*
> > > +			 * If this sibling doesn't yet have a suitable task to
> > > +			 * run; ask for the most elegible task, given the
> > > +			 * highest priority task already selected for this
> > > +			 * core.
> > > +			 */
> > > +			p = pick_task(rq_i, class, max);
> > > +			if (!p) {
> > > +				/*
> > > +				 * If there weren't no cookies; we don't need to
> > > +				 * bother with the other siblings.
> > > +				 * If the rest of the core is not running a tagged
> > > +				 * task, i.e.  need_sync == 0, and the current CPU
> > > +				 * which called into the schedule() loop does not
> > > +				 * have any tasks for this class, skip selecting for
> > > +				 * other siblings since there's no point. We don't skip
> > > +				 * for RT/DL because that could make CFS force-idle RT.
> > > +				 */
> > > +				if (i == cpu && !need_sync && class == &fair_sched_class)
> > > +					goto next_class;
> > > +
> > > +				continue;
> > > +			}
> > 
> > I'm failing to understand the class == &fair_sched_class bit.

The last line in the comment explains it: "We don't skip for RT/DL because
that could make CFS force-idle RT."

Even if need_sync == false, we need to go look at other CPUs (non-local
CPUs) to see if they could be running RT.

Say the RQs in a particular core look like this, where CFS1 and CFS2 are
two tagged CFS tasks and RT1 is an untagged RT task:

rq0             rq1
CFS1 (tagged)   RT1 (untagged)
CFS2 (tagged)

Say schedule() runs on rq0. It will enter the above loop, and
pick_task(RT) will return NULL for 'p'. Without the class check, it will
enter the above if() block, see that need_sync == false, and skip RT entirely.

The end result of the selection will be (say prio(CFS1) > prio(CFS2)):
rq0		rq1
CFS1		IDLE

When it should have selected:
rq0		rq1
IDLE		RT1

I saw this issue in real-world use cases on ChromeOS, where an RT task gets
constantly force-idled, which breaks RT. The "class == &fair_sched_class" bit
cures it.
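
To make the scenario above concrete, here is a minimal userspace sketch of the
selection loop. It is a toy model, not the kernel code: every name in it
(toy_rq, toy_task, toy_pick_core, the fair_only_skip switch) is made up for
illustration, each runqueue holds at most one candidate per class (so CFS2 is
not modelled), and cookie matching is reduced to "the first task picked becomes
max, anything with a different cookie stays idle".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
enum toy_class { TOY_RT, TOY_FAIR, TOY_NR_CLASSES }; /* highest class first */

#define TOY_NR_CPUS 2

struct toy_task {
	const char *name;
	unsigned long cookie;	/* 0 means untagged */
};

struct toy_rq {
	/* Best runnable task of each class on this CPU, or NULL if none. */
	const struct toy_task *best[TOY_NR_CLASSES];
};

/*
 * Toy core-wide pick, called from 'cpu'.  picked[i] ends up as the task
 * chosen for sibling i, or NULL if that sibling is left idle.
 *
 * fair_only_skip == true:  the early skip is taken only for the fair class
 *                          (the behaviour defended in this mail).
 * fair_only_skip == false: the early skip is taken for any class.
 */
void toy_pick_core(const struct toy_rq rq[TOY_NR_CPUS], int cpu,
		   bool need_sync, bool fair_only_skip,
		   const struct toy_task *picked[TOY_NR_CPUS])
{
	const struct toy_task *max = NULL;

	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	for (int i = 0; i < TOY_NR_CPUS; i++)
		picked[i] = NULL;

	for (int cls = 0; cls < TOY_NR_CLASSES; cls++) {
		for (int n = 0; n < TOY_NR_CPUS; n++) {
			int i = (cpu + n) % TOY_NR_CPUS; /* local CPU first */
			const struct toy_task *p;

			if (picked[i])
				continue;

			p = rq[i].best[cls];
			if (!p) {
				if (i == cpu && !need_sync &&
				    (!fair_only_skip || cls == TOY_FAIR))
					break;	/* stands in for "goto next_class" */
				continue;
			}

			if (!max)
				max = p; /* first pick is the highest priority one */

			/* Simplification: a cookie mismatch leaves the CPU idle. */
			if (p->cookie == max->cookie)
				picked[i] = p;
		}
	}
}
```

Fed rq0 = { fair: CFS1 (cookie 1) } and rq1 = { rt: RT1 (cookie 0) } with
cpu == 0 and need_sync == false, the model leaves rq1 idle and runs CFS1 on rq0
when the skip applies to every class, and picks RT1 on rq1 with rq0 idle when
the skip is limited to the fair class.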

> > IIRC the condition is such that the core doesn't have a cookie (we don't
> > need to sync the threads) so we'll only do a pick for our local CPU.
> > 
> > That should be invariant of class.
> 
> That is; it should be the exact counterpart of this bit:
> 
> > +			/*
> > +			 * Optimize the 'normal' case where there aren't any
> > +			 * cookies and we don't need to sync up.
> > +			 */
> > +			if (i == cpu && !need_sync && !p->core_cookie) {
> > +				next = p;
> > +				goto done;
> > +			}
> 
> If there is no task found in this class, try the next class, if there
> is, we done.

That's OK, but we cannot skip the RT class on other CPUs.

thanks,

 - Joel
