linux-kernel - Re: [PATCH 3/3] sched: terminate newidle balancing once at least one task has moved over

Message-Id: <1214336677.16881.1.camel@twins>
Date:	Tue, 24 Jun 2008 21:44:37 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Gregory Haskins <ghaskins@...ell.com>
Cc:	mingo@...e.hu, rostedt@...dmis.org, tglx@...utronix.de,
	David Bahi <DBahi@...ell.com>, linux-kernel@...r.kernel.org,
	linux-rt-users@...r.kernel.org
Subject: Re: [PATCH 3/3] sched: terminate newidle balancing once atleastone
	task has moved over

On Tue, 2008-06-24 at 10:55 -0600, Gregory Haskins wrote:
> >>> On Tue, Jun 24, 2008 at  9:31 AM, in message <1214314273.4351.34.camel@...ns>,
> Peter Zijlstra <peterz@...radead.org> wrote: 
> > On Tue, 2008-06-24 at 07:18 -0600, Gregory Haskins wrote:
> >> >>> On Tue, Jun 24, 2008 at  6:13 AM, in message 
> > <1214302406.4351.23.camel@...ns>,
> >> Peter Zijlstra <peterz@...radead.org> wrote: 
> >> > On Mon, 2008-06-23 at 17:04 -0600, Gregory Haskins wrote:
> >> >> Inspired by Peter Zijlstra.
> >> >> 
> >> >> Signed-off-by: Gregory Haskins <ghaskins@...ell.com>
> >> >> ---
> >> >> 
> >> >>  kernel/sched.c |    4 ++++
> >> >>  1 files changed, 4 insertions(+), 0 deletions(-)
> >> >> 
> >> >> diff --git a/kernel/sched.c b/kernel/sched.c
> >> >> index 3efbbc5..c8e8520 100644
> >> >> --- a/kernel/sched.c
> >> >> +++ b/kernel/sched.c
> >> >> @@ -2775,6 +2775,10 @@ static int move_tasks(struct rq *this_rq, int this_cpu, struct rq *busiest,
> >> >>  				max_load_move - total_load_moved,
> >> >>  				sd, idle, all_pinned, &this_best_prio);
> >> >>  		class = class->next;
> >> >> +
> >> >> +		if (idle == CPU_NEWLY_IDLE && this_rq->nr_running)
> >> >> +			break;
> >> >> +
> >> >>  	} while (class && max_load_move > total_load_moved);
> >> >>  
> >> >>  	return total_load_moved > 0;
> >> > 
> >> > 
> >> > right,.. uhm, except that you forgot all the other fixes and
> >> > generalizations I had,..
> >> 
> >> Heh...well I intentionally simplified it, but perhaps that is out of
> >> ignorance.  I did say "inspired by" ;)
> >> 
> >> > 
> >> > The LB_START/LB_COMPLETE stuff is needed to fix CFS load balancing. It
> >> > now always iterates the first sysctl_sched_nr_migrate tasks, and if it
> >> > doesn't find any there, just gives up - which isn't too big of a problem
> >> > with it set to 32, but if you drop it to 2/4 stuff starts falling apart.
> >> > 
> >> > And the break I had here, only checks classes above and equal to the
> >> > current class.
> >> > 
> >> > This again is needed when you have more classes.
> >> 
> >> I'm not sure I understand/agree here (unless you plan on having a class
> >> below sched_idle()??)
> >> 
> >> The fact that we are going NEWLYIDLE to me implies that all the other
> >> classes are "above or equal".  And rq->nr_running approximates all the
> >> per-class vtable work that you had done to probe the higher classes.  We
> >> currently only hit this code when rq->nr_running == 0, so
> >> rq->nr_running != 0 seems like a logical termination condition.
> >> 
> >> I guess what I am not clear on is: "when would we be NEWLYIDLE in a higher
> >> class, yet have tasks populated in lower classes such that nr_running is
> >> non-zero".  Additionally, even if we have that condition (e.g. with
> >> something like the EDF work you are doing, perhaps?), shouldn't we patch
> >> the advanced form of this logic when the rest of the code goes in?  For
> >> now, this seems like the most straightforward way to accomplish the goal.
> >> But I could be missing something ;)
> > 
> > The thing I'm worried about - but it might be unfounded and is certainly
> > so now - is that suppose we have:
> > 
> >   EDF
> >   FIFO/RR
> >   SOFTRT
> >   OTHER
> >   IDLE
> > 
> > and we've just done FIFO/RR (which is a nop) and some interrupt woke
> > an OTHER task while we dropped for lockbreak.
> > 
> > At this point your logic would bail out and start running the OTHER
> > task, even though we might have found a SOFTRT task to run had we
> > bothered to look.
> > 
> 
> Ok, now I think I understand your concern.  But I think you may be worrying about
> this at the wrong level.  I would think we should be doing something similar to the
> post-balance patch I submitted a while back.  It basically iterated through each class,
> giving each an opportunity to pull tasks over in its own way.  The difference there
> was that I was doing it post-schedule to deal with that locking issue.  We could
> take the same idea and do it where we pre_schedule() today.
> 
> I think the f_b_g() et al. is really SCHED_OTHER specific, and probably always will be.
> Let's just formalize that.  Perhaps we should move all the LB code to sched_fair and set
> something like what I proposed up.  Thoughts?
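
The two termination conditions discussed in the quoted text can be compared
in a small userspace sketch.  This is not kernel code: struct toy_rq, the
enum, and both helper functions are hypothetical names made up for
illustration, and the per-class counters are an assumption (the real rq only
exposes a single nr_running here).  The class order is the example ordering
given above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical class ordering, highest priority first (from the example above). */
enum toy_class_id {
	CLASS_EDF,
	CLASS_RT,	/* FIFO/RR */
	CLASS_SOFTRT,
	CLASS_OTHER,
	CLASS_IDLE,
	NR_CLASSES
};

/* Toy runqueue: runnable task count per class (an assumption for this sketch). */
struct toy_rq {
	unsigned int nr_running[NR_CLASSES];
};

/* Condition from the patch: stop as soon as anything at all is runnable. */
static bool stop_simple(const struct toy_rq *rq)
{
	unsigned int total = 0;

	for (int i = 0; i < NR_CLASSES; i++)
		total += rq->nr_running[i];
	return total > 0;
}

/*
 * Class-aware variant: stop only if a task is runnable in a class above or
 * equal to the class that was just balanced.
 */
static bool stop_class_aware(const struct toy_rq *rq, enum toy_class_id cur)
{
	assert(cur < NR_CLASSES);

	for (int i = 0; i <= (int)cur; i++)
		if (rq->nr_running[i])
			return true;
	return false;
}
```

With one OTHER task woken right after the FIFO/RR pass, stop_simple() says
stop, while stop_class_aware(rq, CLASS_RT) says keep going, so the SOFTRT
class still gets looked at.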

Right,. generalizing f_b_g() isn't something we should consider, it's
plenty impossible to understand already.

OK, moving everything into _fair sounds like the right approach.
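
A rough userspace sketch of the per-class idea, where each class gets its own
chance to pull tasks, might look like the following.  It is not the
post-balance patch mentioned above: struct toy_class, the pull hook, and
toy_newidle_balance() are hypothetical names, and stopping after the first
class that pulls something is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy runqueue with a single runnable-task counter. */
struct toy_rq {
	unsigned int nr_running;
};

/* Toy scheduling class with a hypothetical per-class pull hook. */
struct toy_class {
	const char *name;
	/* Returns the number of tasks this class pulled; may be NULL. */
	unsigned int (*pull)(struct toy_rq *rq);
	const struct toy_class *next;	/* next lower-priority class */
};

/* Example hook: this class finds nothing to pull. */
static unsigned int toy_pull_none(struct toy_rq *rq)
{
	(void)rq;
	return 0;
}

/* Example hook: this class always pulls exactly one task. */
static unsigned int toy_pull_one(struct toy_rq *rq)
{
	(void)rq;
	return 1;
}

/*
 * Walk the classes from highest to lowest priority and let each one pull in
 * its own way.  Stop at the first class that brings a task over.
 */
static unsigned int toy_newidle_balance(struct toy_rq *rq,
					const struct toy_class *highest)
{
	unsigned int pulled = 0;

	assert(rq != NULL);

	for (const struct toy_class *c = highest; c; c = c->next) {
		if (!c->pull)
			continue;	/* class has no balancing of its own */
		pulled = c->pull(rq);
		rq->nr_running += pulled;
		if (pulled)
			break;
	}
	return pulled;
}
```

Chaining an rt class using toy_pull_none, a fair class using toy_pull_one,
and an idle class with no hook leaves the toy runqueue with one runnable
task, pulled by the fair class.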
