Message-ID: <20160509011311.GQ16093@intel.com>
Date: Mon, 9 May 2016 09:13:11 +0800
From: Yuyang Du <yuyang.du@...el.com>
To: Mike Galbraith <mgalbraith@...e.de>
Cc: Peter Zijlstra <peterz@...radead.org>, Chris Mason <clm@...com>,
Ingo Molnar <mingo@...nel.org>,
Matt Fleming <matt@...eblueprint.co.uk>,
linux-kernel@...r.kernel.org
Subject: Re: sched: tweak select_idle_sibling to look for idle threads
On Mon, May 09, 2016 at 09:44:13AM +0200, Mike Galbraith wrote:
> > Then a valid question is whether it is this selection screwed up in case
> > like this, as it should necessarily always be asked.
>
> That's a given, it's just a question of how to do a bit better cheaply.
>
> > > > Regarding wake_wide(), it seems the M:N is 1:24, not 6:6*24, if so,
> > > > the slave will be 0 forever (as last_wakee is never flipped).
> > >
> > > Yeah, it's irrelevant here, this load is all about instantaneous state.
> > > I could use a bit more of that, reserving on the wakeup side won't
> > > help this benchmark until everything else cares. One stack, and it's
> > > game over. It could help generic utilization and latency some.. but it
> > > seems kinda unlikely it'll be worth the cycle expenditure.
> >
> > Yes and no, it depends on how efficient work-stealing is, compared to
> > selection, but remember, at the end of the day, the wakee CPU measures the
> > latency, that CPU does not care it is selected or it steals.
>
> In a perfect world, running only Chris' benchmark on an otherwise idle
> box, there would never _be_ any work to steal.

What would that perfect world look like? I don't get what you mean.

> In the real world, we
> smooth utilization, optimistically peek at this/that, and intentionally
> throttle idle balancing (etc etc), which adds up to an imperfect world
> for this (based on real world load) benchmark.

So, is this a call for these parts to be coordinated better?

> > En... should we try remove recording last_wakee?
>
> The more the merrier, go for it! :)

No, really, this heuristic is too heuristic. :)
The totality of all possible cases is scary.

For a general M:N two-way waker-wakee relationship, not recording
last_wakee may work well in general. E.g., currently, on a 2-socket machine
(24 threads per socket), 1:24 and 1:48 can't really be differentiated,
whereas 1:24 and 2:48 are completely different.
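
To make the 1:N case concrete, here is a rough userspace model of the flip
bookkeeping. It is a sketch, not kernel code: it assumes the
record_wakee()/wake_wide() logic as I read it in kernel/sched/fair.c, leaves
out the periodic decay of wakee_flips, and passes the LLC size in as a plain
parameter. struct task, MAX_WORKERS and simulate_one_to_n() are made up for
illustration.

```c
#include <stdbool.h>

#define MAX_WORKERS 64 /* hypothetical cap for this model only */

/* Simplified stand-in for the two task_struct fields of interest. */
struct task {
    const struct task *last_wakee; /* most recent task this one woke */
    unsigned int wakee_flips;      /* how often last_wakee changed */
};

/* Waker-side bookkeeping: count a flip whenever the wakee changes.
 * The kernel also halves wakee_flips about once a second; omitted here. */
static void record_wakee(struct task *waker, const struct task *wakee)
{
    if (waker->last_wakee != wakee) {
        waker->last_wakee = wakee;
        waker->wakee_flips++;
    }
}

/* Returns true when the wakeup should be spread wide rather than pulled
 * toward the waker. llc_size models the number of CPUs sharing the LLC. */
static bool wake_wide(const struct task *waker, const struct task *wakee,
                      unsigned int llc_size)
{
    unsigned int master = waker->wakee_flips;
    unsigned int slave = wakee->wakee_flips;

    if (master < slave) {
        unsigned int tmp = master;
        master = slave;
        slave = tmp;
    }
    if (slave < llc_size || master < slave * llc_size)
        return false;
    return true;
}

/* 1:N pattern: one master wakes each worker in turn, and each worker only
 * ever wakes the master back. Returns the wake_wide() decision for a
 * master -> worker wakeup after the given number of rounds. */
static bool simulate_one_to_n(unsigned int nr_workers, unsigned int rounds,
                              unsigned int llc_size)
{
    struct task master = {0};
    struct task workers[MAX_WORKERS] = {0};
    unsigned int r, i;

    if (nr_workers > MAX_WORKERS)
        nr_workers = MAX_WORKERS;

    for (r = 0; r < rounds; r++) {
        for (i = 0; i < nr_workers; i++) {
            record_wakee(&master, &workers[i]); /* master flips every time */
            record_wakee(&workers[i], &master); /* worker flips only once */
        }
    }
    return wake_wide(&master, &workers[0], llc_size);
}
```

In this model a worker's wakee_flips never grows past 1 (and with the decay
it would drift to 0), so the "slave < llc_size" test fails for both
simulate_one_to_n(24, ...) and simulate_one_to_n(48, ...) with llc_size = 24,
no matter how large the master's count gets.
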
Am I understanding correctly?