linux-kernel - Re: sched: tweak select_idle_sibling to look for idle threads

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Mon, 09 May 2016 05:45:40 +0200
From:	Mike Galbraith <mgalbraith@...e.de>
To:	Yuyang Du <yuyang.du@...el.com>
Cc:	Peter Zijlstra <peterz@...radead.org>, Chris Mason <clm@...com>,
	Ingo Molnar <mingo@...nel.org>,
	Matt Fleming <matt@...eblueprint.co.uk>,
	linux-kernel@...r.kernel.org
Subject: Re: sched: tweak select_idle_sibling to look for idle threads

On Mon, 2016-05-09 at 02:57 +0800, Yuyang Du wrote:
> On Sun, May 08, 2016 at 10:08:55AM +0200, Mike Galbraith wrote:
> > > Maybe give the criteria a bit margin, not just wakees tend to equal llc_size,
> > > but the numbers are so wild to easily break the fragile condition, like:
> > 
> > Seems lockless traversal and averages just lets multiple CPUs select
> > the same spot.  An atomic reservation (feature) when looking for an
> > idle spot (also for fork) might fix it up.  Run the thing as RT,
> > push/pull ensures that it reaches box saturation regardless of the
> > number of messaging threads, whereas with fair class, any number > 1
> > will certainly stack tasks before the box is saturated.
> 
> Yes, good idea, bringing order to the race to grab idle CPU is absolutely
> helpful.

Well, good ideas work, as yet this one helps jack diddly spit.

> In addition, I would argue maybe beefing up idle balancing is a more
> productive way to spread load, as work-stealing just does what needs
> to be done. And seems it has been (sub-unconsciously) neglected in this
> case, :)
> 
> Regarding wake_wide(), it seems the M:N is 1:24, not 6:6*24, if so,
> the slave will be 0 forever (as last_wakee is never flipped).

Yeah, it's irrelevant here, this load is all about instantaneous state.
 I could use a bit more of that, reserving on the wakeup side won't
help this benchmark until everything else cares.  One stack, and it's
game over.  It could help generic utilization and latency some.. but it
seems kinda unlikely it'll be worth the cycle expenditure.

> Basically whenever a waker has more than 1 wakee, the wakee_flips
> will comfortably grow very large (with last_wakee alternating),
> whereas when a waker has 0 or 1 wakee, the wakee_flips will just be 0.

Yup, it is a heuristic, and like all of those, imperfect.  I've watched
it improving utilization in the wild though, so won't mind that until I
catch it doing really bad things.

> So recording only the last_wakee seems not right unless you have other
> good reason. If not the latter, counting waking wakee times should be
> better, and then allow the statistics to happily play.