Message-ID: <1462779853.3803.128.camel@suse.de>
Date:	Mon, 09 May 2016 09:44:13 +0200
From:	Mike Galbraith <mgalbraith@...e.de>
To:	Yuyang Du <yuyang.du@...el.com>
Cc:	Peter Zijlstra <peterz@...radead.org>, Chris Mason <clm@...com>,
	Ingo Molnar <mingo@...nel.org>,
	Matt Fleming <matt@...eblueprint.co.uk>,
	linux-kernel@...r.kernel.org
Subject: Re: sched: tweak select_idle_sibling to look for idle threads

On Mon, 2016-05-09 at 04:22 +0800, Yuyang Du wrote:
> On Mon, May 09, 2016 at 05:45:40AM +0200, Mike Galbraith wrote:
> > On Mon, 2016-05-09 at 02:57 +0800, Yuyang Du wrote:
> > > On Sun, May 08, 2016 at 10:08:55AM +0200, Mike Galbraith wrote:
> > > > > Maybe give the criteria a bit of margin, not just wakees tending
> > > > > to equal llc_size; the numbers are wild enough to easily break
> > > > > that fragile condition, like:
> > > > 
> > > > Seems lockless traversal and averages just let multiple CPUs select
> > > > the same spot.  An atomic reservation (feature) when looking for an
> > > > idle spot (also for fork) might fix it up.  Run the thing as RT, and
> > > > push/pull ensures that it reaches box saturation regardless of the
> > > > number of messaging threads, whereas with the fair class, any
> > > > number > 1 will certainly stack tasks before the box is saturated.
> > > 
> > > Yes, good idea; bringing order to the race to grab an idle CPU is
> > > absolutely helpful.
> > 
> > Well, good ideas work; as yet this one helps jack diddly spit.
> 
> Then a valid question is whether it is this selection that screwed up in
> cases like this; that question should always be asked.

That's a given; it's just a question of how to do a bit better cheaply.
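
(A minimal sketch of the atomic-reservation idea Mike floats above, for
illustration only; this is not actual kernel code.  The function name
select_idle_cpu_reserving() and the per-cpu cpu_reserved flag are
hypothetical, and the flag would have to be cleared again once the wakee
is actually enqueued.)

static DEFINE_PER_CPU(atomic_t, cpu_reserved);	/* hypothetical flag */

static int select_idle_cpu_reserving(struct sched_domain *sd, int target)
{
	int cpu;

	/* Lockless scan for an idle CPU, as today, but claim it. */
	for_each_cpu(cpu, sched_domain_span(sd)) {
		if (!idle_cpu(cpu))
			continue;
		/* cmpxchg gives the race exactly one winner. */
		if (atomic_cmpxchg(&per_cpu(cpu_reserved, cpu), 0, 1) == 0)
			return cpu;	/* caller clears flag at enqueue */
	}
	return target;
}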
 
> > > Regarding wake_wide(), it seems the M:N is 1:24, not 6:6*24; if so,
> > > the slave will be 0 forever (as last_wakee is never flipped).
> > 
> > Yeah, it's irrelevant here; this load is all about instantaneous state.
> > I could use a bit more of that: reserving on the wakeup side won't
> > help this benchmark until everything else cares.  One stack, and it's
> > game over.  It could help generic utilization and latency some... but
> > it seems kinda unlikely it'll be worth the cycle expenditure.
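
(For reference, a condensed sketch of the record_wakee()/wake_wide()
pairing under discussion, approximating the mainline logic of this era:
wakee_flips only increments when a task wakes someone other than its
last_wakee, so a strict 1:1 wake pattern leaves the flip counts, and
hence "slave", at 0, and wake_wide() keeps returning 0.)

static void record_wakee(struct task_struct *p)
{
	/* Decay the flip count roughly once per second. */
	if (time_after(jiffies, current->wakee_flip_decay_ts + HZ)) {
		current->wakee_flips >>= 1;
		current->wakee_flip_decay_ts = jiffies;
	}
	/* Only a change of wakee counts as a flip. */
	if (current->last_wakee != p) {
		current->last_wakee = p;
		current->wakee_flips++;
	}
}

static int wake_wide(struct task_struct *p)
{
	unsigned int master = current->wakee_flips;
	unsigned int slave = p->wakee_flips;
	int factor = this_cpu_read(sd_llc_size);	/* e.g. 24 */

	if (master < slave)
		swap(master, slave);
	/* Go wide only for a genuinely many:many wakeup pattern. */
	if (slave < factor || master < slave * factor)
		return 0;
	return 1;
}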
> 
> Yes and no; it depends on how efficient work-stealing is compared to
> selection.  But remember, at the end of the day, the wakee CPU measures
> the latency, and that CPU does not care whether it is selected or it
> steals.

In a perfect world, running only Chris' benchmark on an otherwise idle
box, there would never _be_ any work to steal.  In the real world, we
smooth utilization, optimistically peek at this and that, and
intentionally throttle idle balancing (etc. etc.), which adds up to an
imperfect world for this benchmark (one based on a real-world load).

> Hmm... should we try removing the recording of last_wakee?

The more the merrier, go for it! :)

	-Mike
