Date:	Tue, 9 Feb 2016 12:23:58 +0100
From:	Ingo Molnar <mingo@...nel.org>
To:	Will Deacon <will.deacon@....com>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Boqun Feng <boqun.feng@...il.com>,
	Paul McKenney <paulmck@...ux.vnet.ibm.com>,
	Peter Zijlstra <peterz@...radead.org>,
	"Maciej W. Rozycki" <macro@...tec.com>,
	David Daney <ddaney@...iumnetworks.com>,
	Måns Rullgård <mans@...sr.com>,
	Ralf Baechle <ralf@...ux-mips.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [RFC][PATCH] mips: Fix arch_spin_unlock()


* Will Deacon <will.deacon@....com> wrote:

> On Wed, Feb 03, 2016 at 01:32:10PM +0000, Will Deacon wrote:
> > On Wed, Feb 03, 2016 at 09:33:39AM +0100, Ingo Molnar wrote:
> > > In fact I'd suggest to test this via a quick runtime hack like this in rcupdate.h:
> > > 
> > > 	extern int panic_timeout;
> > > 
> > > 	...
> > > 
> > > 	if (panic_timeout)
> > > 		smp_load_acquire(p);
> > > 	else
> > > 		typeof(*p) *________p1 = (typeof(*p) *__force)lockless_dereference(p);
> > > 
> > > (or so)
> > 
> > So the problem with this is that a LOAD <ctrl> LOAD sequence isn't an
> > ordering hazard on ARM, so you're potentially at the mercy of the branch
> > predictor as to whether you get an acquire. That's not to say it won't
> > be discarded as soon as the conditional is resolved, but it could
> > screw up the benchmarking.
> > 
> > I'd be better off doing some runtime patching, but that's not something
> > I can knock up in a couple of minutes (so I'll add it to my list).
> 
> ... so I actually got that up and running, believe it or not. Filthy stuff.

Wow!

I tried to implement the simpler solution by hacking rcupdate.h, but got drowned 
in nasty circular header file dependencies and gave up...

If you are not overly embarrassed by posting hacky patches, mind posting your 
solution?

> The good news is that you're right, and I'm now seeing ~1% difference between 
> the runs with ~0.3% noise for either of them. I still think that's significant, 
> but it's a lot more reassuring than 4%.

Hm, so for such marginal effects I think we could improve the testing method a 
bit: we could extend 'perf bench sched messaging' to allow 'steady state 
testing', i.e. to not exit+restart all the processes between test iterations, but 
to continuously measure and print out current performance figures.

I.e. every 10 seconds it could print a decaying running average of current 
throughput.

That way you could patch/unpatch the instructions without having to restart the 
tasks. If you still see an effect (in the numbers reported every 10 seconds), then 
that's a guaranteed result.

[ We have such functionality in 'perf bench numa' (the --show-convergence option), 
  for similar reasons, to allow runtime monitoring and tweaking of kernel 
  parameters. ]
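
A minimal sketch of what such a decaying running average could look like, 
assuming a simple exponentially weighted average. The names (struct decay_avg, 
decay_avg_update(), decay_avg_report()) and the weight parameter are 
hypothetical and are not existing 'perf bench' code:

```c
#include <stdio.h>

/* Hypothetical state for a decaying (exponentially weighted) average. */
struct decay_avg {
	double value;	/* current running average */
	int primed;	/* 0 until the first sample has been seen */
};

/*
 * Fold one new sample into the average. 'weight' is the share given to
 * the newest sample (0 < weight <= 1); older samples decay geometrically.
 * The first sample seeds the average directly.
 */
static double decay_avg_update(struct decay_avg *avg, double sample,
			       double weight)
{
	if (!avg->primed) {
		avg->value = sample;
		avg->primed = 1;
	} else {
		avg->value = weight * sample + (1.0 - weight) * avg->value;
	}
	return avg->value;
}

/*
 * Print the throughput of the last interval plus the decayed average.
 * 'msgs' is the number of messages completed in 'interval_sec' seconds.
 */
static void decay_avg_report(struct decay_avg *avg, unsigned long msgs,
			     double interval_sec, double weight)
{
	double rate = (double)msgs / interval_sec;

	printf("%.1f msgs/sec (decayed avg: %.1f)\n",
	       rate, decay_avg_update(avg, rate, weight));
}
```

The benchmark loop would call decay_avg_report() once per reporting interval 
(say every 10 seconds) without tearing down the worker tasks. A smaller weight 
smooths out noise at the cost of reacting more slowly after the instructions 
are patched or unpatched.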

Thanks,

	Ingo
