Message-ID: <20160209112358.GA500@gmail.com>
Date:	Tue, 9 Feb 2016 12:23:58 +0100
From:	Ingo Molnar <mingo@...nel.org>
To:	Will Deacon <will.deacon@....com>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Boqun Feng <boqun.feng@...il.com>,
	Paul McKenney <paulmck@...ux.vnet.ibm.com>,
	Peter Zijlstra <peterz@...radead.org>,
	"Maciej W. Rozycki" <macro@...tec.com>,
	David Daney <ddaney@...iumnetworks.com>,
	Måns Rullgård <mans@...sr.com>,
	Ralf Baechle <ralf@...ux-mips.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [RFC][PATCH] mips: Fix arch_spin_unlock()

* Will Deacon <will.deacon@....com> wrote:

> On Wed, Feb 03, 2016 at 01:32:10PM +0000, Will Deacon wrote:
> > On Wed, Feb 03, 2016 at 09:33:39AM +0100, Ingo Molnar wrote:
> > > In fact I'd suggest to test this via a quick runtime hack like this in rcupdate.h:
> > > 
> > > 	extern int panic_timeout;
> > > 
> > > 	...
> > > 
> > > 	if (panic_timeout)
> > > 		smp_load_acquire(p);
> > > 	else
> > > 		typeof(*p) *________p1 = (typeof(*p) *__force)lockless_dereference(p);
> > > 
> > > (or so)
> > 
> > So the problem with this is that a LOAD <ctrl> LOAD sequence isn't an
> > ordering hazard on ARM, so you're potentially at the mercy of the branch
> > predictor as to whether you get an acquire. That's not to say it won't
> > be discarded as soon as the conditional is resolved, but it could
> > screw up the benchmarking.
> > 
> > I'd be better off doing some runtime patching, but that's not something
> > I can knock up in a couple of minutes (so I'll add it to my list).
> 
> ... so I actually got that up and running, believe it or not. Filthy stuff.

Wow!

I tried to implement the simpler solution by hacking rcupdate.h, but got drowned 
in nasty circular header file dependencies and gave up...
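[ For readers without a kernel tree handy, the shape of that hack can be
  sketched in user-space C11. This is only an illustration under stated
  assumptions: struct node and load_ptr() are hypothetical names,
  memory_order_acquire stands in for smp_load_acquire(), and
  memory_order_consume stands in for lockless_dereference(). Current compilers
  typically promote consume to acquire, so this shows the structure of the
  switch, not the performance difference. ]

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical payload type, for illustration only. */
struct node {
	int data;
};

/*
 * Load a published pointer with one of two orderings, chosen at run time.
 * The flag plays the role of panic_timeout in the quoted hack.
 */
static struct node *load_ptr(struct node *_Atomic *slot, bool use_acquire)
{
	assert(slot != NULL);

	if (use_acquire)
		/* Stand-in for smp_load_acquire(p). */
		return atomic_load_explicit(slot, memory_order_acquire);

	/* Stand-in for lockless_dereference(p): dependency ordering only. */
	return atomic_load_explicit(slot, memory_order_consume);
}
```

[ The conditional here is exactly the LOAD <ctrl> LOAD pattern Will objected
  to above, which is why patching the instruction at run time measures more
  cleanly than a run-time branch. ]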

If you are not overly embarrassed by posting hacky patches, mind posting your 
solution?

> The good news is that you're right, and I'm now seeing ~1% difference between 
> the runs with ~0.3% noise for either of them. I still think that's significant, 
> but it's a lot more reassuring than 4%.

Hm, for such marginal effects I think we could improve the testing method a
bit: 'perf bench sched messaging' could be extended to allow 'steady state
testing'. Instead of exiting and restarting all the processes between test
iterations, it would continuously measure and print out current performance
figures.

I.e. every 10 seconds it could print a decaying running average of current 
throughput.
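
[ A minimal sketch of such a decaying running average, assuming an
  exponentially weighted form. The names struct decaying_avg,
  decaying_avg_init() and decaying_avg_update() are hypothetical and not part
  of perf; alpha is an assumed tuning knob giving the weight of the newest
  sample. ]

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not an existing perf interface. */
struct decaying_avg {
	double value;	/* current smoothed throughput */
	double alpha;	/* weight of the newest sample, in (0, 1] */
	bool seeded;	/* false until the first sample arrives */
};

static void decaying_avg_init(struct decaying_avg *avg, double alpha)
{
	assert(alpha > 0.0 && alpha <= 1.0);
	avg->value = 0.0;
	avg->alpha = alpha;
	avg->seeded = false;
}

/* Fold one interval's throughput in and return the updated average. */
static double decaying_avg_update(struct decaying_avg *avg, double sample)
{
	if (!avg->seeded) {
		/* The first sample seeds the average directly. */
		avg->value = sample;
		avg->seeded = true;
	} else {
		/* Older samples lose weight by (1 - alpha) on every update. */
		avg->value += avg->alpha * (sample - avg->value);
	}
	return avg->value;
}
```

[ The benchmark loop would call decaying_avg_update() once per reporting
  interval with the throughput measured in that interval and print the
  returned value. A smaller alpha smooths noise more but reacts more slowly
  after the instructions are patched. ]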

That way you could patch/unpatch the instructions without having to restart the
tasks. If you still see an effect (in the numbers reported every 10 seconds), then
it comes from the patched instructions and not from restarting the tasks.

[ We have such functionality in 'perf bench numa' (the --show-convergence option), 
  for similar reasons, to allow runtime monitoring and tweaking of kernel 
  parameters. ]

Thanks,

	Ingo