lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 17 Jul 2018 00:40:19 +1000
From:   Michael Ellerman <mpe@...erman.id.au>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Paul McKenney <paulmck@...ux.vnet.ibm.com>,
        Alan Stern <stern@...land.harvard.edu>,
        andrea.parri@...rulasolutions.com,
        Will Deacon <will.deacon@....com>,
        Akira Yokosawa <akiyks@...il.com>,
        Boqun Feng <boqun.feng@...il.com>,
        Daniel Lustig <dlustig@...dia.com>,
        David Howells <dhowells@...hat.com>,
        Jade Alglave <j.alglave@....ac.uk>,
        Luc Maranget <luc.maranget@...ia.fr>,
        Nick Piggin <npiggin@...il.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire

Peter Zijlstra <peterz@...radead.org> writes:
> On Fri, Jul 13, 2018 at 11:15:26PM +1000, Michael Ellerman wrote:
...
>> 
>> 
>> So 18-32% slower, or 23-47 cycles.
>
> Very good info. Note that another option is to put the SYNC in lock() it
> doesn't really matter which of the two primitives gets it. I don't
> suppose it really matters for timing either way around.

If the numbers can be trusted it is actually slower to put the sync in
lock, at least on one of the machines:

              Time
lwsync_sync   84,932,987,977
sync_lwsync   93,185,930,333

On the other machine it's slower but only by 0.1%, so that's slightly
weird.

The other advantage of putting the sync in unlock is we could get rid of
our SYNC_IO logic, which conditionally puts a sync in unlock to order
IO accesses vs unlock.


>> Next week I can do some macro benchmarks, to see if it's actually
>> detectable at all.


I guess arguably it's not a very macro benchmark, but we have a
context_switch benchmark in the tree[1] which we often use to tune
things, and it degrades badly. It just spins up two threads and has them
ping-pong using yield.


The numbers are context switch iterations, so more == better.

	   | Before     | After      | Change     | Change %
	   +------------+------------+------------+----------
	   | 35,601,160 | 32,371,164 | -3,229,996 | -9.07%
	   | 35,762,126 | 32,438,798 | -3,323,328 | -9.29%
	   | 35,690,870 | 32,353,676 | -3,337,194 | -9.35%
	   | 35,440,346 | 32,336,750 | -3,103,596 | -8.76%
	   | 35,614,868 | 32,676,378 | -2,938,490 | -8.25%
	   | 35,659,690 | 32,462,624 | -3,197,066 | -8.97%
	   | 35,594,058 | 32,403,922 | -3,190,136 | -8.96%
	   | 35,682,682 | 32,353,146 | -3,329,536 | -9.33%
	   | 35,954,454 | 32,306,168 | -3,648,286 | -10.15%
	   | 35,849,314 | 32,291,094 | -3,558,220 | -9.93%
 ----------+------------+------------+------------+----------
 Average   | 35,684,956 | 32,399,372 | -3,285,584 | -9.21%
 Std Dev   |    143,877 |    111,385 |
 Std Dev % |      0.40% |      0.34% |

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/powerpc/benchmarks/context_switch.c


I'll do some kernbench runs tomorrow and see if it shows up there.


>> My personal preference would be to switch to sync, we don't want to be
>> the only arch finding (or not finding!) exotic ordering bugs.
>> 
>> But we'd also rather not make our slow locks any slower than they have
>> to be.
>
> I completely understand, but I'll get you beer (lots) if you do manage
> to make SYNC happen :-) :-)

Just so we're clear Fosters is not beer :)

cheers

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ