Date:   Thu, 28 Jun 2018 17:47:01 +0100
From:   Will Deacon <will.deacon@....com>
To:     Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc:     linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Arnd Bergmann <arnd@...db.de>,
        Peter Zijlstra <peterz@...radead.org>,
        "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        Boqun Feng <boqun.feng@...il.com>,
        Catalin Marinas <catalin.marinas@....com>,
        peter maydell <peter.maydell@...aro.org>,
        Mark Rutland <mark.rutland@....com>
Subject: Re: [PATCH 3/3] rseq/selftests: Add support for arm64

Hi Mathieu,

On Tue, Jun 26, 2018 at 12:11:52PM -0400, Mathieu Desnoyers wrote:
> ----- On Jun 26, 2018, at 11:14 AM, Will Deacon will.deacon@....com wrote:
> > On Mon, Jun 25, 2018 at 02:10:10PM -0400, Mathieu Desnoyers wrote:
> >> I notice you are using the instructions
> >> 
> >>   adrp
> >>   add
> >>   str
> >> 
> >> to implement RSEQ_ASM_STORE_RSEQ_CS(). Did you compare
> >> performance-wise with an approach using a literal pool
> >> near the instruction pointer like I did on arm32 ?
> > 
> > I didn't, no. Do you have a benchmark to hand so I can give this a go?
> 
> see tools/testing/selftests/rseq/param_test_benchmark --help
> 
> It's a stripped-down version of param_test, without all the code for
> delay loops and testing checks.
> 
> Example use for counter increment with 4 threads, doing 5G counter
> increments per thread:
> 
> time ./param_test_benchmark -T i -t 4 -r 5000000000

Thanks. I ran that on a few arm64 systems I have access to, with three
configurations of the selftest:

1. As I posted
2. With the abort signature and branch inlined, so as to avoid the CBNZ
   address limitations in large codebases
3. With both the abort handler and the table inlined (i.e. the same thing
   as 32-bit).

There isn't a reliably measurable difference between (1) and (2), but I take
a hit of between 12% and 27% going from (2) to (3).

So I'll post a v2 based on (2).
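
For anyone wanting to reproduce this kind of comparison, here is a minimal
sketch of how a relative hit between two configurations could be computed
from repeated wall-clock timings of param_test_benchmark. It assumes the
"hit" is read as an increase in elapsed time, which the thread does not
spell out. The function name relative_hit and the timing values are
hypothetical placeholders, not measurements from this thread.

```python
import statistics


def relative_hit(baseline_secs, candidate_secs):
    """Return the percentage slowdown of candidate relative to baseline.

    Each argument is a list of elapsed times in seconds from repeated
    runs; the median is used to dampen run-to-run noise.
    """
    base = statistics.median(baseline_secs)
    cand = statistics.median(candidate_secs)
    return (cand - base) / base * 100.0


if __name__ == "__main__":
    # Hypothetical placeholder timings (seconds), for illustration only.
    config2 = [10.0, 10.2, 9.9]   # abort signature and branch inlined
    config3 = [12.1, 12.0, 12.3]  # abort handler and table both inlined
    print(f"(3) vs (2): {relative_hit(config2, config3):.1f}% slower")
```

Taking the median over several runs is one simple way to judge whether a
difference is reliably measurable; a single run of each configuration
would not show that.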

Will
