netdev - RE: [PATCH 3/3] powerpc: bpf: implement in-register swap for 64-bit endian operations

Message-ID: <063D6719AE5E284EB5DD2968C1650D6DB026CE48@AcuExch.aculab.com>
Date:   Tue, 24 Jan 2017 16:13:52 +0000
From:   David Laight <David.Laight@...LAB.COM>
To:     "'Naveen N. Rao'" <naveen.n.rao@...ux.vnet.ibm.com>,
        "Benjamin Herrenschmidt" <benh@....ibm.com>
CC:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "linuxppc-dev@...ts.ozlabs.org" <linuxppc-dev@...ts.ozlabs.org>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "daniel@...earbox.net" <daniel@...earbox.net>,
        "ast@...com" <ast@...com>,
        Madhavan Srinivasan <maddy@...ux.vnet.ibm.com>,
        Michael Ellerman <mpe@...erman.id.au>
Subject: RE: [PATCH 3/3] powerpc: bpf: implement in-register swap for 64-bit
 endian operations

From: 'Naveen N. Rao'
> Sent: 23 January 2017 19:22
> On 2017/01/15 09:00AM, Benjamin Herrenschmidt wrote:
> > On Fri, 2017-01-13 at 23:22 +0530, 'Naveen N. Rao' wrote:
> > > > That rather depends on whether the processor has a store to load forwarder
> > > > that will satisfy the read from the store buffer.
> > > > I don't know about ppc, but at least some x86 will do that.
> > >
> > > Interesting - good to know that.
> > >
> > > However, I don't think powerpc does that and in-register swap is likely
> > > faster regardless. Note also that gcc prefers this form at higher
> > > optimization levels.
> >
> > Of course powerpc has a load-store forwarder these days, however, I
> > wouldn't be surprised if the in-register form was still faster on some
> > implementations, but this needs to be tested.
> 
> Thanks for clarifying! To test this, I wrote a simple (perhaps naive)
> test that just issues a whole lot of endian swaps and in _that_ test, it
> does look like the load-store forwarder is doing pretty well.
...
> This is all in a POWER8 vm. On POWER7, the in-register variant is around
> 4 times faster than the ldbrx variant.
...

I wonder which is faster on the little 1GHz embedded ppc we use here.

	David
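
For readers following the thread, the two approaches being compared can
be sketched in portable C. This is only an illustration, not the code
from the patch: the function names are made up here, and the memory
variant merely mimics the store followed by a byte-reversed load (ldbrx)
that the thread mentions. What a compiler or the BPF JIT actually emits
on a given powerpc core is not shown.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * In-register form: shifts and masks only, no memory traffic.
 * Swaps 32-bit halves, then 16-bit pairs, then adjacent bytes.
 */
static inline uint64_t swap64_in_register(uint64_t x)
{
	x = (x << 32) | (x >> 32);
	x = ((x & 0x0000ffff0000ffffULL) << 16) |
	    ((x >> 16) & 0x0000ffff0000ffffULL);
	x = ((x & 0x00ff00ff00ff00ffULL) << 8) |
	    ((x >> 8) & 0x00ff00ff00ff00ffULL);
	return x;
}

/*
 * Memory round trip: write the value out, read the bytes back in
 * reverse order. A portable stand-in (assumption) for "store, then
 * byte-reversed load"; its speed on real hardware depends on
 * store-to-load forwarding.
 */
static inline uint64_t swap64_via_memory(uint64_t x)
{
	unsigned char in[sizeof(x)], out[sizeof(x)];
	uint64_t r;
	size_t i;

	memcpy(in, &x, sizeof(x));
	for (i = 0; i < sizeof(x); i++)
		out[i] = in[sizeof(x) - 1 - i];
	memcpy(&r, out, sizeof(r));
	return r;
}

/* Hypothetical helper: checks that both forms agree. */
static inline void swap64_selfcheck(void)
{
	uint64_t v = 0x0102030405060708ULL;

	assert(swap64_in_register(v) == 0x0807060504030201ULL);
	assert(swap64_via_memory(v) == swap64_in_register(v));
}
```

Both functions return the same value on any host byte order, so timing
each in a tight loop is one rough way to compare the two forms on a
particular machine, with the caveat raised above that such a test may
be naive.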