Message-Id: <1484492458.11927.17.camel@au1.ibm.com>
Date: Sun, 15 Jan 2017 09:00:58 -0600
From: Benjamin Herrenschmidt <benh@....ibm.com>
To: "'Naveen N. Rao'" <naveen.n.rao@...ux.vnet.ibm.com>,
David Laight <David.Laight@...LAB.COM>
Cc: "daniel@...earbox.net" <daniel@...earbox.net>,
"ast@...com" <ast@...com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linuxppc-dev@...ts.ozlabs.org" <linuxppc-dev@...ts.ozlabs.org>,
"davem@...emloft.net" <davem@...emloft.net>
Subject: Re: [PATCH 3/3] powerpc: bpf: implement in-register swap for 64-bit
endian operations
On Fri, 2017-01-13 at 23:22 +0530, 'Naveen N. Rao' wrote:
> > That rather depends on whether the processor has a store to load forwarder
> > that will satisfy the read from the store buffer.
> > I don't know about ppc, but at least some x86 will do that.
>
> Interesting - good to know that.
>
> However, I don't think powerpc does that and in-register swap is likely
> faster regardless. Note also that gcc prefers this form at higher
> optimization levels.
Of course powerpc has a load-store forwarder these days. However, I
wouldn't be surprised if the in-register form were still faster on some
implementations. This needs to be tested.
Ideally, you'd want to try to "optimize" load+swap or swap+store
though.
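
For reference, here is a minimal C sketch of the two forms being compared
in this thread: a swap that bounces the value through memory, and a swap
done entirely in registers with shifts and masks. The function names are
made up for illustration and this is not the code the JIT emits. It only
shows that the two approaches compute the same result; which one is
faster on a given implementation still has to be measured.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: swap by storing the value and reading the bytes
 * back in reverse order. The reload depends on the preceding store,
 * which is where store-to-load forwarding matters. */
static inline uint64_t bswap64_via_memory(uint64_t v)
{
    unsigned char in[8], out[8];
    uint64_t r;

    memcpy(in, &v, sizeof(in));      /* store the value */
    for (int i = 0; i < 8; i++)
        out[i] = in[7 - i];          /* reverse the byte order */
    memcpy(&r, out, sizeof(r));      /* load the swapped value back */
    return r;
}

/* Hypothetical helper: swap in registers only, no memory round trip. */
static inline uint64_t bswap64_in_register(uint64_t v)
{
    /* swap adjacent bytes */
    v = ((v & 0x00ff00ff00ff00ffULL) << 8) |
        ((v >> 8) & 0x00ff00ff00ff00ffULL);
    /* swap adjacent 16-bit halves */
    v = ((v & 0x0000ffff0000ffffULL) << 16) |
        ((v >> 16) & 0x0000ffff0000ffffULL);
    /* swap the two 32-bit words */
    return (v << 32) | (v >> 32);
}

/* Hypothetical self-check: both forms must agree for any input. */
static inline void check_swaps_agree(uint64_t v)
{
    assert(bswap64_via_memory(v) == bswap64_in_register(v));
}
```

A compiler may well turn either function into the same instructions, so
any real comparison has to be made on the generated code rather than on
this source.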
Cheers,
Ben.