Message-ID: <20171219173533.25evvqns4tlxztzj@gmail.com>
Date: Tue, 19 Dec 2017 18:35:33 +0100
From: Ingo Molnar <mingo@...nel.org>
To: Eric Biggers <ebiggers3@...il.com>
Cc: linux-crypto@...r.kernel.org,
Herbert Xu <herbert@...dor.apana.org.au>,
"David S . Miller" <davem@...emloft.net>,
Josh Poimboeuf <jpoimboe@...hat.com>,
Jussi Kivilinna <jussi.kivilinna@....fi>, x86@...nel.org,
linux-kernel@...r.kernel.org, syzkaller-bugs@...glegroups.com,
Eric Biggers <ebiggers@...gle.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: [PATCH] crypto: x86/twofish-3way - Fix %rbp usage
* Ingo Molnar <mingo@...nel.org> wrote:
>
> * Eric Biggers <ebiggers3@...il.com> wrote:
>
> > There may be a small overhead caused by replacing 'xchg REG, REG' with
> > the needed sequence 'mov MEM, REG; mov REG, MEM; mov REG, REG' once per
> > round. But, counterintuitively, when I tested "ctr-twofish-3way" on a
> > Haswell processor, the new version was actually about 2% faster.
> > (Perhaps 'xchg' is not as well optimized as plain moves.)
>
> XCHG has implicit LOCK semantics on all x86 CPUs, so that's not a surprising
> result I think.

Correction: I think XCHG only implies LOCK if there's a memory operand involved;
register-register XCHG should not imply any barriers.

So the result is indeed unintuitive.
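
To make the distinction concrete, here is a rough sketch in C with GCC-style
inline asm. The helper names (swap_xchg_regs, swap_movs, xchg_mem) are made up
for illustration and are not taken from the patch; on non-x86-64 builds the asm
is replaced by plain C so the sketch still compiles:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: swap two values with a register-register XCHG.
 * Both operands are constrained to registers ("+r"), so no memory
 * operand is involved and no implicit LOCK should apply. */
static void swap_xchg_regs(uint64_t *a, uint64_t *b)
{
	uint64_t x = *a, y = *b;
#if defined(__x86_64__) && defined(__GNUC__)
	__asm__ ("xchg %0, %1" : "+r" (x), "+r" (y));
#else
	uint64_t t = x; x = y; y = t;	/* portable fallback */
#endif
	*a = x;
	*b = y;
}

/* Hypothetical helper: the same swap done with plain moves via a temporary. */
static void swap_movs(uint64_t *a, uint64_t *b)
{
	uint64_t t = *a;
	*a = *b;
	*b = t;
}

/* Hypothetical helper: XCHG with a memory operand ("+m"). This is the form
 * that is implicitly locked on x86. Returns the old value of *p. */
static uint64_t xchg_mem(uint64_t *p, uint64_t v)
{
#if defined(__x86_64__) && defined(__GNUC__)
	__asm__ volatile ("xchg %0, %1" : "+r" (v), "+m" (*p));
	return v;
#else
	uint64_t old = *p;		/* fallback: not atomic */
	*p = v;
	return old;
#endif
}

/* Sanity check: all three helpers exchange values as expected. */
static void check_swaps(void)
{
	uint64_t a = 1, b = 2, m = 5;

	swap_xchg_regs(&a, &b);
	assert(a == 2 && b == 1);
	swap_movs(&a, &b);
	assert(a == 1 && b == 2);
	assert(xchg_mem(&m, 9) == 5 && m == 9);
}
```

This only illustrates the operand forms; it says nothing about relative speed,
which would need to be measured on the CPU in question.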
Thanks,
Ingo