Message-ID: <44b42058-c465-4d1e-7710-198754efabe4@suse.com>
Date:   Tue, 19 Dec 2017 09:04:56 +0100
From:   Juergen Gross <jgross@...e.com>
To:     Ingo Molnar <mingo@...nel.org>, Eric Biggers <ebiggers3@...il.com>
Cc:     linux-crypto@...r.kernel.org,
        Herbert Xu <herbert@...dor.apana.org.au>,
        "David S . Miller" <davem@...emloft.net>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Jussi Kivilinna <jussi.kivilinna@....fi>, x86@...nel.org,
        linux-kernel@...r.kernel.org, syzkaller-bugs@...glegroups.com,
        Eric Biggers <ebiggers@...gle.com>,
        Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: [PATCH] crypto: x86/twofish-3way - Fix %rbp usage
On 19/12/17 08:54, Ingo Molnar wrote:
> 
> * Eric Biggers <ebiggers3@...il.com> wrote:
> 
>> There may be a small overhead caused by replacing 'xchg REG, REG' with
>> the needed sequence 'mov MEM, REG; mov REG, MEM; mov REG, REG' once per
>> round.  But, counterintuitively, when I tested "ctr-twofish-3way" on a
>> Haswell processor, the new version was actually about 2% faster.
>> (Perhaps 'xchg' is not as well optimized as plain moves.)
> 
> XCHG has implicit LOCK semantics on all x86 CPUs, so that's not a surprising 
> result I think.
Exchanging 2 registers can be done without memory access via:

  xor reg1, reg2
  xor reg2, reg1
  xor reg1, reg2
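
For illustration only, the same three-XOR exchange as a minimal C sketch. The helper name xor_swap is hypothetical and is not part of the patch under discussion. The aliasing check is an assumption needed only for the pointer version: XORing a location with itself would zero it, which cannot happen with two distinct registers.

```c
#include <stdint.h>

/*
 * Hypothetical helper: exchange two 64-bit values using three XORs
 * and no temporary, mirroring the xor reg1/reg2 sequence above.
 */
static void xor_swap(uint64_t *a, uint64_t *b)
{
    if (a == b)     /* same location: XOR with itself would clear it */
        return;

    *a ^= *b;       /* a = a0 ^ b0 */
    *b ^= *a;       /* b = b0 ^ (a0 ^ b0) = a0 */
    *a ^= *b;       /* a = (a0 ^ b0) ^ a0 = b0 */
}
```

Note that each XOR consumes the result of the previous one, so the three instructions form a serial dependency chain. Whether this is faster than plain moves on a given CPU would need to be measured.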
Juergen