Message-Id: <1208333949.4322.5.camel@caritas-dev.intel.com>
Date:	Wed, 16 Apr 2008 16:19:09 +0800
From:	"Huang, Ying" <ying.huang@...el.com>
To:	Sebastian Siewior <linux-crypto@...breakpoint.cc>
Cc:	Herbert Xu <herbert@...dor.apana.org.au>,
	"Adam J. Richter" <adam@...drasil.com>,
	Alexander Kjeldaas <astor@...t.no>, akpm@...ux-foundation.org,
	linux-kernel@...r.kernel.org, linux-crypto@...r.kernel.org,
	mingo@...e.hu, tglx@...utronix.de
Subject: Re: [PATCH -mm crypto] AES: x86_64 asm implementation optimization


On Wed, 2008-04-16 at 09:31 +0200, Sebastian Siewior wrote:
> * Huang, Ying | 2008-04-09 14:41:02 [+0800]:
> 
> >This patch improves the performance of the AES x86-64 implementation.
> >The average gain is more than 6.3% and the maximum gain is
> >more than 10.2% on an Intel Core 2 CPU. The gain comes from
> >the following changes:
> >
> >- Two additional temporary registers are used to hold the subset of
> >  the state, so that the dependency between instructions is reduced.
> >
> >- The expanded key is loaded via two 64-bit loads instead of four 32-bit loads.
> >
> 
> From your description I would assume that the performance can only
> increase. However, on my
> |model name      : AMD Athlon(tm) 64 Processor 3200+
> the opposite is the case [1], [2]. I dunno why and I didn't mix up
> patched & unpatched :). I checked this patch on

Hmm. I have no AMD machine, so I have not tested the patch on one. Maybe
there are pipeline or load/store unit differences between Intel and
AMD CPUs. Tomorrow I can split the patch into a set of small patches,
one patch per step. Could you help me test these patches to find out
the reason for the degradation on the AMD CPU?

> |model name      : Intel(R) Core(TM)2 CPU         T7200  @ 2.00GHz
> and the performance really increases [3], [4].
> 
> [1] http://download.breakpoint.cc/aes_patch/patched.txt
> [2] http://download.breakpoint.cc/aes_patch/unpatched.txt
> [3] http://download.breakpoint.cc/aes_patch/perf_patched.txt
> [4] http://download.breakpoint.cc/aes_patch/perf_originall.txt
> 
> >---
> > arch/x86/crypto/aes-x86_64-asm_64.S |  101 ++++++++++++++++++++----------------
> > include/crypto/aes.h                |    1 
> > 2 files changed, 58 insertions(+), 44 deletions(-)
> >
> >--- a/include/crypto/aes.h
> >+++ b/include/crypto/aes.h
> >@@ -19,6 +19,7 @@
> > 
> > struct crypto_aes_ctx {
> > 	u32 key_length;
> >+	u32 _pad1;
> 
> Why is this pad required? Do you want special alignment of the keys?

Because this patch loads the key with 64-bit loads, I want the key
to start at a 64-bit aligned address.
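
To illustrate the effect of the pad, here is a minimal user-space sketch
(not part of the patch). The names ctx_unpadded, ctx_padded and
offset_is_8byte_aligned are hypothetical, and the value 60 for
AES_MAX_KEYLENGTH_U32 is assumed. It also assumes the context itself
starts at an 8-byte aligned address, which the struct layout alone does
not guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value: 15 round keys * 16 bytes / sizeof(u32). */
#define AES_MAX_KEYLENGTH_U32 60

typedef uint32_t u32;

/* Hypothetical copy of the layout before the patch. */
struct ctx_unpadded {
	u32 key_length;
	u32 key_enc[AES_MAX_KEYLENGTH_U32];
	u32 key_dec[AES_MAX_KEYLENGTH_U32];
};

/* Hypothetical copy of the layout after the patch. */
struct ctx_padded {
	u32 key_length;
	u32 _pad1;	/* pushes key_enc from offset 4 to offset 8 */
	u32 key_enc[AES_MAX_KEYLENGTH_U32];
	u32 key_dec[AES_MAX_KEYLENGTH_U32];
};

/* Returns nonzero if a byte offset is a multiple of 8. */
static inline int offset_is_8byte_aligned(size_t off)
{
	return (off % 8) == 0;
}

/* Without the pad, key_enc sits right after the 4-byte key_length. */
static_assert(offsetof(struct ctx_unpadded, key_enc) == 4,
	      "unpadded key_enc starts at offset 4");

/* With the pad, both key arrays start on an 8-byte boundary. */
static_assert(offsetof(struct ctx_padded, key_enc) == 8,
	      "padded key_enc starts at offset 8");
static_assert(offsetof(struct ctx_padded, key_dec) % 8 == 0,
	      "padded key_dec starts on an 8-byte boundary");
```

Each array is 240 bytes under the assumed size, so key_dec keeps the
same alignment as key_enc in both layouts. Every 16-byte round key then
begins on an 8-byte boundary only in the padded layout.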

> > 	u32 key_enc[AES_MAX_KEYLENGTH_U32];
> > 	u32 key_dec[AES_MAX_KEYLENGTH_U32];
> > };
> >

Best Regards,
Huang Ying

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
