Message-Id: <200905112330.20509.tobias.doerffel@gmail.com>
Date: Mon, 11 May 2009 23:30:19 +0200
From: Tobias Doerffel <tobias.doerffel@...il.com>
To: Andi Kleen <andi@...stfloor.org>
Cc: LKML <linux-kernel@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Arjan van de Ven <arjan@...radead.org>,
Suresh Siddha <suresh.b.siddha@...el.com>,
"Pallipadi, Venkatesh" <venkatesh.pallipadi@...el.com>,
Ingo Molnar <mingo@...e.hu>, Willy Tarreau <w@....eu>
Subject: Re: Specific support for Intel Atom architecture
Hi,
Thanks for your comments. I have addressed some of your remarks and attached a new patch.
On Monday, 4 May 2009 09:22:46, Andi Kleen wrote:
> This is wrong. There are Atom CPUs which support 64-bit code too.
Fixed.
> > config X86_XADD
> > def_bool y
> > @@ -355,11 +364,11 @@ config X86_ALIGNMENT_16
> >
> > config X86_INTEL_USERCOPY
> > def_bool y
> > - depends on MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M586MMX || X86_GENERIC || MK8 || MK7 || MEFFICEON || MCORE2
> > + depends on MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M586MMX || X86_GENERIC || MK8 || MK7 || MEFFICEON || MCORE2 || MATOM
>
> I don't think that's necessarily a good idea. You would need benchmarks
> showing that intel user copy performs better on Atom than the original one.
> Do you have some?
You're right. I ran some quick userspace benchmarks of
__copy_user[_intel[_nocache]]() and __copy_zeroing[_intel[_nocache]](), and
the generic versions were indeed about 15% faster.
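For anyone who wants to repeat this kind of comparison, here is a minimal sketch of a userspace timing harness. It is not the code I used: copy_generic(), copy_wordwise() and bench_copy() are hypothetical stand-ins, since the kernel routines would first have to be extracted into a userspace build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Common signature for the routines under test (hypothetical). */
typedef void (*copy_fn)(void *dst, const void *src, size_t len);

/* Stand-in for one variant: plain memcpy(). */
static void copy_generic(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);
}

/* Stand-in for another variant: 8 bytes at a time, then a byte tail. */
static void copy_wordwise(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	uint64_t w;

	while (len >= sizeof(w)) {
		memcpy(&w, s, sizeof(w));	/* safe for unaligned buffers */
		memcpy(d, &w, sizeof(w));
		s += sizeof(w);
		d += sizeof(w);
		len -= sizeof(w);
	}
	while (len--)
		*d++ = *s++;
}

/* Volatile sink: makes it harder for the compiler to drop the copies. */
static volatile unsigned char sink;

/* Run fn iters times and return the CPU time used, in seconds. */
static double bench_copy(copy_fn fn, void *dst, const void *src,
			 size_t len, unsigned long iters)
{
	clock_t start, end;
	unsigned long i;

	assert(len > 0);
	start = clock();
	for (i = 0; i < iters; i++) {
		fn(dst, src, len);
		sink ^= ((unsigned char *)dst)[len - 1];
	}
	end = clock();
	return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling bench_copy() once per variant over a range of buffer lengths and dividing the two results gives the relative speed. Numbers obtained this way are only indicative of in-kernel behavior.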
> > config X86_USE_PPRO_CHECKSUM
> > def_bool y
> > - depends on MWINCHIP3D || MWINCHIPC6 || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MK8 || MVIAC3_2 || MEFFICEON || MGEODE_LX || MCORE2
> > + depends on MWINCHIP3D || MWINCHIPC6 || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MK8 || MVIAC3_2 || MEFFICEON || MGEODE_LX || MCORE2 || MATOM
>
> Similar here. Atom is quite different from PPro/K8.
I benchmarked csum_partial() and csum_partial_copy_generic() as well. Here
the PPro version of csum_partial() performed 10-15% better (depending on
buffer length), while both implementations of csum_partial_copy_generic()
performed equally.
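When timing two checksum implementations against each other, it helps to check first that they agree. Below is a minimal sketch of a plain-C reference for that purpose. It is not the kernel code: ref_csum_partial() and ref_csum_fold() are hypothetical names, and the sketch assumes little-endian 16-bit word order. As far as I know, different implementations may return different 32-bit partial sums, so the folded 16-bit results are what should be compared.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical reference: add the buffer as 16-bit little-endian words
 * onto an initial 32-bit partial sum, with end-around carry.
 */
static uint32_t ref_csum_partial(const unsigned char *buf, size_t len,
				 uint32_t sum)
{
	uint64_t acc = sum;
	size_t i;

	assert(buf != NULL || len == 0);
	for (i = 0; i + 1 < len; i += 2)
		acc += (uint32_t)buf[i] | ((uint32_t)buf[i + 1] << 8);
	if (len & 1)
		acc += buf[len - 1];	/* trailing odd byte is the low byte */

	/* Fold 64 bits down to 32; the second pass absorbs the carry. */
	acc = (acc & 0xffffffffu) + (acc >> 32);
	acc = (acc & 0xffffffffu) + (acc >> 32);
	return (uint32_t)acc;
}

/* Fold a 32-bit partial sum to 16 bits and return its complement. */
static uint16_t ref_csum_fold(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}
```

Each implementation under test can then be run on the same buffers and its folded result compared with ref_csum_fold(ref_csum_partial(buf, len, 0)) before any timing is done.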
> > diff --git a/arch/x86/Makefile_32.cpu b/arch/x86/Makefile_32.cpu
> > index 80177ec..07a11b0 100644
> > --- a/arch/x86/Makefile_32.cpu
> > +++ b/arch/x86/Makefile_32.cpu
> > @@ -33,6 +33,7 @@ cflags-$(CONFIG_MCYRIXIII) += $(call cc-option,-march=c3,-march=i486) $(align)-f
> > cflags-$(CONFIG_MVIAC3_2) += $(call cc-option,-march=c3-2,-march=i686)
> > cflags-$(CONFIG_MVIAC7) += -march=i686
> > cflags-$(CONFIG_MCORE2) += -march=i686 $(call tune,core2)
> > +cflags-$(CONFIG_MATOM) += -march=atom $(call tune,atom)
> >
> > # AMD Elan support
> > cflags-$(CONFIG_X86_ELAN) += -march=i486
>
> That needs to be in the 64bit version too.
Fixed as well. Also included changes to call cc-option as recommended by hpa.
> > diff --git a/arch/x86/include/asm/module.h b/arch/x86/include/asm/module.h
> > index 47d6274..e959c4a 100644
> > --- a/arch/x86/include/asm/module.h
> > +++ b/arch/x86/include/asm/module.h
> > @@ -28,6 +28,8 @@ struct mod_arch_specific {};
> > #define MODULE_PROC_FAMILY "586MMX "
> > #elif defined CONFIG_MCORE2
> > #define MODULE_PROC_FAMILY "CORE2 "
> > +#elif defined CONFIG_MATOM
> > +#define MODULE_PROC_FAMILY "ATOM "
>
> This should be obsolete anyways, you can just use CORE2. They have
> compatible ISAs.
So you would recommend writing
#elif defined CONFIG_MCORE2 || defined CONFIG_MATOM
#define MODULE_PROC_FAMILY "CORE2 "
?
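To make the question concrete, here is a minimal standalone sketch of how such a combined branch would resolve. It uses CONFIG_MATOM, the symbol from the quoted hunk. The chain is heavily abbreviated, the CONFIG_M586MMX condition and the empty fallback are assumptions for illustration, and module_proc_family() is a hypothetical helper that does not exist in module.h.

```c
#include <assert.h>
#include <string.h>

/* Assumption for this sketch only: pretend the kernel is configured for Atom. */
#define CONFIG_MATOM 1

/* Abbreviated chain; the real header has many more branches. */
#if defined CONFIG_M586MMX
#define MODULE_PROC_FAMILY "586MMX "
#elif defined CONFIG_MCORE2 || defined CONFIG_MATOM
#define MODULE_PROC_FAMILY "CORE2 "
#else
#define MODULE_PROC_FAMILY ""	/* fallback for the sketch only */
#endif

/* Compile-time check that some non-empty family string was selected. */
static_assert(sizeof(MODULE_PROC_FAMILY) > 1,
	      "a processor family string must be selected");

/* Hypothetical helper so the selected string can be inspected at run time. */
static const char *module_proc_family(void)
{
	return MODULE_PROC_FAMILY;
}
```

With this, a module built for Atom would carry the same family string as one built for Core 2, so the two would not be distinguished by that part of the version magic.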
Regards,
Tobias
View attachment "0001-x86-add-specific-support-for-Intel-Atom-architectur.patch" of type "text/x-patch" (4957 bytes)
Download attachment "signature.asc " of type "application/pgp-signature" (198 bytes)