lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Sat, 8 Dec 2012 04:13:59 -0800 (PST)
From:	John <da_audiophile@...oo.com>
To:	Borislav Petkov <bp@...en8.de>
Cc:	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: [PATCH] Expand CPU compiler options

Please cc me on replies as I am not a regular subscriber to lkml.  Thank you.



I tested the attached patch written by André Ramnitz using three different machines running a generic x86-64 kernel and an otherwise identical kernel running with the optimized gcc options. 

Conclusion: There are small but real speed increases using a make endpoint to running with this patch.

Details:
1) Three test machines: Intel Xeon X3360, Intel i7-2620M, Intel Core i7-3660K.

2) All ran the make benchmark (linked below) 35 times while booted into a 'generic' kernel. Then all ran the same make benchmark 35 times after booting into an optimized kernel. Below are the optimizations chosen for each machine.
2a) X3360 = core2
2b) i7-2620M = corei7-avx
2c) i7-3660K = core-avx-i
3) Analyzed resulting distributions for statistical significance via ANOVA plots that clearly show statistically significant albeit small differences.

Links to ANOVA plots:
http://s19.postimage.org/68urcofzn/corei7_avx.png
http://s19.postimage.org/ozwomuak3/core_avx_i.png

http://s19.postimage.org/d0l6fj4z7/core2.png


References:

Bash script that controls the benchmark: https://github.com/graysky2/bin/blob/master/bench
Log file generated by script: http://repo-ck.com/bench/compile_time_optimization.txt.gz

---
--- linux-3.6/arch/x86/include/asm/module.h2012-04-29 18:43:58.239336240 -0700
+++ linux-3.6.mod/arch/x86/include/asm/module.h2012-04-29 18:43:41.609545221 -0700
@@ -17,6 +17,14 @@
 #define MODULE_PROC_FAMILY "586MMX "
 #elif defined CONFIG_MCORE2
 #define MODULE_PROC_FAMILY "CORE2 "
+#elif defined CONFIG_MCOREI7
+#define MODULE_PROC_FAMILY "COREI7 "
+#elif defined CONFIG_MCOREI7AVX
+#define MODULE_PROC_FAMILY "COREI7AVX "
+#elif defined CONFIG_MCOREAVXI
+#define MODULE_PROC_FAMILY "COREAVXI "
+#elif defined CONFIG_MCOREAVX2
+#define MODULE_PROC_FAMILY "COREAVX2 "
 #elif defined CONFIG_MATOM
 #define MODULE_PROC_FAMILY "ATOM "
 #elif defined CONFIG_M686
@@ -35,6 +43,14 @@
 #define MODULE_PROC_FAMILY "K7 "
 #elif defined CONFIG_MK8
 #define MODULE_PROC_FAMILY "K8 "
+#elif defined CONFIG_MBARCELONA
+#define MODULE_PROC_FAMILY "BARCELONA "
+#elif defined CONFIG_MBOBCAT
+#define MODULE_PROC_FAMILY "BOBCAT "
+#elif defined CONFIG_MBULLDOZER
+#define MODULE_PROC_FAMILY "BULLDOZER "
+#elif defined CONFIG_MPILEDRIVER
+#define MODULE_PROC_FAMILY "PILEDRIVER "
 #elif defined CONFIG_MELAN
 #define MODULE_PROC_FAMILY "ELAN "
 #elif defined CONFIG_MCRUSOE
--- linux-3.6/arch/x86/Kconfig.cpu2012-04-29 18:43:58.249336198 -0700
+++ linux-3.6.mod/arch/x86/Kconfig.cpu2012-04-29 18:40:46.091751798 -0700
@@ -147,7 +147,7 @@
 
 
 config MK6
-bool "K6/K6-II/K6-III"
+bool "AMD K6/K6-II/K6-III"
 depends on X86_32
 ---help---
   Select this for an AMD K6-family processor.  Enables use of
@@ -155,7 +155,7 @@
   flags to GCC.
 
 config MK7
-bool "Athlon/Duron/K7"
+bool "AMD Athlon/Duron/K7"
 depends on X86_32
 ---help---
   Select this for an AMD Athlon K7-family processor.  Enables use of
@@ -163,12 +163,40 @@
   flags to GCC.
 
 config MK8
-bool "Opteron/Athlon64/Hammer/K8"
+bool "AMD Opteron/Athlon64/Hammer/K8"
 ---help---
   Select this for an AMD Opteron or Athlon64 Hammer-family processor.
   Enables use of some extended instructions, and passes appropriate
   optimization flags to GCC.
 
+config MBARCELONA
+bool "AMD Barcelona"
+---help---
+  Select this for AMD Barcelona and newer processors.
+
+  Enables -march=barcelona
+
+config MBOBCAT
+bool "AMD Bobcat"
+---help---
+  Select this for AMD Bobcat processors.
+
+  Enables -march=btver1
+
+config MBULLDOZER
+bool "AMD Bulldozer"
+---help---
+  Select this for AMD Bulldozer processors.
+
+  Enables -march=bdver1
+
+config MPILEDRIVER
+bool "AMD Piledriver"
+---help---
+  Select this for AMD Piledriver processors.
+
+  Enables -march=bdver2
+
 config MCRUSOE
 bool "Crusoe"
 depends on X86_32
@@ -260,7 +288,7 @@
   in /proc/cpuinfo. Family 15 is an older Xeon, Family 6 a newer one.
 
 config MCORE2
-bool "Core 2/newer Xeon"
+bool "Intel Core 2"
 ---help---
 
   Select this for Intel Core 2 and newer Core 2 Xeons (Xeon 51xx and
@@ -268,6 +296,41 @@
   family in /proc/cpuinfo. Newer ones have 6 and older ones 15
   (not a typo)
 
+  Enables -march=core2
+
+config MCOREI7
+bool "Intel Core i7"
+---help---
+
+  Select this for the Intel Nehalem platform. Intel Nehalem proecessors
+  include Core i3, i5, i7, Xeon: 34xx, 35xx, 55xx, 56xx, 75xx processors.
+
+  Enables -march=corei7
+
+config MCOREI7AVX
+bool "Intel Core 2nd Gen AVX"
+---help---
+
+  Select this for 2nd Gen Core processors including Sandy Bridge.
+
+  Enables -march=corei7-avx
+
+config MCOREAVXI
+bool "Intel Core 3rd Gen AVX"
+---help---
+
+  Select this for 3rd Gen Core processors including Ivy Bridge.
+
+  Enables -march=core-avx-i
+
+config MCOREAVX2
+bool "Intel Core AVX-2"
+---help---
+
+  Select this for AVX-2 enabled processors including Haswell.
+
+  Enables -march=corei7-avx-2
+
 config MATOM
 bool "Intel Atom"
 ---help---
@@ -312,7 +375,7 @@
 config X86_L1_CACHE_SHIFT
 int
 default "7" if MPENTIUM4 || MPSC
-default "6" if MK7 || MK8 || MPENTIUMM || MCORE2 || MATOM || MVIAC7 || X86_GENERIC || GENERIC_CPU
+default "6" if MK7 || MK8 || MBARCELONA || MBOBCAT || MBULLDOZER || MPILEDRIVER || MPENTIUMM || MCORE2 || MCOREI7 || MCOREI7AVX || MCOREAVXI || MCOREAVX2 || MATOM || MVIAC7 || X86_GENERIC || GENERIC_CPU
 default "4" if MELAN || M486 || M386 || MGEODEGX1
 default "5" if MWINCHIP3D || MWINCHIPC6 || MCRUSOE || MEFFICEON || MCYRIXIII || MK6 || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || M586 || MVIAC3_2 || MGEODE_LX
 
@@ -363,11 +426,11 @@
 
 config X86_INTEL_USERCOPY
 def_bool y
-depends on MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M586MMX || X86_GENERIC || MK8 || MK7 || MEFFICEON || MCORE2
+depends on MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M586MMX || X86_GENERIC || MK8 || MK7 || MBARCELONA || MEFFICEON || MCORE2 || MCOREI7 || MCOREI7AVX || MCOREAVXI || MCOREAVX2
 
 config X86_USE_PPRO_CHECKSUM
 def_bool y
-depends on MWINCHIP3D || MWINCHIPC6 || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MK8 || MVIAC3_2 || MVIAC7 || MEFFICEON || MGEODE_LX || MCORE2 || MATOM
+depends on MWINCHIP3D || MWINCHIPC6 || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MK8 || MVIAC3_2 || MVIAC7 || MEFFICEON || MGEODE_LX || MCORE2 || MCOREI7 || MCOREI7AVX || MCOREAVXI || MCOREAVX2 || MATOM
 
 config X86_USE_3DNOW
 def_bool y
@@ -395,17 +458,17 @@
 
 config X86_TSC
 def_bool y
-depends on ((MWINCHIP3D || MCRUSOE || MEFFICEON || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || MK8 || MVIAC3_2 || MVIAC7 || MGEODEGX1 || MGEODE_LX || MCORE2 || MATOM) && !X86_NUMAQ) || X86_64
+depends on ((MWINCHIP3D || MCRUSOE || MEFFICEON || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || MK8 || MBARCELONA || MBOBCAT || MBULLDOZER || MPILEDRIVER || MVIAC3_2 || MVIAC7 || MGEODEGX1 || MGEODE_LX || MCORE2 || MCOREI7 || MCOREI7-AVX || MATOM) && !X86_NUMAQ) || X86_64
 
 config X86_CMPXCHG64
 def_bool y
-depends on X86_PAE || X86_64 || MCORE2 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MATOM
+depends on X86_PAE || X86_64 || MCORE2 || MCOREI7 || MCOREI7AVX || MCOREAVXI || MCOREAVX2 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MATOM
 
 # this should be set for all -march=.. options where the compiler
 # generates cmov.
 config X86_CMOV
 def_bool y
-depends on (MK8 || MK7 || MCORE2 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MVIAC3_2 || MVIAC7 || MCRUSOE || MEFFICEON || X86_64 || MATOM || MGEODE_LX)
+depends on (MK8 || MBARCELONA || MBOBCAT || MBULLDOZER || MPILEDRIVER || MK7 || MCORE2 || MCOREI7 || MCOREI7AVX || MCOREAVXI || MCOREAVX2 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MVIAC3_2 || MVIAC7 || MCRUSOE || MEFFICEON || X86_64 || MATOM || MGEODE_LX)
 
 config X86_MINIMUM_CPU_FAMILY
 int
--- linux-3.6/arch/x86/Makefile2012-04-29 18:43:58.249336198 -0700
+++ linux-3.6.mod/arch/x86/Makefile2012-04-29 18:41:51.260932642 -0700
@@ -51,10 +51,22 @@
 
         # FIXME - should be integrated in Makefile.cpu (Makefile_32.cpu)
         cflags-$(CONFIG_MK8) += $(call cc-option,-march=k8)
+        cflags-$(CONFIG_MBARCELONA) += $(call cc-option,-march=barcelona)
+        cflags-$(CONFIG_MBOBCAT) += $(call cc-option,-march=btver1)
+        cflags-$(CONFIG_MBULLDOZER) += $(call cc-option,-march=bdver1)
+        cflags-$(CONFIG_MPILEDRIVER) += $(call cc-option,-march=bdver2)
         cflags-$(CONFIG_MPSC) += $(call cc-option,-march=nocona)
 
         cflags-$(CONFIG_MCORE2) += \
-                $(call cc-option,-march=core2,$(call cc-option,-mtune=generic))
+                $(call cc-option,-march=core2,$(call cc-option,-mtune=core2))
+        cflags-$(CONFIG_MCOREI7) += \
+                $(call cc-option,-march=corei7,$(call cc-option,-mtune=corei7))
+        cflags-$(CONFIG_MCOREI7AVX) += \
+                $(call cc-option,-march=corei7-avx,$(call cc-option,-mtune=corei7-avx))
+        cflags-$(CONFIG_MCOREAVXI) += \
+                $(call cc-option,-march=core-avx-i,$(call cc-option,-mtune=core-avx-i))
+        cflags-$(CONFIG_MCOREAVX2) += \
+                $(call cc-option,-march=core-avx2,$(call cc-option,-mtune=core-avx2))
 cflags-$(CONFIG_MATOM) += $(call cc-option,-march=atom) \
 $(call cc-option,-mtune=atom,$(call cc-option,-mtune=generic))
         cflags-$(CONFIG_GENERIC_CPU) += $(call cc-option,-mtune=generic)
--- linux-3.6/arch/x86/Makefile_32.cpu2012-04-29 18:43:58.269335977 -0700
+++ linux-3.6.mod/arch/x86/Makefile_32.cpu2012-04-29 18:42:39.120330975 -0700
@@ -25,6 +25,10 @@
 # They make zero difference whatsosever to performance at this time.
 cflags-$(CONFIG_MK7)+= -march=athlon
 cflags-$(CONFIG_MK8)+= $(call cc-option,-march=k8,-march=athlon)
+cflags-$(CONFIG_MBARCELONA)+= $(call cc-option,-march=barcelona,-march=athlon)
+cflags-$(CONFIG_MBOBCAT)+= $(call cc-option,-march=btver1,-march=athlon)
+cflags-$(CONFIG_MBULLDOZER)+= $(call cc-option,-march=bdver1,-march=athlon)
+cflags-$(CONFIG_MPILEDRIVER)+= $(call cc-option,-march=bdver2,-march=athlon)
 cflags-$(CONFIG_MCRUSOE)+= -march=i686 $(align)-functions=0 $(align)-jumps=0 $(align)-loops=0
 cflags-$(CONFIG_MEFFICEON)+= -march=i686 $(call tune,pentium3) $(align)-functions=0 $(align)-jumps=0 $(align)-loops=0
 cflags-$(CONFIG_MWINCHIPC6)+= $(call cc-option,-march=winchip-c6,-march=i586)
@@ -33,6 +37,10 @@
 cflags-$(CONFIG_MVIAC3_2)+= $(call cc-option,-march=c3-2,-march=i686)
 cflags-$(CONFIG_MVIAC7)+= -march=i686
 cflags-$(CONFIG_MCORE2)+= -march=i686 $(call tune,core2)
+cflags-$(CONFIG_MCOREI7)+= -march=i686 $(call tune,corei7)
+cflags-$(CONFIG_MCOREI7AVX)+= -march=i686 $(call tune,corei7-avx)
+cflags-$(CONFIG_MCOREAVXI)+= -march=i686 $(call tune,core-avx-i)
+cflags-$(CONFIG_MCOREAVX2)+= -march=i686 $(call tune,core-avx-2)
 cflags-$(CONFIG_MATOM)+= $(call cc-option,-march=atom,$(call cc-option,-march=core2,-march=i686)) \
 $(call cc-option,-mtune=atom,$(call cc-option,-mtune=generic))
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ