lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHSGOuvW1nN3cjEBpcPYjCxtNGgUODwdFXNUzXR+D5raUq5dOA@mail.gmail.com>
Date:	Mon, 15 Aug 2011 18:57:38 +0530
From:	melwyn lobo <linux.melwyn@...il.com>
To:	Borislav Petkov <bp@...en8.de>,
	Denys Vlasenko <vda.linux@...glemail.com>,
	Ingo Molnar <mingo@...e.hu>,
	melwyn lobo <linux.melwyn@...il.com>,
	linux-kernel@...r.kernel.org, "H. Peter Anvin" <hpa@...or.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	borislav.petkov@....com
Subject: Re: x86 memcpy performance

Hi,
Was on a vacation for last two days. Thanks for the good insights into
the issue.
Ingo, unfortunately the data we have is on a soon to be released
platform and strictly confidential at this stage.

Boris, thanks for the patch. On seeing your patch:
+void *__sse_memcpy(void *to, const void *from, size_t len)
+{
+       unsigned long src = (unsigned long)from;
+       unsigned long dst = (unsigned long)to;
+       void *p = to;
+       int i;
+
+       if (in_interrupt())
+               return __memcpy(to, from, len)
So what is the reason we cannot use sse_memcpy in interrupt context.
(fpu registers not saved ? )
My question is still not answered. There are 3 versions of memcpy in kernel:

***********************************arch/x86/include/asm/string_32.h******************************
179 #ifndef CONFIG_KMEMCHECK
180
181 #if (__GNUC__ >= 4)
182 #define memcpy(t, f, n) __builtin_memcpy(t, f, n)
183 #else
184 #define memcpy(t, f, n)                         \
185         (__builtin_constant_p((n))              \
186          ? __constant_memcpy((t), (f), (n))     \
187          : __memcpy((t), (f), (n)))
188 #endif
189 #else
190 /*
191  * kmemcheck becomes very happy if we use the REP instructions
unconditionally,
192  * because it means that we know both memory operands in advance.
193  */
194 #define memcpy(t, f, n) __memcpy((t), (f), (n))
195 #endif
196
197
****************************************************************************************.
I will ignore CONFIG_X86_USE_3DNOW (including mmx_memcpy() ) as this
is valid only for AMD and not for Atom Z5xx series.
This means __memcpy, __constant_memcpy, __builtin_memcpy .
I have a hunch by default we were using  __builtin_memcpy. This  is
because I see my GCC version >=4 and CONFIG_KMEMCHECK not defined.
Can someone confirm of these 3 which is used, with i386_defconfig.
Again with i386_defconfig which workloads provide the best results
with the default implementation.

thanks,
M.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ