Message-ID: <CAHSGOuvW1nN3cjEBpcPYjCxtNGgUODwdFXNUzXR+D5raUq5dOA@mail.gmail.com>
Date: Mon, 15 Aug 2011 18:57:38 +0530
From: melwyn lobo <linux.melwyn@...il.com>
To: Borislav Petkov <bp@...en8.de>,
Denys Vlasenko <vda.linux@...glemail.com>,
Ingo Molnar <mingo@...e.hu>,
melwyn lobo <linux.melwyn@...il.com>,
linux-kernel@...r.kernel.org, "H. Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
borislav.petkov@....com
Subject: Re: x86 memcpy performance
Hi,
I was on vacation for the last two days. Thanks for the good insights into
the issue.
Ingo, unfortunately the data we have is from a soon-to-be-released
platform and is strictly confidential at this stage.
Boris, thanks for the patch. Looking at it:
+void *__sse_memcpy(void *to, const void *from, size_t len)
+{
+ unsigned long src = (unsigned long)from;
+ unsigned long dst = (unsigned long)to;
+ void *p = to;
+ int i;
+
+ if (in_interrupt())
+ return __memcpy(to, from, len);
So what is the reason we cannot use sse_memcpy in interrupt context?
(FPU registers not saved?)
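To make the control flow of the quoted hunk concrete, here is a rough user-space sketch of the same guard. It is not kernel code and uses no SSE registers: sim_in_interrupt is a hypothetical stand-in for in_interrupt(), plain_copy() stands in for __memcpy(), and wide_copy() stands in for the SSE loop. All of these names are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's in_interrupt(); set by the caller. */
static bool sim_in_interrupt;

/* Counts how often the wide path ran, so the dispatch can be observed. */
static int wide_copies;

/* Byte-by-byte copy standing in for __memcpy(). */
static void *plain_copy(void *to, const void *from, size_t len)
{
    unsigned char *d = to;
    const unsigned char *s = from;

    while (len--)
        *d++ = *s++;
    return to;
}

/* 16-byte block copy standing in for the SSE loop (no SSE registers used). */
static void *wide_copy(void *to, const void *from, size_t len)
{
    unsigned char *d = to;
    const unsigned char *s = from;

    wide_copies++;
    for (; len >= 16; len -= 16, d += 16, s += 16)
        memcpy(d, s, 16);
    plain_copy(d, s, len); /* tail shorter than one block */
    return to;
}

/* Same shape as the guard in the patch: fall back when "in interrupt". */
static void *guarded_copy(void *to, const void *from, size_t len)
{
    if (sim_in_interrupt)
        return plain_copy(to, from, len);
    return wide_copy(to, from, len);
}

/* Small self-check: both paths must produce an identical copy. */
static void guarded_copy_demo(void)
{
    char src[40] = "x86 memcpy performance";
    char dst[40] = {0};

    sim_in_interrupt = false;
    guarded_copy(dst, src, sizeof(src));
    assert(memcmp(dst, src, sizeof(src)) == 0);

    memset(dst, 0, sizeof(dst));
    sim_in_interrupt = true;
    guarded_copy(dst, src, sizeof(src));
    assert(memcmp(dst, src, sizeof(src)) == 0);
}
```

The sketch only shows that callers see the same result on either path; it says nothing about why the wide path has to be avoided in interrupt context, which is the open question here.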
My question is still not answered. There are 3 versions of memcpy in the kernel:
***********************************arch/x86/include/asm/string_32.h******************************
#ifndef CONFIG_KMEMCHECK

#if (__GNUC__ >= 4)
#define memcpy(t, f, n) __builtin_memcpy(t, f, n)
#else
#define memcpy(t, f, n)                         \
        (__builtin_constant_p((n))              \
         ? __constant_memcpy((t), (f), (n))     \
         : __memcpy((t), (f), (n)))
#endif
#else
/*
 * kmemcheck becomes very happy if we use the REP instructions unconditionally,
 * because it means that we know both memory operands in advance.
 */
#define memcpy(t, f, n) __memcpy((t), (f), (n))
#endif
****************************************************************************************.
I will ignore CONFIG_X86_USE_3DNOW (including mmx_memcpy()), as it is
valid only for AMD and not for the Atom Z5xx series.
That leaves __memcpy, __constant_memcpy and __builtin_memcpy.
My hunch is that by default we are using __builtin_memcpy, because
my GCC version is >= 4 and CONFIG_KMEMCHECK is not defined.
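The reasoning can be written down as a small model of the #if chain in the header excerpt. This is only a sketch: memcpy_target() and memcpy_target_here() are hypothetical helpers, not kernel functions, and they model nothing beyond the three conditions visible in the excerpt (CONFIG_KMEMCHECK, the GCC major version, and whether the length is a compile-time constant).

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper, not kernel code: returns the name of the routine
 * that the memcpy() macro in the excerpt would expand to.
 */
static const char *memcpy_target(int gcc_major, int kmemcheck,
                                 int len_is_constant)
{
    if (kmemcheck)              /* the #else branch of #ifndef CONFIG_KMEMCHECK */
        return "__memcpy";
    if (gcc_major >= 4)         /* #if (__GNUC__ >= 4) */
        return "__builtin_memcpy";
    /* Older compilers: pick by __builtin_constant_p(n). */
    return len_is_constant ? "__constant_memcpy" : "__memcpy";
}

/* Same decision, driven by the macros of the compiler building this file. */
static const char *memcpy_target_here(void)
{
#ifdef CONFIG_KMEMCHECK
    return memcpy_target(0, 1, 0);
#elif defined(__GNUC__)
    return memcpy_target(__GNUC__, 0, 0);
#else
    return memcpy_target(0, 0, 0);
#endif
}

/* Self-check covering each branch of the excerpt. */
static void memcpy_target_demo(void)
{
    assert(strcmp(memcpy_target(4, 0, 0), "__builtin_memcpy") == 0);
    assert(strcmp(memcpy_target(3, 0, 1), "__constant_memcpy") == 0);
    assert(strcmp(memcpy_target(3, 0, 0), "__memcpy") == 0);
    assert(strcmp(memcpy_target(4, 1, 0), "__memcpy") == 0);
}
```

Under this model, GCC >= 4 without CONFIG_KMEMCHECK always lands on __builtin_memcpy, whatever the length argument is. What the compiler then emits for the builtin is a separate matter that the model does not cover.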
Can someone confirm which of these three is used with i386_defconfig?
Also, with i386_defconfig, which workloads give the best results
with the default implementation?
thanks,
M.