Message-ID: <4AFD1326.506@zytor.com>
Date: Fri, 13 Nov 2009 00:04:54 -0800
From: "H. Peter Anvin" <hpa@...or.com>
To: Ingo Molnar <mingo@...e.hu>
CC: Pavel Machek <pavel@....cz>, "Ma, Ling" <ling.ma@...el.com>,
Ingo Molnar <mingo@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RFC] [X86] performance improvement for memcpy_64.S by fast string.
On 11/12/2009 11:33 PM, Ingo Molnar wrote:
>
> * Pavel Machek <pavel@....cz> wrote:
>
>>> Ling, if you are interested, could you send a user-space test-app to
>>> this thread that everyone could just compile and run on various older
>>> boxes, to gather a performance profile of hand-coded versus string ops
>>> performance?
>>>
>>> ( And i think we can make a judgement based on cache-hot performance
>>> alone - if then the strings ops will perform comparatively better in
>>> cache-cold scenarios, so the cache-hot numbers would be a conservative
>>> estimate. )
>>
>> Ugh, really? I'd expect cache-cold performance to be not helped at all
>> (memory bandwidth limit) and you'll get slow down from additional
>> i-cache misses...
>
> That's my point - the new code is shorter, which will run comparatively
> faster in a cache-cold environment.
>
memcpy_c by itself is by far the shortest variant, of course.

The question is whether it makes sense to use the long variants for short
(< 1024 bytes) copies.
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
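
The user-space comparison requested earlier in the thread, and the question
of whether the long variants pay off below 1024 bytes, could be explored with
something like the sketch below. It is not the test program from this thread
and not the kernel's code: copy_string_op, copy_unrolled and ns_per_copy are
hypothetical names, the string-op variant only loosely follows the
rep movsq / rep movsb idea, and the unrolled loop merely stands in for a
hand-coded variant. It assumes GCC or Clang; on targets other than x86-64 the
string-op path falls back to memcpy. The timing loop keeps both buffers
cache-hot, so it says nothing about the cache-cold case.

```c
/* Illustrative sketch only: not the code from the patch under discussion. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Short variant: let the string instructions do the whole copy. */
static void *copy_string_op(void *dst, const void *src, size_t n)
{
#if defined(__x86_64__) && defined(__GNUC__)
    void *d = dst;
    const void *s = src;
    size_t qwords = n >> 3, tail = n & 7;

    /* Assumes the direction flag is clear, as the x86-64 ABI guarantees. */
    __asm__ volatile("rep movsq" : "+D"(d), "+S"(s), "+c"(qwords) : : "memory");
    __asm__ volatile("rep movsb" : "+D"(d), "+S"(s), "+c"(tail) : : "memory");
    return dst;
#else
    return memcpy(dst, src, n); /* fallback: no string ops on this target */
#endif
}

/* Long variant stand-in: 32 bytes per iteration, then a byte-wise tail. */
static void *copy_unrolled(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n >= 32) {
        uint64_t a, b, c, e;
        /* Fixed-size memcpy is the portable way to do unaligned loads. */
        memcpy(&a, s, 8);      memcpy(&b, s + 8, 8);
        memcpy(&c, s + 16, 8); memcpy(&e, s + 24, 8);
        memcpy(d, &a, 8);      memcpy(d + 8, &b, 8);
        memcpy(d + 16, &c, 8); memcpy(d + 24, &e, 8);
        d += 32; s += 32; n -= 32;
    }
    while (n--)
        *d++ = *s++;
    return dst;
}

static volatile unsigned char sink; /* keeps the copies from being elided */

/* Average nanoseconds per copy of n bytes (cache-hot), or -1.0 on failure. */
static double ns_per_copy(void *(*fn)(void *, const void *, size_t),
                          size_t n, long iters)
{
    assert(n > 0 && iters > 0);
    unsigned char *src = malloc(n), *dst = malloc(n);
    if (!src || !dst) {
        free(src);
        free(dst);
        return -1.0;
    }
    memset(src, 0xA5, n);

    clock_t t0 = clock();
    for (long i = 0; i < iters; i++) {
        fn(dst, src, n);
        sink = dst[n - 1];
    }
    clock_t t1 = clock();

    free(src);
    free(dst);
    return (double)(t1 - t0) / CLOCKS_PER_SEC * 1e9 / (double)iters;
}

#ifdef COPY_BENCH_MAIN
int main(void)
{
    static const size_t sizes[] = { 16, 64, 256, 1024, 4096 };

    for (size_t i = 0; i < sizeof sizes / sizeof sizes[0]; i++)
        printf("%5zu bytes: string-op %.1f ns, unrolled %.1f ns\n", sizes[i],
               ns_per_copy(copy_string_op, sizes[i], 1000000L),
               ns_per_copy(copy_unrolled, sizes[i], 1000000L));
    return 0;
}
#endif
```

Building with something like "cc -O2 -DCOPY_BENCH_MAIN bench.c" (the file
name is arbitrary) gives a per-size table. An optimizing compiler may turn
the unrolled loop into something quite different from what is written, so
the generated assembly is worth checking before trusting the numbers.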