lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 20 Dec 2010 14:30:47 +0800
From:	Miao Xie <miaox@...fujitsu.com>
To:	Hitoshi Mitake <mitake@....info.waseda.ac.jp>
CC:	Ingo Molnar <mingo@...e.hu>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	linux-kernel@...r.kernel.org, h.mitake@...il.com,
	Ma@....info.waseda.ac.jp,
	"\"Ling@....info.waseda.ac.jp\":" <ling.ma@...el.com>,
	Zhao Yakui <yakui.zhao@...el.com>,
	Arnaldo Carvalho de Melo <acme@...hat.com>,
	Paul Mackerras <paulus@...ba.org>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH 1/2] perf bench: port memcpy_64.S to perf bench

On Sun, 19 Dec 2010 01:25:00 +0900, Hitoshi Mitake wrote:
> On 2010年10月31日 04:21, Ingo Molnar wrote:
>>
>> * Peter Zijlstra<a.p.zijlstra@...llo.nl> wrote:
>>
>>> On Sat, 2010-10-30 at 01:01 +0900, Hitoshi Mitake wrote:
>>>> This patch ports arch/x86/lib/memcpy_64.S to "perf bench mem".
>>>> When PERF_BENCH is defined at preprocessor level,
>>>> memcpy_64.S is preprocessed to includable form from the sources
>>>> under tools/perf for benchmarking programs.
>>>>
>>>> Signed-off-by: Hitoshi Mitake<mitake@....info.waseda.ac.jp>
>>>> Cc: Ma Ling:<ling.ma@...el.com>
>>>> Cc: Zhao Yakui<yakui.zhao@...el.com>
>>>> Cc: Peter Zijlstra<a.p.zijlstra@...llo.nl>
>>>> Cc: Arnaldo Carvalho de Melo<acme@...hat.com>
>>>> Cc: Paul Mackerras<paulus@...ba.org>
>>>> Cc: Frederic Weisbecker<fweisbec@...il.com>
>>>> Cc: Steven Rostedt<rostedt@...dmis.org>
>>>> Cc: Thomas Gleixner<tglx@...utronix.de>
>>>> Cc: H. Peter Anvin<hpa@...or.com>
>>>> ---
>>>> arch/x86/lib/memcpy_64.S | 30 ++++++++++++++++++++++++++++++
>>>> 1 files changed, 30 insertions(+), 0 deletions(-)
>>>>
>>>> diff --git a/arch/x86/lib/memcpy_64.S b/arch/x86/lib/memcpy_64.S
>>>> index 75ef61e..72c6dfe 100644
>>>> --- a/arch/x86/lib/memcpy_64.S
>>>> +++ b/arch/x86/lib/memcpy_64.S
>>>> @@ -1,10 +1,23 @@
>>>> /* Copyright 2002 Andi Kleen */
>>>>
>>>> +/*
>>>> + * perf bench adoption by Hitoshi Mitake
>>>> + * PERF_BENCH means that this file is included from
>>>> + * the source files under tools/perf/ for benchmark programs.
>>>> + *
>>>> + * You don't have to care about PERF_BENCH when
>>>> + * you are working on the kernel.
>>>> + */
>>>> +
>>>> +#ifndef PERF_BENCH
>>>
>>> I don't like littering the actual kernel code with tools/perf/
>>> ifdeffery..
>>
>>
>> Yeah - could we somehow accept that file into a perf build as-is?
>>
>> Thanks,
>>
>> Ingo
>>
>
> Really sorry for my slow work...
>
> BTW, I have a question for Miao and Ingo.
> We are planning to implement new memcpy() of Miao,
> and the important point is not removing previous memcpy()
> for future architectures and benchmarkings.
>
> I feel that adding new CPU feature flag (like X86_FEATURE_REP_GOOD)
> and switching memcpy() with alternative mechanism is good way.
> (So we will have three memcpy()s: rep based, unrolled, and new
> unaligned oriented one)
> But there is another way: #ifdef. Which do you prefer?

I agree with your idea, but Ma Ling said this way may cause the i-cache
miss problem.
   http://marc.info/?l=linux-kernel&m=128746120107953&w=2
(The size of the i-cache is 32K, the size of memcpy() in my patch is 560Byte,
and the size of the last version in tip tree is 400Byte).

But I have not tested it, so I don't know the real result. Maybe we should
try to implement the new memcpy() first.

> And could you tell me the detail of CPU family information
> you are targeting, Miao?

They are  Core2 Duo E7300(Core name: Wolfdale) and Xeon X5260(Core name: Wolfdale-DP).

The following is the detailed information of these two CPU:
Core2 Duo E7300:
vendor_id	: GenuineIntel
cpu family	: 6
model		: 23
model name	: Intel(R) Core(TM)2 Duo CPU     E7300  @ 2.66GHz
stepping	: 6
cpu MHz		: 1603.000
cache size	: 3072 KB
physical id	: 0
siblings	: 2
core id		: 1
cpu cores	: 2
apicid		: 1
initial apicid	: 1
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm sse4_1 lahf_lm dts
bogomips	: 5319.70
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

Xeon X5260:
vendor_id	: GenuineIntel
cpu family	: 6
model		: 23
model name	: Intel(R) Xeon(R) CPU           X5260  @ 3.33GHz
stepping	: 6
cpu MHz		: 1999.000
cache size	: 6144 KB
physical id	: 3
siblings	: 2
core id		: 1
cpu cores	: 2
apicid		: 7
initial apicid	: 7
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall lm constant_tsc arch_perfmon pebs bts rep_good aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 lahf_lm dts tpr_shadow vnmi flexpriority
bogomips	: 6649.07
clflush size	: 64
cache_alignment	: 64
address sizes	: 38 bits physical, 48 bits virtual
power management:

Thanks
Miao
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists