Message-ID: <493994B35A117E4F832F97C4719C4C04011505C846@orsmsx505.amr.corp.intel.com>
Date: Wed, 18 May 2011 12:04:14 -0700
From: "Yu, Fenghua" <fenghua.yu@...el.com>
To: Ingo Molnar <mingo@...e.hu>
CC: Thomas Gleixner <tglx@...utronix.de>,
H Peter Anvin <hpa@...or.com>,
"Mallick, Asit K" <asit.k.mallick@...el.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Avi Kivity <avi@...hat.com>,
Arjan van de Ven <arjan@...radead.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Andi Kleen <andi@...stfloor.org>,
linux-kernel <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH 7/9] x86/lib/memcpy_64.S: Optimize memcpy by enhanced
REP MOVSB/STOSB
> -----Original Message-----
> From: Ingo Molnar [mailto:mingo@...e.hu]
> Sent: Tuesday, May 17, 2011 11:36 PM
> To: Yu, Fenghua
> Cc: Thomas Gleixner; H Peter Anvin; Mallick, Asit K; Linus Torvalds;
> Avi Kivity; Arjan van de Ven; Andrew Morton; Andi Kleen; linux-kernel
> Subject: Re: [PATCH 7/9] x86/lib/memcpy_64.S: Optimize memcpy by
> enhanced REP MOVSB/STOSB
>
>
> * Fenghua Yu <fenghua.yu@...el.com> wrote:
>
> > From: Fenghua Yu <fenghua.yu@...el.com>
> >
> > Support memcpy() with enhanced rep movsb. On processors supporting
> > enhanced rep movsb, the alternative memcpy() function using enhanced
> > rep movsb overrides the original function and the fast string function.
> >
> > Signed-off-by: Fenghua Yu <fenghua.yu@...el.com>
> > ---
> >  arch/x86/lib/memcpy_64.S |   45 ++++++++++++++++++++++++++++++++-------------
> >  1 files changed, 32 insertions(+), 13 deletions(-)
>
> > ENDPROC(__memcpy)
> >
> > /*
> > - * Some CPUs run faster using the string copy instructions.
> > - * It is also a lot simpler. Use this when possible:
> > - */
> > -
> > - .section .altinstructions, "a"
> > - .align 8
> > - .quad memcpy
> > - .quad .Lmemcpy_c
> > - .word X86_FEATURE_REP_GOOD
> > -
> > - /*
> > + * Some CPUs are adding enhanced REP MOVSB/STOSB feature
> > + * If the feature is supported, memcpy_c_e() is the first choice.
> > + * If enhanced rep movsb copy is not available, use fast string copy
> > + * memcpy_c() when possible. This is faster and code is simpler than
> > + * original memcpy().
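For reference, the selection order described in the comment above is
conceptually the following. This is only a minimal user-space C sketch
with illustrative stand-in functions, not the kernel symbols; in the
kernel the choice is made once, at alternatives-patching time via
.altinstructions, rather than by a per-call branch, and the rep_good
helper below is a hypothetical stand-in for X86_FEATURE_REP_GOOD, which
the kernel derives from CPU model knowledge rather than a single CPUID
bit:

/*
 * Illustrative sketch of the memcpy variant selection.  The stand-in
 * variants below just defer to libc memcpy(); in memcpy_64.S they are
 * the enhanced rep movsb copy, the rep movsq fast-string copy and the
 * original unrolled copy.
 */
#include <stdio.h>
#include <string.h>
#include <cpuid.h>

static int cpu_has_erms(void)
{
	unsigned int eax, ebx, ecx, edx;

	/* ERMS (enhanced REP MOVSB/STOSB) is CPUID.(EAX=07H,ECX=0):EBX bit 9 */
	if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
		return 0;
	return (ebx >> 9) & 1;
}

/* Hypothetical stand-in for X86_FEATURE_REP_GOOD. */
static int cpu_has_rep_good(void) { return 1; }

static void *memcpy_c_e(void *d, const void *s, size_t n)  { return memcpy(d, s, n); }
static void *memcpy_c(void *d, const void *s, size_t n)    { return memcpy(d, s, n); }
static void *memcpy_orig(void *d, const void *s, size_t n) { return memcpy(d, s, n); }

static void *memcpy_selected(void *d, const void *s, size_t n)
{
	if (cpu_has_erms())		/* first choice: enhanced rep movsb */
		return memcpy_c_e(d, s, n);
	if (cpu_has_rep_good())		/* next: fast string copy */
		return memcpy_c(d, s, n);
	return memcpy_orig(d, s, n);	/* fallback: original memcpy */
}

int main(void)
{
	char src[32] = "erms selection sketch";
	char dst[32];

	memcpy_selected(dst, src, sizeof(src));
	printf("ERMS: %s, copied: \"%s\"\n",
	       cpu_has_erms() ? "yes" : "no", dst);
	return 0;
}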
>
> Please use more obvious names than cryptic and meaningless _c and _c_e
> postfixes. We do not repeat these many times.
>
> Also, did you know about the 'perf bench mem memcpy' tool prototype we
> have in the kernel tree? It is intended to check and evaluate exactly
> the patches you are offering here. The code lives in:
>
> tools/perf/bench/mem-memcpy-arch.h
> tools/perf/bench/mem-memcpy.c
> tools/perf/bench/mem-memcpy-x86-64-asm-def.h
> tools/perf/bench/mem-memcpy-x86-64-asm.S
>
> Please look into testing (fixing if needed), using and extending it:
>
> - We want to measure the alternative variants as well, not just the
>   generic one
>
> - We want to measure memmove, memclear, etc. operations as well, not
>   just memcpy
>
> - We want cache-cold and cache-hot numbers as well, across multiple
>   sizes
>
> This tool can also be useful when developing these changes: they can be
> tested in user-space and iterated very quickly, without having to build
> and boot the kernel.
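As a rough illustration of the kind of user-space harness this enables,
something like the sketch below measures cache-hot memcpy throughput over
a few buffer sizes. The sizes and iteration counts are arbitrary and this
is not the tools/perf/bench code itself (which is run as 'perf bench mem
memcpy'):

/* Rough user-space timing sketch: cache-hot memcpy throughput over a
 * few buffer sizes.  Not the tools/perf/bench implementation. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

static double now_sec(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void)
{
	const size_t sizes[] = { 4096, 65536, 1 << 20, 16 << 20 };
	const int iters = 200;
	size_t i;

	for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
		size_t len = sizes[i];
		char *src = malloc(len);
		char *dst = malloc(len);
		double t0, t1;
		int it;

		if (!src || !dst)
			return 1;

		memset(src, 0xaa, len);		/* fault in pages, warm caches */
		memcpy(dst, src, len);		/* warm-up copy */

		t0 = now_sec();
		for (it = 0; it < iters; it++)
			memcpy(dst, src, len);
		t1 = now_sec();

		printf("%8zu bytes: %6.2f GB/s (cache-hot)\n",
		       len, len * (double)iters / (t1 - t0) / 1e9);
		free(src);
		free(dst);
	}
	return 0;
}

For cache-cold numbers the buffers would have to be flushed or evicted
(e.g. by touching something larger than the last-level cache) between
iterations.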
>
> We can commit any enhancements/fixes you do to perf bench alongside your
> memcpy patches. All in all, such measurements will make it much easier
> for us to apply the patches.
>
> Thanks,
>
> Ingo
I'll work on the bench tool and will let you know when it's ready.
Thanks.
-Fenghua