Message-ID: <CPSMTPM-CMT109vpIa300062dcf@CPSMTPM-CMT109.kpnxchange.com>
Date: Wed, 23 Nov 2011 13:51:50 +0100
From: "N. Coesel" <nico@...dev.nl>
To: Sasha Levin <levinsasha928@...il.com>
Cc: linux-kernel@...r.kernel.org
Subject: Re: Fast memcpy patch
Sasha,
At 13:10 23-11-2011, Sasha Levin wrote:
>On Wed, 2011-11-23 at 12:25 +0100, N. Coesel wrote:
> > Dear readers,
> > I noticed the Linux kernel still uses a byte-by-byte copy method for
> > memcpy. Since most memory allocations are aligned to the integer size
> > of a cpu it is often faster to copy by using the CPU's native word
> > size. The patch below does that. The code is already at work in many
> > 16 and 32 bit embedded products. It should also work for 64 bit
> > platforms. So far I only tested 16 and 32 bit platforms.
>
>[snip]
>
>memcpy (along with the other mem* functions) is arch-specific - for
>example, look at arch/x86/lib/memcpy_64.S for the implementation(s) for
>x86.
>
>The code under lib/string.c is simple and should work on all platforms
>(and is probably not being used anywhere anymore).
Thanks for pointing that out. Currently my primary target is ARM. It
seems the memcpy for that arch uses byte-by-byte copying as well, with
some loop unrolling. I modified the code so it tries a word-by-word
copy when both pointers are aligned on word boundaries; if not, it
falls back to the old byte-by-byte method. For clarity: by "word" I
mean the CPU's native bus width, which for ARM is (still) 32 bits.
Nico Coesel
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/