Message-ID: <YNChl0tkofSGzvIX@infradead.org>
Date: Mon, 21 Jun 2021 15:26:31 +0100
From: Christoph Hellwig <hch@...radead.org>
To: Matteo Croce <mcroce@...ux.microsoft.com>
Cc: linux-riscv@...ts.infradead.org, linux-kernel@...r.kernel.org,
linux-arch@...r.kernel.org,
Paul Walmsley <paul.walmsley@...ive.com>,
Palmer Dabbelt <palmer@...belt.com>,
Albert Ou <aou@...s.berkeley.edu>,
Atish Patra <atish.patra@....com>,
Emil Renner Berthing <kernel@...il.dk>,
Akira Tsukamoto <akira.tsukamoto@...il.com>,
Drew Fustini <drew@...gleboard.org>,
Bin Meng <bmeng.cn@...il.com>,
David Laight <David.Laight@...lab.com>,
Guo Ren <guoren@...nel.org>
Subject: Re: [PATCH v3 1/3] riscv: optimized memcpy
On Thu, Jun 17, 2021 at 05:27:52PM +0200, Matteo Croce wrote:
> +extern void *memcpy(void *dest, const void *src, size_t count);
> +extern void *__memcpy(void *dest, const void *src, size_t count);
No need for externs.
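For reference, a function declaration at file scope has external linkage
by default, so the extern keyword adds nothing. A minimal userspace sketch
(my_memcpy is a made-up name, not something from the patch):

```c
#include <stddef.h>

/* These two declarations are equivalent: "extern" is implied for functions. */
extern void *my_memcpy(void *dest, const void *src, size_t count);
void *my_memcpy(void *dest, const void *src, size_t count);

/* Hypothetical byte-wise copy, only here to give the declarations a body. */
void *my_memcpy(void *dest, const void *src, size_t count)
{
	unsigned char *d = dest;
	const unsigned char *s = src;

	while (count--)
		*d++ = *s++;
	return dest;
}
```

The compiler accepts both declarations as the same prototype, which is
why the shorter form is preferred in headers.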
> +++ b/arch/riscv/lib/string.c
Nothing in here looks RISC-V specific. Why doesn't this go into lib/ so
that other architectures can use it as well?
> +#include <linux/module.h>
I think you only need export.h.
> +void *__memcpy(void *dest, const void *src, size_t count)
> +{
> + const int bytes_long = BITS_PER_LONG / 8;
> +#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> + const int mask = bytes_long - 1;
> + const int distance = (src - dest) & mask;
> +#endif
> + union const_types s = { .u8 = src };
> + union types d = { .u8 = dest };
> +
> +#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> + if (count < MIN_THRESHOLD)
Using IS_ENABLED we can avoid a lot of the mess in this
function.
	int distance = 0;
	if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)) {
		if (count < MIN_THRESHOLD)
			goto copy_remainder;
		/* copy a byte at a time until destination is aligned */
		for (; count && d.uptr & mask; count--)
			*d.u8++ = *s.u8++;
		distance = (src - dest) & mask;
	}
	if (distance) {
		...
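To illustrate the idea outside the kernel, here is a rough userspace
sketch. It is not the patch's code: EFFICIENT_UNALIGNED is a simple
stand-in for IS_ENABLED(), sketch_memcpy is a made-up name, the
MIN_THRESHOLD value is assumed, and the shifted copy for the
distance != 0 case is left out (those bytes simply fall through to the
byte loop). The point is that the condition is an ordinary compile-time
constant, so both branches get type-checked and the dead one is dropped
by the optimizer:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace stand-in for IS_ENABLED(): a compile-time constant 0 or 1. */
#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
#define EFFICIENT_UNALIGNED 1
#else
#define EFFICIENT_UNALIGNED 0
#endif

/* Assumed value for illustration; not taken from the patch. */
#define MIN_THRESHOLD (2 * sizeof(long))

/* Hypothetical name; the shifted copy for distance != 0 is left out. */
void *sketch_memcpy(void *dest, const void *src, size_t count)
{
	const size_t bytes_long = sizeof(long);
	const size_t mask = bytes_long - 1;
	unsigned char *d = dest;
	const unsigned char *s = src;
	size_t distance = 0;

	if (!EFFICIENT_UNALIGNED) {
		if (count < MIN_THRESHOLD)
			goto copy_remainder;

		/* copy a byte at a time until the destination is aligned */
		for (; count && ((uintptr_t)d & mask); count--)
			*d++ = *s++;
		distance = ((uintptr_t)s - (uintptr_t)d) & mask;
	}

	if (!distance) {
		/* same alignment on both sides: copy whole words */
		for (; count >= bytes_long; count -= bytes_long) {
			unsigned long word;

			memcpy(&word, s, bytes_long);
			memcpy(d, &word, bytes_long);
			s += bytes_long;
			d += bytes_long;
		}
	}

copy_remainder:
	/* tail bytes, short copies, and the unhandled distance != 0 case */
	while (count--)
		*d++ = *s++;
	return dest;
}
```

Only one #ifdef remains, and it sits outside the function body.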
> + /* 32/64 bit wide copy from s to d.
> + * d is aligned now but s is not, so read s alignment wise,
> + * and do proper shift to get the right value.
> + * Works only on Little Endian machines.
> + */
Normal kernel comment style always starts with a:
/*
> + for (next = s.ulong[0]; count >= bytes_long + mask; count -= bytes_long) {
Please avoid the pointlessly overlong line. And (just as a matter of
personal preference) I find for loops that don't actually use a single
iterator rather confusing. Why not simply:
	next = s.ulong[0];
	while (count >= bytes_long + mask) {
		...
		count -= bytes_long;
	}
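Spelled out as a standalone userspace sketch, the loop could look
roughly like the following. The helper name copy_shifted and its
signature are made up for illustration, and the loop body is my reading
of what the shift-and-merge step has to do rather than a copy of the
patch. It assumes a little-endian machine, a distance between 1 and
sizeof(long) - 1, and that s points at the aligned word containing the
first source byte:

```c
#include <stddef.h>

/*
 * Hypothetical helper, little-endian only: copy from an unaligned source
 * using aligned word loads. "s" is the aligned word that contains the
 * first source byte, "distance" is that byte's offset inside the word
 * and must be non-zero. Returns the number of bytes copied.
 */
size_t copy_shifted(unsigned long *d, const unsigned long *s,
		    size_t distance, size_t count)
{
	const size_t bytes_long = sizeof(long);
	const size_t mask = bytes_long - 1;
	const size_t start = count;
	unsigned long last, next;

	next = s[0];
	while (count >= bytes_long + mask) {
		last = next;
		next = s[1];

		/* merge the tail of one word with the head of the next */
		*d++ = last >> (distance * 8) |
		       next << ((bytes_long - distance) * 8);
		s++;
		count -= bytes_long;
	}
	return start - count;
}
```

The initial load sits on its own line and the only loop state that
changes in the loop header's place is count, which is what makes the
while form easier to follow.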