Message-Id: <ffe019a1-11b4-4ad7-bbe2-8ef3e01ffeb0@app.fastmail.com>
Date: Tue, 28 Jan 2025 10:25:18 +0100
From: "Arnd Bergmann" <arnd@...db.de>
To: "Julian Vetter" <julian@...er-limits.org>,
"Yoshinori Sato" <ysato@...rs.sourceforge.jp>,
"Rich Felker" <dalias@...c.org>,
"John Paul Adrian Glaubitz" <glaubitz@...sik.fu-berlin.de>
Cc: linux-sh@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] sh: Remove IO memcpy and memset from sh code
On Tue, Jan 28, 2025, at 09:42, Julian Vetter wrote:
> Remove IO memcpy and memset from sh specific code and fall back to the
> new implementation from lib/iomem_copy.c. It uses word accesses if the
> buffers are aligned and only falls back to byte accesses for potentially
> unaligned parts of a buffer. Keep only the SH4 optimized memcpy_fromio.
>
> Signed-off-by: Julian Vetter <julian@...er-limits.org>
This looks good in principle, but I see one mistake:
> +#ifdef CONFIG_CPU_SH4
> +void memcpy_fromio(void *to, const volatile void __iomem *from, size_t
> count)
> {
> /*
> * Would it be worthwhile doing byte and long transfers first
> * to try and get aligned?
> */
> -#ifdef CONFIG_CPU_SH4
> if ((count >= 0x20) &&
> (((u32)to & 0x1f) == 0) && (((u32)from & 0x3) == 0)) {
> int tmp2, tmp3, tmp4, tmp5, tmp6;
> @@ -53,59 +50,6 @@ void memcpy_fromio(void *to, const volatile void
> __iomem *from, unsigned long co
> : "7"(from), "0" (to), "1" (count)
> : "r0", "r7", "t", "memory");
> }
> -#endif
> -
> - if ((((u32)to | (u32)from) & 0x3) == 0) {
> - for (; count > 3; count -= 4) {
> - *(u32 *)to = *(volatile u32 *)from;
> - to += 4;
> - from += 4;
> - }
> - }
> -
The SH4 version still needs the bottom of the function to
handle data that is not a multiple of 32 bytes long.
I would expect gcc to produce a properly optimized
version for sh4 from the generic code as well, so I would
suggest you remove it entirely and rely on the common code
here.
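
To illustrate the point: the SH4 assembly path only handles full
32-byte chunks from a suitably aligned source, so the code after it
must still copy whatever remains. A plain-C sketch of the shape (this
is an illustration, not the kernel's actual implementation or the
SH4 assembly):

```c
#include <stdint.h>
#include <stddef.h>

static void copy_fromio_sketch(void *to, const void *from, size_t count)
{
	uint8_t *d = to;
	const uint8_t *s = from;

	/* Fast path: whole 32-byte blocks, standing in for the
	 * SH4 assembly loop in memcpy_fromio(). */
	while (count >= 32) {
		for (int i = 0; i < 32; i++)
			d[i] = s[i];
		d += 32;
		s += 32;
		count -= 32;
	}

	/* Tail: without this, the last (count % 32) bytes would
	 * never be copied -- this is what the removed hunk did. */
	while (count--)
		*d++ = *s++;
}
```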
Arnd