Message-Id: <609b026d-d54c-4a11-b7df-6ef0ac315f25@app.fastmail.com>
Date: Tue, 03 Dec 2024 11:08:17 +0100
From: "Arnd Bergmann" <arnd@...db.de>
To: "Julian Vetter" <jvetter@...er-limits.org>,
 "Russell King" <linux@...linux.org.uk>
Cc: linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
 "Julian Vetter" <julian@...er-limits.org>,
 "Linus Walleij" <linus.walleij@...aro.org>
Subject: Re: [PATCH] arm: Remove IO memcpy for Big-Endian

On Tue, Dec 3, 2024, at 09:38, Julian Vetter wrote:
> From: Julian Vetter <julian@...er-limits.org>
>
> Recently a new IO memcpy was added in libs/iomem_copy.c. So, remove the
> byte-wise IO memcpy operations used in ARM big endian builds and fall
> back to the new generic implementation. It will be slightly faster,
> because it uses machine word accesses if the memory is aligned and falls
> back to byte-wise accesses if its not.
>
> Signed-off-by: Julian Vetter <julian@...er-limits.org>
> ---
>  arch/arm/include/asm/io.h | 11 ----------
>  arch/arm/kernel/io.c      | 46 ---------------------------------------
>  2 files changed, 57 deletions(-)

I'm not sure if this is safe on all platforms. Big-endian arm
is extremely rare in practice, and it comes in multiple variants
that behave slightly differently:

- On modern ARMv7, the byte-invariant big-endian "BE8" mode is
  generally well-behaved and works as one would expect it to.

- There is one ARMv5 "BE32" based platform, the ixp4xx, which
  works differently, and this in turn allows multiple configurations
  for its buses where a byte-swap is performed in the PCI
  controller.

When the little-endian I/O string operations got optimized to
calling the word-based helpers in commit 7ddfe625cbc1 ("ARM:
optimize memset_io()/memcpy_fromio()/memcpy_toio()"), Russell
intentionally left the big-endian versions alone, which I think
was done for the case of PCI on ixp4xx, but could have been
out of general caution.

Before we apply your patch, I think the minimum would be to
have Linus Walleij try it out on an ixp4xx with a driver
that uses these functions. Maybe Russell remembers the exact
constraints that led to using byte access for big-endian
mmio string operations, and whether the new lib/iomem_copy.c
version causes problems.
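
For reference, the word-versus-byte strategy from the patch
description can be sketched in userspace C roughly as below.
This is only an illustration under my own assumptions, not the
code in lib/iomem_copy.c: sketch_readb(), sketch_read_word() and
sketch_memcpy_fromio() are made-up names, and plain volatile
loads stand in for the real MMIO accessors.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for raw MMIO accessors (assumption: plain loads). */
static inline uint8_t sketch_readb(const volatile void *addr)
{
	return *(const volatile uint8_t *)addr;
}

static inline unsigned long sketch_read_word(const volatile void *addr)
{
	return *(const volatile unsigned long *)addr;
}

/*
 * Copy count bytes from an "I/O" region: byte accesses until the
 * I/O address is word-aligned, word accesses for the bulk, then
 * byte accesses for the tail.
 */
static void sketch_memcpy_fromio(void *dst, const volatile void *src,
				 size_t count)
{
	uint8_t *d = dst;
	const volatile uint8_t *s = src;

	/* Leading bytes until the I/O side is word-aligned. */
	while (count && ((uintptr_t)s % sizeof(unsigned long))) {
		*d++ = sketch_readb(s++);
		count--;
	}

	/* Aligned word reads; the memory side may still be unaligned. */
	while (count >= sizeof(unsigned long)) {
		unsigned long w = sketch_read_word(s);

		memcpy(d, &w, sizeof(w));
		d += sizeof(w);
		s += sizeof(w);
		count -= sizeof(w);
	}

	/* Trailing bytes. */
	while (count) {
		*d++ = sketch_readb(s++);
		count--;
	}
}
```

On a byte-invariant bus the word path and the byte path put the
same bytes at the same offsets. Whether that still holds when the
bus or the PCI controller swaps bytes within a word is exactly
what would need testing on ixp4xx.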

I also looked at the little-endian arm32 version, and
it seems that the generic code would work fine there, but
the custom variant is likely much faster when both the
source and destination buffers are aligned, as it can
do larger MMIO transactions using ldm/stm instructions.
The generic version would be a bit better if the
in-memory buffer is unaligned.

We could get the best of both by implementing optimized
arm32 versions of __iowrite32_copy()/__ioread32_copy() and
using those in the generic memcpy_{from,to}io().
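
To illustrate the layering I have in mind, here is a rough
userspace sketch. All names are hypothetical:
sketch_iowrite32_copy() stands in for an architecture-provided
__iowrite32_copy(), which on arm32 could use ldm/stm, and
sketch_memcpy_toio() stands in for the generic caller. The
alignment policy shown is an assumption for the example, not
what the generic code does today.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for an arch-optimized __iowrite32_copy():
 * copies 'words' 32-bit units from memory to the "I/O" region.
 */
static void sketch_iowrite32_copy(volatile uint32_t *to,
				  const uint32_t *from, size_t words)
{
	while (words--)
		*to++ = *from++;
}

/*
 * Sketch of a generic memcpy_toio(): use the 32-bit bulk helper
 * when both buffers are 32-bit aligned, byte accesses otherwise
 * and for the tail.
 */
static void sketch_memcpy_toio(volatile void *dst, const void *src,
			       size_t count)
{
	volatile uint8_t *d = dst;
	const uint8_t *s = src;

	if ((((uintptr_t)d | (uintptr_t)s) % sizeof(uint32_t)) == 0) {
		size_t words = count / sizeof(uint32_t);

		sketch_iowrite32_copy((volatile uint32_t *)d,
				      (const uint32_t *)s, words);
		d += words * sizeof(uint32_t);
		s += words * sizeof(uint32_t);
		count -= words * sizeof(uint32_t);
	}

	/* Unaligned buffers and any remaining tail bytes. */
	while (count--)
		*d++ = *s++;
}
```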

       Arnd
