Message-ID: <86il5uyom3.wl-maz@kernel.org>
Date:   Wed, 22 Nov 2023 12:11:16 +0000
From:   Marc Zyngier <maz@...nel.org>
To:     Mark Rutland <mark.rutland@....com>
Cc:     Will Deacon <will@...nel.org>,
        Huang Shijie <shijie@...amperecomputing.com>,
        catalin.marinas@....com, suzuki.poulose@....com,
        broonie@...nel.org, linux-arm-kernel@...ts.infradead.org,
        linux-kernel@...r.kernel.org, anshuman.khandual@....com,
        robh@...nel.org, oliver.upton@...ux.dev,
        patches@...erecomputing.com
Subject: Re: [PATCH 0/4] arm64: an optimization for AmpereOne

On Wed, 22 Nov 2023 11:40:09 +0000,
Mark Rutland <mark.rutland@....com> wrote:
> 
> On Wed, Nov 22, 2023 at 09:48:57AM +0000, Will Deacon wrote:
> > On Wed, Nov 22, 2023 at 05:28:51PM +0800, Huang Shijie wrote:
> > > 0) Background:
> > >    We found that AmpereOne benefits from aggressive prefetches when
> > >    using 4K page size.
> > 
> > We tend to shy away from micro-architecture specific optimisations in
> > the arm64 kernel as they're pretty unmaintainable, hard to test properly,
> > generally lead to bloat and add additional obstacles to updating our
> > library routines.
> > 
> > Admittedly, we have something for Thunder-X1 in copy_page() (disguised
> > as ARM64_HAS_NO_HW_PREFETCH) but, frankly, that machine needed all the
> > help it could get and given where it is today I suspect we could drop
> > that code without any material consequences.
> > 
> > So I'd really prefer not to merge this; modern CPUs should do better at
> > copying data. It's copy_to_user(), not rocket science.
> 
> I agree, and I'd also like to drop ARM64_HAS_NO_HW_PREFETCH.

+1. Also, as the (most probably) sole user of this remarkable
implementation, I hacked -rc2 to drop ARM64_HAS_NO_HW_PREFETCH. The
result is that a kernel compilation job regressed by 0.4%, which I
consider to be pure noise.

If nobody beats me to it, I'll send the patch.

	M.

-- 
Without deviation from the norm, progress is not possible.
