lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20231122092855.4440-1-shijie@os.amperecomputing.com>
Date:   Wed, 22 Nov 2023 17:28:51 +0800
From:   Huang Shijie <shijie@...amperecomputing.com>
To:     catalin.marinas@....com
Cc:     will@...nel.org, mark.rutland@....com, suzuki.poulose@....com,
        broonie@...nel.org, linux-arm-kernel@...ts.infradead.org,
        linux-kernel@...r.kernel.org, anshuman.khandual@....com,
        robh@...nel.org, oliver.upton@...ux.dev, maz@...nel.org,
        patches@...erecomputing.com,
        Huang Shijie <shijie@...amperecomputing.com>
Subject: [PATCH 0/4] arm64: an optimization for AmpereOne

0) Background:
   We found that AmpereOne benefits from aggressive prefetches when
   using 4K page size.

1) This patch:
    1.1) adds new WORKAROUND_AMPERE_AC03_PREFETCH capability.
    1.2) uses MIDR_AMPERE1 to filter the processor.
    1.3) uses alternative_if to alternative the code
         for AmpereOne.
    1.4) adds software prefetches for the specific loop.
    	 Also add a macro add_prefetch.

2) Test result:
    In hugetlb or tmpfs, We can get big seqential read performance improvement
    up to 1.3x ~ 1.4x.


Huang Shijie (4):
  extable: add __sort_main_extable
  arm64: alternative: handle the kernel exception table
  arm64: copy_template.S: add loop_for_copy_128_bytes macro
  arm64: add software prefetches for AmpereOne

 arch/arm64/Kconfig.platforms    |  7 +++
 arch/arm64/kernel/alternative.c | 18 +++++++
 arch/arm64/kernel/cpu_errata.c  |  9 ++++
 arch/arm64/lib/copy_template.S  | 87 +++++++++++++++++++++++----------
 arch/arm64/tools/cpucaps        |  1 +
 include/linux/extable.h         |  2 +
 kernel/extable.c                |  8 ++-
 7 files changed, 105 insertions(+), 27 deletions(-)

-- 
2.40.1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ