lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 25 Jan 2023 11:16:19 +0000
From:   David Laight <David.Laight@...LAB.COM>
To:     'Will Deacon' <will@...nel.org>,
        Mina Almasry <almasrymina@...gle.com>
CC:     Catalin Marinas <catalin.marinas@....com>,
        Peter Zijlstra <peterz@...radead.org>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH v1] arch: Enable function alignment for arm64

From: Will Deacon <will@...nel.org>
> Sent: 24 January 2023 12:09
> 
> On Wed, Dec 07, 2022 at 09:36:48PM -0800, Mina Almasry wrote:
> > We recently ran into a double-digit percentage hackbench regression
> > when backporting commit 12df140f0bdf ("mm,hugetlb: take hugetlb_lock
> > before decrementing h->resv_huge_pages") to an older kernel. This was
> > surprising since hackbench does use hugetlb pages at all and the
> > modified code is not invoked. After some debugging we found that the
> > regression can be fixed by back-porting commit d49a0626216b ("arch:
> > Introduce CONFIG_FUNCTION_ALIGNMENT") and enabling function alignment
> > for arm64. I suggest enabling it by default for arm64 if possible.
> >
...
> 
> This increases the size of .text for a defconfig build by ~2%, so I think it
> would be nice to have some real numbers for the performance uplift. Are you
> able to elaborate beyond "double-digit percentage hackbench regression"?
> 
> In general, however, I'm supportive of the patch (and it seems that x86
> does the same thing) so:

I bet it just changes the alignment of the code so that more
functions are using different cache lines.

All sorts of other random changes are likely to have a similar effect.

Cache-line aligning the start of a function probably reduces the
number of cache lines the functions needs - but that isn't guaranteed.
It also slightly reduces the delay on a cache miss - but they are so
slow it probably makes almost no difference.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ