linux-kernel - RE: [PATCH v5 08/14] arm64: Import latest optimization of memcpy

Message-ID: <d6b241979664402e907064245ebe5578@AcuMS.aculab.com>
Date:   Thu, 3 Jun 2021 08:45:07 +0000
From:   David Laight <David.Laight@...LAB.COM>
To:     'Robin Murphy' <robin.murphy@....com>,
        Sunil Kovvuri <sunil.kovvuri@...il.com>,
        Oliver Swede <oli.swede@....com>
CC:     Catalin Marinas <catalin.marinas@....com>,
        "will@...nel.org" <will@...nel.org>,
        "linux-arm-kernel@...ts.indradead.org" 
        <linux-arm-kernel@...ts.indradead.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Sunil Goutham <sgoutham@...vell.com>,
        George Cherian <gcherian@...vell.com>
Subject: RE: [PATCH v5 08/14] arm64: Import latest optimization of memcpy

From: Robin Murphy
> Sent: 01 June 2021 13:07
> 
> On 2021-06-01 11:03, Sunil Kovvuri wrote:
> > On Mon, Sep 14, 2020 at 8:44 PM Oliver Swede <oli.swede@....com> wrote:
> >>
> >> From: Sam Tebbs <sam.tebbs@....com>
> >>
> >> Import the latest memcpy implementation into memcpy,
> >> copy_{from, to and in}_user.
> >> The implementation of the user routines is separated into two forms:
> >> one for when UAO is enabled and one for when UAO is disabled, with
> >> the two being chosen between with a runtime patch.
> >> This avoids executing the many NOPs emitted when UAO is disabled.
> >>
> >> The project containing optimized implementations for various library
> >> functions has now been renamed from 'cortex-strings' to
> >> 'optimized-routines', and the new upstream source is
> >> string/aarch64/memcpy.S as of commit 4c175c8be12 in
> >> https://github.com/ARM-software/optimized-routines.
> >>
...
> >
> > Do you have any performance data for this patch?
> > I see these patches have still not been pushed to mainline; any reason?
> 
> Funny you should pick up on the 6-month-old thread days after I've been
> posting new versions of the relevant parts[1] :)
> 
> I think this series mostly stalled on the complexity of the usercopy
> parts, which then turned into even more of a moving target anyway, hence
> why I decided to split it up.

It is also worth checking what kind of copy lengths the 'optimized'
routines are actually optimised for.
For instance a sendmsg() system call is likely to do at least 3 short
copy_from_user() requests before even thinking about reading the data buffer.
Even the costs of the comparisons to select between short/long copy
requests become significant on short copies.

I'm not sure you want to be calling
https://github.com/ARM-software/optimized-routines/blob/master/string/aarch64/memcpy.S
for 3 bytes!
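
To make the short-copy cost concrete, here is a rough userspace sketch in C.
It is not kernel code and not the optimized-routines implementation;
short_first_copy() is a hypothetical name used only for illustration, and the
16-byte threshold is an assumption. The overlapping-move trick for small
sizes is a common one in optimised memcpy routines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (illustration only): copy n bytes, handling
 * n <= 16 inline and handing longer copies to the bulk routine.
 */
static void *short_first_copy(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	/* Debug-only sanity check; compiled out with -DNDEBUG. */
	assert(n == 0 || (d != NULL && s != NULL));

	if (n > 16)
		return memcpy(dst, src, n);	/* long copy: bulk routine */

	if (n >= 8) {
		/* 8..16 bytes: two 8-byte moves that may overlap */
		uint64_t head, tail;

		memcpy(&head, s, 8);
		memcpy(&tail, s + n - 8, 8);
		memcpy(d, &head, 8);
		memcpy(d + n - 8, &tail, 8);
	} else if (n >= 4) {
		/* 4..7 bytes: two 4-byte moves that may overlap */
		uint32_t head, tail;

		memcpy(&head, s, 4);
		memcpy(&tail, s + n - 4, 4);
		memcpy(d, &head, 4);
		memcpy(d + n - 4, &tail, 4);
	} else if (n > 0) {
		/* 1..3 bytes: first, middle and last byte (may coincide) */
		unsigned char first = s[0], mid = s[n >> 1], last = s[n - 1];

		d[0] = first;
		d[n >> 1] = mid;
		d[n - 1] = last;
	}
	return dst;
}
```

Even in this small sketch a 3-byte copy passes through four length
comparisons before its three bytes are stored, so the order of the size
checks matters as much as the copy loop itself when most requests are short.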

	David
