Message-ID: <CAFnufp1d+FH1K5QAx+Z=KvMUvrveJAVnjJJc8xoDCn2wmzUOoQ@mail.gmail.com>
Date:   Sun, 11 Jul 2021 01:07:49 +0200
From:   Matteo Croce <mcroce@...ux.microsoft.com>
To:     Andrew Morton <akpm@...ux-foundation.org>
Cc:     Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Nick Kossifidis <mick@....forth.gr>,
        Guo Ren <guoren@...nel.org>,
        Christoph Hellwig <hch@...radead.org>,
        David Laight <David.Laight@...lab.com>,
        Palmer Dabbelt <palmer@...belt.com>,
        Emil Renner Berthing <kernel@...il.dk>,
        Drew Fustini <drew@...gleboard.org>,
        linux-arch <linux-arch@...r.kernel.org>,
        Nick Desaulniers <ndesaulniers@...gle.com>,
        linux-riscv <linux-riscv@...ts.infradead.org>
Subject: Re: [PATCH v2 0/3] lib/string: optimized mem* functions

On Sat, Jul 10, 2021 at 11:31 PM Andrew Morton
<akpm@...ux-foundation.org> wrote:
>
> On Fri,  2 Jul 2021 14:31:50 +0200 Matteo Croce <mcroce@...ux.microsoft.com> wrote:
>
> > From: Matteo Croce <mcroce@...rosoft.com>
> >
> > Rewrite the generic mem{cpy,move,set} so that memory is accessed with
> > the widest size possible, but without doing unaligned accesses.
> >
> > This was originally posted as C string functions for RISC-V[1], but as
> > there was nothing RISC-V-specific in the code, it was proposed for the
> > generic lib/string.c implementation.
> >
> > Tested on RISC-V and on x86_64 by undefining __HAVE_ARCH_MEM{CPY,SET,MOVE}
> > and HAVE_EFFICIENT_UNALIGNED_ACCESS.
> >
> > These are the performance numbers for memcpy() and memset() on a
> > RISC-V machine, with a 32 MB buffer:
> >
> > memcpy:
> > original aligned:      75 MB/s
> > original unaligned:    75 MB/s
> > new aligned:          114 MB/s
> > new unaligned:        107 MB/s
> >
> > memset:
> > original aligned:     140 MB/s
> > original unaligned:   140 MB/s
> > new aligned:          241 MB/s
> > new unaligned:        241 MB/s
>
> Did you record the x86_64 performance?
>
>
> Which other architectures are affected by this change?

x86_64 won't use these functions because it defines __HAVE_ARCH_MEMCPY
and has its own optimized implementations in arch/x86/lib.
Anyway, I was curious and tested them on x86_64 too: there was zero
gain over the generic ones.
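
To give an idea of what the new generic functions do (lib/string.c only
compiles them when the arch doesn't provide its own, via the
#ifndef __HAVE_ARCH_* guards): they store bytes until the pointer is
word aligned, then use word-sized accesses, and finish the tail byte by
byte. A rough sketch of the memset() case follows; it's simplified, not
the actual patch code, and the real memcpy() also has to handle src and
dest with different alignment:

#include <stddef.h>
#include <stdint.h>

/* Simplified sketch, not the actual patch code. */
void *memset_sketch(void *s, int c, size_t count)
{
	unsigned char *d = s;
	uintptr_t v = (unsigned char)c;
	size_t i;

	/* Replicate the byte value into every byte of a word. */
	for (i = 8; i < sizeof(v) * 8; i *= 2)
		v |= v << i;

	/* Head: byte stores until d is word aligned. */
	while (count && ((uintptr_t)d % sizeof(v))) {
		*d++ = (unsigned char)c;
		count--;
	}

	/* Body: one aligned word-sized store per iteration. */
	for (; count >= sizeof(v); count -= sizeof(v), d += sizeof(v))
		*(uintptr_t *)d = v;

	/* Tail: leftover bytes. */
	while (count--)
		*d++ = (unsigned char)c;

	return s;
}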

The only architecture that will use all three functions is riscv,
while memmove() will also be used by arc, h8300, hexagon, ia64,
openrisc and parisc.
Keep in mind that memmove() isn't anything special: it just calls
memcpy() when possible (e.g. when the buffers don't overlap) and falls
back to a byte-by-byte copy otherwise.
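
Something like this (simplified; the kernel version differs in
details):

#include <stddef.h>

void *memcpy(void *dest, const void *src, size_t count);

/* Simplified sketch of the dispatch, not the exact kernel code. */
void *memmove(void *dest, const void *src, size_t count)
{
	char *d = dest;
	const char *s = src;

	/* A forward memcpy() is safe if dest is before src,
	 * or starts past the end of src. */
	if (d <= s || d >= s + count)
		return memcpy(dest, src, count);

	/* Otherwise dest lands inside the source region:
	 * copy backwards, byte by byte. */
	d += count;
	s += count;
	while (count--)
		*--d = *--s;
	return dest;
}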

In the future we could write two functions, one that copies forward and
another that copies backward, and call the right one depending on the
relative position of the buffers.
Then we could alias memcpy() and memmove(), as proposed by Linus:

https://bugzilla.redhat.com/show_bug.cgi?id=638477#c132
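
The shape would be roughly this (the helper names are illustrative
only, not from any posted patch, and the byte loops stand in for the
real word-at-a-time copies):

#include <stddef.h>

/* Illustrative helpers; real ones would do word-sized accesses. */
static void *copy_fwd(void *dest, const void *src, size_t count)
{
	char *d = dest;
	const char *s = src;

	while (count--)
		*d++ = *s++;
	return dest;
}

static void *copy_bwd(void *dest, const void *src, size_t count)
{
	char *d = (char *)dest + count;
	const char *s = (const char *)src + count;

	while (count--)
		*--d = *--s;
	return dest;
}

/* Pick the direction that is safe even when the buffers overlap. */
void *memmove(void *dest, const void *src, size_t count)
{
	if ((char *)dest <= (const char *)src)
		return copy_fwd(dest, src, count);
	return copy_bwd(dest, src, count);
}

/* memcpy() then becomes a plain alias of memmove()
 * (GCC/Clang alias attribute). */
void *memcpy(void *dest, const void *src, size_t count)
	__attribute__((alias("memmove")));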

Regards,
-- 
per aspera ad upstream
