lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 10 Nov 2020 17:54:21 -0500
From:   Arvind Sankar <nivedita@...m.mit.edu>
To:     Nick Desaulniers <ndesaulniers@...gle.com>
Cc:     Arvind Sankar <nivedita@...m.mit.edu>,
        Adrian Ratiu <adrian.ratiu@...labora.com>,
        Nathan Chancellor <natechancellor@...il.com>,
        Arnd Bergmann <arnd@...db.de>,
        Linux ARM <linux-arm-kernel@...ts.infradead.org>,
        clang-built-linux <clang-built-linux@...glegroups.com>,
        Russell King <linux@...linux.org.uk>,
        LKML <linux-kernel@...r.kernel.org>,
        Collabora Kernel ML <kernel@...labora.com>,
        Ard Biesheuvel <ardb@...nel.org>
Subject: Re: [PATCH 2/2] arm: lib: xor-neon: disable clang vectorization

On Tue, Nov 10, 2020 at 02:39:59PM -0800, Nick Desaulniers wrote:
> On Tue, Nov 10, 2020 at 2:39 PM Nick Desaulniers
> <ndesaulniers@...gle.com> wrote:
> >
> > On Tue, Nov 10, 2020 at 2:36 PM Nick Desaulniers
> > <ndesaulniers@...gle.com> wrote:
> > >
> > > On Tue, Nov 10, 2020 at 2:15 PM Arvind Sankar <nivedita@...m.mit.edu> wrote:
> > > >
> > > > On Tue, Nov 10, 2020 at 01:41:17PM -0800, Nick Desaulniers wrote:
> > > > > On Mon, Nov 9, 2020 at 11:51 AM Adrian Ratiu <adrian.ratiu@...labora.com> wrote:
> > > > > >
> > > > > > On Fri, 06 Nov 2020, Nick Desaulniers <ndesaulniers@...gle.com>
> > > > > > wrote:
> > > > > > > +#pragma clang loop vectorize(enable)
> > > > > > >         do {
> > > > > > >                 p1[0] ^= p2[0] ^ p3[0] ^ p4[0] ^ p5[0]; p1[1] ^=
> > > > > > >                 p2[1] ^ p3[1] ^ p4[1] ^ p5[1];
> > > > > > > ``` seems to generate the vectorized code.
> > > > > > >
> > > > > > > Why don't we find a way to make those pragma's more toolchain
> > > > > > > portable, rather than open coding them like I have above rather
> > > > > > > than this series?
> > > > > >
> > > > > > Hi again Nick,
> > > > > >
> > > > > > How did you verify the above pragmas generate correct vectorized
> > > > > > code?  Have you tested this specific use case?
> > > > >
> > > > > I read the disassembly before and after my suggested use of pragmas;
> > > > > look for vld/vstr.  You can also add -Rpass-missed=loop-vectorize to
> > > > > CFLAGS_xor-neon.o in arch/arm/lib/Makefile and rebuild
> > > > > arch/arm/lib/xor-neon.o with CONFIG_BTRFS enabled.
> > > > >
> > > >
> > > > https://godbolt.org/z/1oo9M6
> > > >
> > > > With the __restrict__ keywords added, clang seems to vectorize the loop,
> > > > but still reports that vectorization wasn't beneficial -- any idea
> > > > what's going on?
> >
> > Anyways, it's not safe to make that change in the kernel unless you
> > can guarantee that callers of these routines do not alias or overlap.
> 
> s/callers/parameters passed by callers/
> 

Yep, but that seems likely, it doesn't seem like the function would do
anything useful if the destination overlapped one of the sources. The
kernel just doesn't seem to make use of the restrict keyword.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ