[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAKwvOdm=y-5XHt1V-99Cu+kMVy9Ah6KdE4h2gY+bHgr4eem15A@mail.gmail.com>
Date: Thu, 28 Feb 2019 10:51:17 -0800
From: Nick Desaulniers <ndesaulniers@...gle.com>
To: Robin Murphy <robin.murphy@....com>
Cc: Ard Biesheuvel <ard.biesheuvel@...aro.org>,
Arnd Bergmann <arnd@...db.de>,
Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will.deacon@....com>,
Russell King <linux@...linux.org.uk>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Mark Brown <broonie@...nel.org>,
Nathan Chancellor <natechancellor@...il.com>,
linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>
Subject: Re: [PATCH] lib/raid6: use vdupq_n_u8 to avoid endianness warnings
On Thu, Feb 28, 2019 at 10:00 AM Robin Murphy <robin.murphy@....com> wrote:
>
> On 26/02/2019 20:44, Nick Desaulniers wrote:
> > On Mon, Feb 25, 2019 at 11:19 PM Ard Biesheuvel
> > <ard.biesheuvel@...aro.org> wrote:
> >>
> >> On Tue, 26 Feb 2019 at 05:03, <ndesaulniers@...gle.com> wrote:
> >>>
> >>> Clang warns: vector initializers are not compatible with NEON intrinsics
> >>> in big endian mode [-Wnonportable-vector-initialization]
> >>>
> >>> While this is usually the case, it's not an issue for this case since
> >>> we're initializing the uint8x16_t (16x uint8_t's) with the same value.
> >>>
> >>> Instead, use vdupq_n_u8 which both compilers lower into a single movi
> >>> instruction: https://godbolt.org/z/vBrgzt
> >>>
> >>> This avoids the static storage for a constant value.
> >>>
> >>> Link: https://github.com/ClangBuiltLinux/linux/issues/214
> >>> Suggested-by: Nathan Chancellor <natechancellor@...il.com>
> >>> Signed-off-by: Nick Desaulniers <ndesaulniers@...gle.com>
> >>
> >> Much better, thanks,
> >>
> >> Did you double check that the intrinsic exists on 32-bit ARM as well?
> >> I assume it does, but please make sure if you haven't yet.
> >
> > Thanks for the review!
> > Looking through Clang's generated arm_neon.h, vdupq_n_u8 seems to have
> > 2 definitions predicated on __LITTLE_ENDIAN__ (not __arch64__ or
> > __ARM_ARCH >= 8 like some of the other types and functions).
> >
> > So NEON got some additions in v8? Is there a doc that lists them?
> > http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0491g/BABDBBJB.html
> > is where I found vdupq_n_u8, but it doesn't seem to mention
> > compatibility (so I assume it's been around since the introduction of
> > NEON?).
>
> FWIW the most recent 'proper' spec document I know of is this one:
>
> http://infocenter.arm.com/help/topic/com.arm.doc.ihi0073b/index.html
Bookmarked, thanks!
Ard, page 171 mentions armv7, armv8 for supported architectures for vdupq_n_u8.
>
>
> Apparently we have a more interactive playground on the new site, too:
>
> https://developer.arm.com/technologies/neon/intrinsics
Also bookmarked! I'm also super happy to see this; I'm familiar with
Intel's equivalent:
https://software.intel.com/sites/landingpage/IntrinsicsGuide/
Interactive sites like these are quite useful. Reading a post
recently: https://www.sigarch.org/simd-instructions-considered-harmful/
"The IA-32 instruction set has grown from 80 to around 1400
instructions since 1978, largely fueled by SIMD."
reminded me how useful and almost necessary the interactive sites are
for navigating the large swathes of SIMD extensions. (no comment on
the title of that article)
--
Thanks,
~Nick Desaulniers
Powered by blists - more mailing lists