[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALCETrWJcB9=MuSw5yx6arcb_np=E=awTyLRSi=r8BJySf_aXw@mail.gmail.com>
Date: Thu, 16 Jan 2020 11:47:36 -0800
From: Andy Lutomirski <luto@...nel.org>
To: Christophe Leroy <christophe.leroy@....fr>
Cc: Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Paul Mackerras <paulus@...ba.org>,
Michael Ellerman <mpe@...erman.id.au>, nathanl@...ux.ibm.com,
Arnd Bergmann <arnd@...db.de>,
Thomas Gleixner <tglx@...utronix.de>,
Vincenzo Frascino <vincenzo.frascino@....com>,
Andrew Lutomirski <luto@...nel.org>,
LKML <linux-kernel@...r.kernel.org>,
linuxppc-dev <linuxppc-dev@...ts.ozlabs.org>,
linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>,
"open list:MIPS" <linux-mips@...r.kernel.org>,
X86 ML <x86@...nel.org>
Subject: Re: [RFC PATCH v4 10/11] lib: vdso: Allow arches to override the ns
shift operation
On Thu, Jan 16, 2020 at 9:58 AM Christophe Leroy
<christophe.leroy@....fr> wrote:
>
> On powerpc/32, GCC (8.1) generates pretty bad code for the
> ns >>= vd->shift operation taking into account that the
> shift is always < 32 and the upper part of the result is
> likely to be nul. GCC makes reversed assumptions considering
> the shift to be likely >= 32 and the upper part to be like not nul.
>
> unsigned long long shift(unsigned long long x, unsigned char s)
> {
> return x >> s;
> }
>
> results in:
>
> 00000018 <shift>:
> 18: 35 25 ff e0 addic. r9,r5,-32
> 1c: 41 80 00 10 blt 2c <shift+0x14>
> 20: 7c 64 4c 30 srw r4,r3,r9
> 24: 38 60 00 00 li r3,0
> 28: 4e 80 00 20 blr
> 2c: 54 69 08 3c rlwinm r9,r3,1,0,30
> 30: 21 45 00 1f subfic r10,r5,31
> 34: 7c 84 2c 30 srw r4,r4,r5
> 38: 7d 29 50 30 slw r9,r9,r10
> 3c: 7c 63 2c 30 srw r3,r3,r5
> 40: 7d 24 23 78 or r4,r9,r4
> 44: 4e 80 00 20 blr
>
> Even when forcing the shift with an &= 31, it still considers
> the shift as likely >= 32.
>
> Define a vdso_shift_ns() macro that can be overriden by
> arches.
Would mul_u64_u64_shr() be a good alternative? Could we adjust it to
assume the shift is less than 32? That function exists to benefit
32-bit arches.
--Andy
Powered by blists - more mailing lists