Message-ID: <CAMj1kXHuERnB01sNrpY9w3C0ECOry7jCK=A2H0D4-_cBXbOmcw@mail.gmail.com>
Date: Mon, 30 Nov 2020 11:40:20 +0100
From: Ard Biesheuvel <ardb@...nel.org>
To: Russell King - ARM Linux admin <linux@...linux.org.uk>
Cc: Antony Yu <swpenim@...il.com>, Nicolas Pitre <nico@...xnic.net>,
Nick Desaulniers <ndesaulniers@...gle.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
clang-built-linux <clang-built-linux@...glegroups.com>,
Nathan Chancellor <natechancellor@...il.com>,
Linux ARM <linux-arm-kernel@...ts.infradead.org>
Subject: Re: [RESEND,PATCH] ARM: fix __div64_32() error when compiling with clang
On Mon, 30 Nov 2020 at 11:21, Russell King - ARM Linux admin
<linux@...linux.org.uk> wrote:
>
> On Mon, Nov 30, 2020 at 11:12:33AM +0100, Ard Biesheuvel wrote:
> > (+ Nico)
> >
> > On Mon, 30 Nov 2020 at 11:11, Ard Biesheuvel <ardb@...nel.org> wrote:
> > >
> > > On Mon, 23 Nov 2020 at 08:39, Antony Yu <swpenim@...il.com> wrote:
> > > >
> > > > __do_div64 clobbers the input register r0 on little-endian systems.
> > > > According to the inline assembly documentation, if an input operand
> > > > is modified, it should be tied to an output operand. This patch
> > > > prevents compilers from reusing the r0 register after the asm statement.
> > > >
> > > > Signed-off-by: Antony Yu <swpenim@...il.com>
> > > > ---
> > > > arch/arm/include/asm/div64.h | 5 +++--
> > > > 1 file changed, 3 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h
> > > > index 898e9c78a7e7..809efc51e90f 100644
> > > > --- a/arch/arm/include/asm/div64.h
> > > > +++ b/arch/arm/include/asm/div64.h
> > > > @@ -39,9 +39,10 @@ static inline uint32_t __div64_32(uint64_t *n, uint32_t base)
> > > > asm( __asmeq("%0", __xh)
> > > > __asmeq("%1", "r2")
> > > > __asmeq("%2", "r0")
> > > > - __asmeq("%3", "r4")
> > > > + __asmeq("%3", "r0")
> > > > + __asmeq("%4", "r4")
> > > > "bl __do_div64"
> > > > - : "=r" (__rem), "=r" (__res)
> > > > + : "=r" (__rem), "=r" (__res), "=r" (__n)
> > > > : "r" (__n), "r" (__base)
> > > > : "ip", "lr", "cc");
> > > > *n = __res;
> > > > --
> > > > 2.23.0
> > > >
> > >
> > > Agree that using r0 as an input operand only is incorrect, and not
> > > only on Clang. The compiler might assume that r0 will retain its value
> > > across the asm() block, which is obviously not the case.
>
> However, you can _not_ have an asm block that names two outputs using
> the same physical register - that's why both the original patch and
> the posted v2 will fail.
>
> You also can't mark r0 as clobbered because it's used as an operand
> and that is not allowed by gcc.
>
> The fact is, we have two register variables occupying the same register,
> which are __n and __rem. It doesn't matter which endian-ness __rem is,
> r0 will be used for both __n (input) and __rem (output).
>
__rem is a 32-bit variable, so in LE mode, only r1 is used for __rem,
not r0. So r0/r1 are used as an input operand pair, and r1 is used as
an output operand.
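
To make the register layout concrete, here is a small portable C model of how a 64-bit value is split across the r0/r1 pair, and which of the two registers the __xh and __xl names refer to in each mode. This is only a sketch: struct reg_pair, load_pair(), xh_reg(), xl_reg() and reg_of() are hypothetical helpers invented for illustration, not anything from div64.h, and the mapping assumes the usual ARM convention (low word in r0 on LE, high word in r0 on BE).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the r0/r1 pair holding one 64-bit value. */
struct reg_pair {
	uint32_t r0;
	uint32_t r1;
};

/* Place a 64-bit value in the pair: low word in r0 on LE, in r1 on BE. */
static struct reg_pair load_pair(uint64_t v, int big_endian)
{
	struct reg_pair p;
	uint32_t lo = (uint32_t)v;
	uint32_t hi = (uint32_t)(v >> 32);

	p.r0 = big_endian ? hi : lo;
	p.r1 = big_endian ? lo : hi;
	return p;
}

/* Register number (0 or 1) that __xh and __xl are assumed to name. */
static int xh_reg(int big_endian) { return big_endian ? 0 : 1; }
static int xl_reg(int big_endian) { return big_endian ? 1 : 0; }

/* Read register 0 or 1 out of the modelled pair. */
static uint32_t reg_of(struct reg_pair p, int idx)
{
	assert(idx == 0 || idx == 1);
	return idx == 0 ? p.r0 : p.r1;
}
```

In this model, a 32-bit variable bound to __xh overlaps only the high half of the dividend. The register named by __xl holds the low half on input and is not covered by any output operand in the original asm statement.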

So I don't think the compiler has to be buggy in order for it to
assume that r0 will still contain the low word of the dividend
afterwards.

And actually, the same applies on BE, but the other way around. So we
should mark __xl as an output register as well, as __xl will assume
the right value depending on the endianness.

I suggest something like the below,

diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h
index 898e9c78a7e7..85ff9109595e 100644
--- a/arch/arm/include/asm/div64.h
+++ b/arch/arm/include/asm/div64.h
@@ -36,12 +36,14 @@ static inline uint32_t __div64_32(uint64_t *n, uint32_t base)
register unsigned long long __n asm("r0") = *n;
register unsigned long long __res asm("r2");
register unsigned int __rem asm(__xh);
+ register unsigned int __dummy asm(__xl);
asm( __asmeq("%0", __xh)
__asmeq("%1", "r2")
- __asmeq("%2", "r0")
- __asmeq("%3", "r4")
+ __asmeq("%2", __xl)
+ __asmeq("%3", "r0")
+ __asmeq("%4", "r4")
"bl __do_div64"
- : "=r" (__rem), "=r" (__res)
+ : "=r" (__rem), "=r" (__res), "=r"(__dummy)
: "r" (__n), "r" (__base)
: "ip", "lr", "cc");
*n = __res;
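
For readers following along, the register contract that the extra "=r" (__dummy) output is meant to describe can be modelled in portable C. This is a rough sketch under stated assumptions, not the kernel routine: struct regs, POISON, model_do_div64() and model_div64_32() are hypothetical names, and the poison value merely stands in for whatever the real __do_div64 happens to leave behind in the register named by __xl.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the ARM core registers r0..r4; not kernel code. */
struct regs {
	uint32_t r[5];
};

#define POISON 0xdeadbeefu	/* assumed stand-in for "unspecified junk" */

/* Models the register effects of "bl __do_div64" as discussed above. */
static void model_do_div64(struct regs *regs, int big_endian)
{
	int xh = big_endian ? 0 : 1;	/* register named by __xh */
	int xl = big_endian ? 1 : 0;	/* register named by __xl */
	uint64_t n = ((uint64_t)regs->r[xh] << 32) | regs->r[xl];
	uint32_t base = regs->r[4];
	uint64_t q;

	assert(base != 0);
	q = n / base;

	regs->r[xh] = (uint32_t)(n % base);	/* output: remainder */
	regs->r[xl] = POISON;			/* input half is destroyed */

	/* output: 64-bit quotient in the r2/r3 pair */
	regs->r[big_endian ? 3 : 2] = (uint32_t)q;
	regs->r[big_endian ? 2 : 3] = (uint32_t)(q >> 32);
}

/* Same interface as __div64_32(), plus a peek at __xl afterwards. */
static uint32_t model_div64_32(uint64_t *n, uint32_t base, int big_endian,
			       uint32_t *xl_after)
{
	struct regs regs = { { 0 } };
	int xh = big_endian ? 0 : 1;
	int xl = big_endian ? 1 : 0;
	uint32_t q_lo, q_hi;

	regs.r[xl] = (uint32_t)*n;
	regs.r[xh] = (uint32_t)(*n >> 32);
	regs.r[4] = base;

	model_do_div64(&regs, big_endian);

	q_lo = regs.r[big_endian ? 3 : 2];
	q_hi = regs.r[big_endian ? 2 : 3];
	*n = ((uint64_t)q_hi << 32) | q_lo;
	*xl_after = regs.r[xl];
	return regs.r[xh];
}
```

After the call, the register named by __xl no longer holds the low word of the dividend in this model. A compiler that was only told about the __rem and __res outputs has no way to know that, which is the situation the __dummy output is intended to cover.
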
> If the compiler can't work out that a physical register used as an
> output operand will be written by the assembly, then the compiler is
> quite simply buggy.
>
> The code is correct as it stands; Clang is buggy.
>
> --
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!