lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <85p0oop-5pq-p6o-7560-297sn1np3os@syhkavp.arg>
Date:   Mon, 30 Nov 2020 12:52:25 -0500 (EST)
From:   Nicolas Pitre <nico@...xnic.net>
To:     Ard Biesheuvel <ardb@...nel.org>
cc:     Antony Yu <swpenim@...il.com>,
        Nick Desaulniers <ndesaulniers@...gle.com>,
        Russell King <linux@...linux.org.uk>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        clang-built-linux <clang-built-linux@...glegroups.com>,
        Nathan Chancellor <natechancellor@...il.com>,
        Linux ARM <linux-arm-kernel@...ts.infradead.org>
Subject: Re: [RESEND,PATCH] ARM: fix __div64_32() error when compiling with
 clang

On Mon, 30 Nov 2020, Ard Biesheuvel wrote:

> On Mon, 30 Nov 2020 at 16:51, Nicolas Pitre <nico@...xnic.net> wrote:
> 
> > Here's my version of the fix which should be correct. Warning: this
> > is completely untested, but should in theory produce the same code on
> > modern gcc.
> >
> > diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h
> > index 898e9c78a7..595e538f5b 100644
> > --- a/arch/arm/include/asm/div64.h
> > +++ b/arch/arm/include/asm/div64.h
> > @@ -21,29 +21,20 @@
> >   * assembly implementation with completely non standard calling convention
> >   * for arguments and results (beware).
> >   */
> > -
> > -#ifdef __ARMEB__
> > -#define __xh "r0"
> > -#define __xl "r1"
> > -#else
> > -#define __xl "r0"
> > -#define __xh "r1"
> > -#endif
> > -
> >  static inline uint32_t __div64_32(uint64_t *n, uint32_t base)
> >  {
> >         register unsigned int __base      asm("r4") = base;
> >         register unsigned long long __n   asm("r0") = *n;
> >         register unsigned long long __res asm("r2");
> > -       register unsigned int __rem       asm(__xh);
> > -       asm(    __asmeq("%0", __xh)
> > +       unsigned int __rem;
> > +       asm(    __asmeq("%0", "r0")
> >                 __asmeq("%1", "r2")
> > -               __asmeq("%2", "r0")
> > -               __asmeq("%3", "r4")
> > +               __asmeq("%2", "r4")
> >                 "bl     __do_div64"
> > -               : "=r" (__rem), "=r" (__res)
> > -               : "r" (__n), "r" (__base)
> > +               : "+r" (__n), "=r" (__res)
> > +               : "r" (__base)
> >                 : "ip", "lr", "cc");
> > +       __rem = __n >> 32;
> 
> This treats {r0, r1} as a {low, high} pair, regardless of endianness,
> and so it puts the value of r0 into r1. Doesn't that mean the shift
> should only be done on little endian?

Not quite. r0-r1 = low-high is for little endian. Then "__n >> 32" is 
actually translated into "mov r0, r1" to move it into __rem and returned 
through r0.

On big endial it is r0-r1 = high-low. Here "__n >> 32" picks r0 and 
moves it to __rem which is returned through r0 so no extra instruction 
needed.

Of course the function is inlined so r0 can be anything, or optimized 
away if__rem is not used.


Nicolas

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ