linux-kernel - Re: [PATCH] ARM: use assembly mnemonics for VFP register access

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAKwvOdmV80xgvBnhB6ZpqYaqkxKi-_p+StnMojwNnf3kdxTT1A@mail.gmail.com>
Date:   Tue, 25 Feb 2020 11:10:21 -0800
From:   Nick Desaulniers <ndesaulniers@...gle.com>
To:     Stefan Agner <stefan@...er.ch>
Cc:     Russell King <linux@...linux.org.uk>,
        Arnd Bergmann <arnd@...db.de>,
        Manoj Gupta <manojgupta@...gle.com>,
        Jian Cai <jiancai@...gle.com>,
        Linux ARM <linux-arm-kernel@...ts.infradead.org>,
        LKML <linux-kernel@...r.kernel.org>,
        clang-built-linux <clang-built-linux@...glegroups.com>
Subject: Re: [PATCH] ARM: use assembly mnemonics for VFP register access

On Mon, Feb 24, 2020 at 9:22 PM Stefan Agner <stefan@...er.ch> wrote:
>
> Clang's integrated assembler does not allow to to use the mcr
> instruction to access floating point co-processor registers:
> arch/arm/vfp/vfpmodule.c:342:2: error: invalid operand for instruction
>         fmxr(FPEXC, fpexc & ~(FPEXC_EX|FPEXC_DEX|FPEXC_FP2V|FPEXC_VV|FPEXC_TRAP_MASK));
>         ^
> arch/arm/vfp/vfpinstr.h:79:6: note: expanded from macro 'fmxr'
>         asm("mcr p10, 7, %0, " vfpreg(_vfp_) ", cr0, 0 @ fmxr   " #_vfp_ ", %0" \
>             ^
> <inline asm>:1:6: note: instantiated into assembly here
>         mcr p10, 7, r0, cr8, cr0, 0 @ fmxr      FPEXC, r0
>             ^
>
> The GNU assembler supports the .fpu directive at least since 2.17 (when
> documentation has been added). Since Linux requires binutils 2.21 it is
> safe to use .fpu directive. Use the .fpu directive and mnemonics for VFP
> register access.
>
> This allows to build vfpmodule.c with Clang and its integrated assembler.
>
> Link: https://github.com/ClangBuiltLinux/linux/issues/905
> Signed-off-by: Stefan Agner <stefan@...er.ch>
> ---
>  arch/arm/vfp/vfpinstr.h | 12 ++++--------
>  1 file changed, 4 insertions(+), 8 deletions(-)
>
> diff --git a/arch/arm/vfp/vfpinstr.h b/arch/arm/vfp/vfpinstr.h
> index 38dc154e39ff..799ccf065406 100644
> --- a/arch/arm/vfp/vfpinstr.h
> +++ b/arch/arm/vfp/vfpinstr.h
> @@ -62,21 +62,17 @@
>  #define FPSCR_C (1 << 29)
>  #define FPSCR_V        (1 << 28)
>
> -/*
> - * Since we aren't building with -mfpu=vfp, we need to code
> - * these instructions using their MRC/MCR equivalents.
> - */
> -#define vfpreg(_vfp_) #_vfp_
> -
>  #define fmrx(_vfp_) ({                 \
>         u32 __v;                        \
> -       asm("mrc p10, 7, %0, " vfpreg(_vfp_) ", cr0, 0 @ fmrx   %0, " #_vfp_    \
> +       asm(".fpu       vfpv2\n"        \
> +           "vmrs       %0, " #_vfp_    \
>             : "=r" (__v) : : "cc");     \
>         __v;                            \
>   })
>
>  #define fmxr(_vfp_,_var_)              \
> -       asm("mcr p10, 7, %0, " vfpreg(_vfp_) ", cr0, 0 @ fmxr   " #_vfp_ ", %0" \
> +       asm(".fpu       vfpv2\n"        \
> +           "vmsr       " #_vfp_ ", %0" \
>            : : "r" (_var_) : "cc")
>
>  u32 vfp_single_cpdo(u32 inst, u32 fpscr);
> --

Hi Stefan,
Thanks for the patch.  Reading through:
- FMRX, FMXR, and FMSTAT:
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0068b/Bcfbdihi.html
- VMRS and VMSR:
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0204h/Bcfbdihi.html

Should a macro called `fmrx` that had a comment about `fmrx` be using
`vmrs` in place of `fmrx`?

It looks like Clang treats them the same, but GCC keeps them separate:
https://godbolt.org/z/YKmSAs
Ah, this is only when streaming to assembly. Looks like they have the
same encoding, and produce the same disassembly. (Godbolt emits
assembly by default, and has the option to compile, then disassemble).
If I take my case from godbolt above:

➜  /tmp arm-linux-gnueabihf-gcc -O2 -c x.c
➜  /tmp llvm-objdump -dr x.o

x.o: file format elf32-arm-little


Disassembly of section .text:

00000000 bar:
       0: f1 ee 10 0a                  vmrs r0, fpscr
       4: 70 47                        bx lr
       6: 00 bf                        nop

00000008 baz:
       8: f1 ee 10 0a                  vmrs r0, fpscr
       c: 70 47                        bx lr
       e: 00 bf                        nop

So indeed a similar encoding exists for the two different assembler
instructions.
Reviewed-by: Nick Desaulniers <ndesaulniers@...gle.com>


-- 
Thanks,
~Nick Desaulniers