linux-kernel - Re: [PATCH] MIPS: UAPI: Fix unrecognized opcode WSBH/DSBH/DSHD when using MIPS16.

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Sat, 5 Sep 2015 15:25:58 +0100 (BST)
From:	"Maciej W. Rozycki" <macro@...ux-mips.org>
To:	Yousong Zhou <yszhou4tech@...il.com>
cc:	Ralf Baechle <ralf@...ux-mips.org>, Chen Jie <chenj@...ote.com>,
	linux-mips@...ux-mips.org, linux-kernel@...r.kernel.org,
	Florian Fainelli <florian@...nwrt.org>
Subject: Re: [PATCH] MIPS: UAPI: Fix unrecognized opcode WSBH/DSBH/DSHD when
 using MIPS16.

On Sat, 5 Sep 2015, Yousong Zhou wrote:

> >  This can't be true.  The compiler does not intepret the contents of an
> > inline asm and therefore cannot decide whether to inline a function
> > containing one or not based on the lone presence or the absence of an
> > assembly directive within.
> >
> 
> Most of the time I trust my compiler and never meddle with the
> toolchain.  Anyway I made a patch because it really did not work for
> me.   No big deal.  It's not the end of world.  It started with a
> comment from OpenWrt packages feeds [1].  Actually this "unrecognized
> opcode" problem have occurred within OpenWrt quite a few times before.
> 
>  [1] https://github.com/openwrt/packages/commit/1e29676a8ac74f797f8ca799364681cec575ae6f#commitcomment-12901931

 The bug certainly was there, it's just your analysis and consequently the 
fix that are wrong in the general case for some reason, maybe a buggy 
compiler.

> >  It looks to me you're papering something over here -- when you use a
> > `.set nomips16' directive the assembler will happily switch your
> > instruction set in the middle of an instruction stream.  Consequently if
> > this function is (for whatever reason; it should not really) inlined in
> > MIPS16 code, then you'll get a MIPS32 instruction in the middle, which
> > will obviously be interpreted differently in the MIPS16 execution mode and
> > is therefore surely a recipe for disaster.
> 
> If by "papering" you mean "made up", then whatever.  Yeah, it's
> disaster, an "invalid instruction" abort.

 By "papering over" I mean forcing source code to compile successfully at 
the risk of producing incorrect binary code.

> >  How did you test your change and made the conclusion quoted with your
> > patch description?
> >
> 
> Compile the following program with a MIPS 24kc big endian variant compiler with
> flag "-mips32r2 -mips16 -Os".
> 
>     #include <stdio.h>
>     #include <stdint.h>
> 
>     uint16_t __attribute__((noinline)) f(uint16_t v)
>     {
>         v = __cpu_to_le16(v);
>         return v;
>     }
> 
>     int main()
>     {
>         printf("%x\n", f(0xbeef));
> 
>         return 0;
>     }
> 
> When only ".set nomips16" was specified in __arch_swab16(), this was output
> from objdump.
> 
>     242 004003e0 <f>:
>     243   4003e0:       7c0410a0        wsbh    v0,a0
>     244   4003e4:       e820ea31        swc2    $0,-5583(at)
>     245   4003e8:       65006500        0x65006500
>     246   4003ec:       65006500        0x65006500

 Quite obviously.

 For the record: the first instruction has been assembled in the regular 
MIPS mode and that propagated to symbol annotation.  Consequently `f' is 
seen by `objdump' as a regular MIPS function and disassembles it all as 
such.  You can put a global label immediately after the WSBH instruction 
in your source code to have the rest of the function disassembled 
correctly (of course this won't make this code work at the run time).

> __arch_swab16() was indeed inlined.  That swc2 instruction can be got from
> assembler with the following code (it's from the "-S" result of GCC).
> 
>     .set    mips16
>     .set    noreorder
>     .set    nomacro
>     j       $31
>     zeh     $2
> 
> When only nomips16 function attribute was specified, this time GCC failed with
> unrecognized opcode
> 
>     /tmp/ccaGCouL.s: Assembler messages:
>     /tmp/ccaGCouL.s:21: Error: unrecognized opcode `wsbh $2,$4'
> 
> The generated assembly was something in the following form.  Looks like the
> assembler did not automatically switch to MIPS32 mode when ".set arch=mip32r2"

 There's no switch to regular MIPS mode implied with `.set arch=mip32r2', 
the directive merely switches the ISA level, affecting both the regular 
MIPS and the MIPS16 mode (the MIPS32r2 ISA level adds extra MIPS16 
instructions too, e.g. ZEH is a new addition).

>         .set mips16
> 
>         .ent f
>         .type f, @function
>     f
>         ...
>         .set push
>         .set arch=mips32r2
>         wsbh $2,$4
>         .pop
>         j $31
>         zeh $2
>         .end f
>         ...

 That's exactly the papered-over buggy code scenario I've been referring 
to above.  This is clearly MIPS16 code: ZEH is a MIPS16 instruction only, 
there's no such regular MIPS counterpart.  And it also obviously fails to 
assemble because on the contrary there's no MIPS16 WSBH instruction.

 Now if you stick `.set nomips16' just above WSBH, then this code will 
happily assemble, because this single instruction only (`.set pop' reverts 
any previous `.set' directives; I'm assuming you wrote above by hand and 
`.pop' is a typo) will assemble in the regular MIPS instruction mode.  But 
if this code is ever reached, then the processor will still execute the 
machine code produced by the assembler from the WSBH instruction in the 
MIPS16 mode.

 For example the encoding of:

	wsbh	$2, $4

is (as you've shown in a dump above) 0x7c0410a0, which in the MIPS16 mode 
yields (in the big-endian mode):

00000000 <f>:
   0:	7c04      	sd	s0,32(a0)
   2:	10a0      	b	144 <f+0x144>

-- so this won't do what you might expect, you'll get a Reserved 
Instruction exception on the SD instruction, which is not supported in the 
32-bit mode, and consequently SIGILL.

> The patch was run tested on QEMU Malta and an router with Atheros
> AR9331 SoC.  I didn't test __arch_swab64() though.  I have done many
> other trial-and-error tests while preparing this patch.  It was a mess
> when I was sure I should expect some sensible behaviour from the
> compiler while it actually just did not behave that way.

 I've compiled your example provided and as I stated in the original mail 
`__arch_swab16' is always produced as a separate function, whether `.set 
nomips16' is present in the inline assembly placed there or not.  This is 
with (unreleased) GCC 6.0.0.

 However if you happen to have a buggy compiler that fails to emit 
`__arch_swab16' as a separate function despite the `nomips16' attribute, 
then it's better if the resulting generated assembly code fails to 
assemble rather than if it goes astray at the run time.

> >  So setting aside the correctness issues discussed above, for MIPS16 code
> > this has to be put out of line by the compiler, with all the usual
> > overhead of a function call, causing a performance loss rather than a gain
> > expected here.  Especially as switching the ISA mode requires draining the
> > whole pipeline so that subsequent instructions are interpreted in the new
> > execution mode.  This is expensive in performance terms.
> >
> >  I'm fairly sure simply disabling these asms (#ifndef __mips16) and
> > letting the compiler generate the right mask and shift operations from
> > plain C code will be cheaper in performance terms and possibly cheaper in
> > code size as well, especially in otherwise leaf functions where an extra
> > function call will according to the ABI clobber temporaries.  So I suggest
> > going in that direction instead.
> 
> I agree.  Then you will provide the fix right?  I am just curious
> where that __mips16 should be placed or is it from compiler and
> assembler?

 No, it's your bug after all.  I think the last paragraph I wrote quoted 
above combined with the source code in question make it clear what to do.

 I have also checked what the difference in generated MIPS16 code is 
between a call to `__arch_swab16':

0000000c <f>:
   c:	64c4      	save	32,ra
   e:	1800 0000 	jal	0 <__arch_swab16>
			e: R_MIPS16_26	__arch_swab16
  12:	ec31      	zeh	a0
  14:	6444      	restore	32,ra
  16:	e8a0      	jrc	ra

and equivalent generic code:

0000000c <f>:
   c:	ec31      	zeh	a0
   e:	3280      	sll	v0,a0,8
  10:	3482      	srl	a0,8
  12:	ea8d      	or	v0,a0
  14:	e820      	jr	ra
  16:	ea31      	zeh	v0

so the win is I think clear.

 Finally the MIPS64 `__arch_swab64' case does not matter as we have no 
MIPS16 support for 64-bit code in Linux, the toolchain will simply refuse 
to build it (only bare metal is supported).

  Maciej
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/