Message-ID: <alpine.DEB.2.00.1711151334200.3893@tp.orcam.me.uk>
Date: Wed, 15 Nov 2017 13:48:06 +0000
From: "Maciej W. Rozycki" <macro@...s.com>
To: Matt Redfearn <matt.redfearn@...s.com>
CC: James Hogan <james.hogan@...s.com>,
Corey Minyard <cminyard@...sta.com>,
Ralf Baechle <ralf@...ux-mips.org>,
Matthew Fortune <matthew.fortune@...s.com>,
<linux-mips@...ux-mips.org>, <linux-kernel@...r.kernel.org>,
"Jason A. Donenfeld" <jason@...c4.com>,
"Paul Burton" <paul.burton@...s.com>
Subject: Re: [PATCH] MIPS: Fix exception entry when CONFIG_EVA enabled
On Wed, 15 Nov 2017, Matt Redfearn wrote:
> I like the change you propose, however I can't coax GAS to reorder the
> instructions appropriately. With this patch on top of 4.14:
>
> --- a/arch/mips/include/asm/stackframe.h
> +++ b/arch/mips/include/asm/stackframe.h
> @@ -195,14 +195,16 @@
> .set push
> .set noat
> .set reorder
> - mfc0 k0, CP0_STATUS
> - sll k0, 3 /* extract cu0 bit */
> - .set noreorder
> - bltz k0, 8f
> - move k0, sp
> + mfc0 k1, CP0_STATUS
> + sll k1, 3 /* extract cu0 bit */
> +
> + move k0, sp
> .if \docfi
> .cfi_register sp, k0
> .endif
> +
> + bltz k1, 8f
> +
> #ifdef CONFIG_EVA
> /*
> * Flush interAptiv's Return Prediction Stack (RPS) by writing
> @@ -228,7 +230,6 @@
> MFC0 k0, CP0_ENTRYHI
> MTC0 k0, CP0_ENTRYHI
> #endif
> - .set reorder
> /* Called from user mode, new stack. */
> get_saved_sp docfi=\docfi tosp=1
> 8:
>
>
> The generated assembly is:
>
> 80405d00 <handle_int>:
> 80405d00: 401b6000 mfc0 k1,c0_status
> 80405d04: 001bd8c0 sll k1,k1,0x3
> 80405d08: 03a0d025 move k0,sp
> 80405d0c: 07600007 bltz k1,80405d2c <handle_int+0x2c>
> 80405d10: 00000000 nop
> 80405d14: 401a2000 mfc0 k0,c0_context
>
> Apparently GAS has not been able to reorder the move into the branch delay
> slot for some reason. Any ideas?
It could be the `.cfi_register' pseudo-op acting as a scheduling barrier.
I think it can be moved further down, beyond the branch: until $sp is
clobbered later on it still holds the original value, so using either
register for frame access, or the value itself, will yield the same result.
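 Roughly what I have in mind, as an untested sketch against your patch
(a fragment of the macro body, not a complete file). Only the placement
of `.cfi_register' changes; whether GAS then moves the `move' into the
delay slot is exactly the assumption that still needs checking:

	mfc0	k1, CP0_STATUS
	sll	k1, 3		/* extract cu0 bit */
	move	k0, sp		/* candidate for the branch delay slot */
	bltz	k1, 8f
	.if \docfi
	.cfi_register sp, k0	/* moved below the branch */
	.endif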
 Can you send me the .i output from the offending source along with the
GCC options used to produce the .o output (use `V=1' with `make' if
needed)? I'll check whether my hypothesis is right, or find the actual
cause otherwise.
Maciej