Message-ID: <1962730996.73684.1385152385447.JavaMail.zimbra@efficios.com>
Date: Fri, 22 Nov 2013 20:33:05 +0000 (UTC)
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Luis Lozano <llozano@...omium.org>
Cc: Jakub Jelinek <jakub@...hat.com>, Han Shen <shenhan@...omium.org>,
Peter Zijlstra <peterz@...radead.org>,
Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will.deacon@....com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Nathan Lynch <Nathan_Lynch@...tor.com>,
lttng-dev@...ts.lttng.org,
Bhaskar Janakiraman <bjanakiraman@...omium.org>,
Alexander Holler <holler@...oftware.de>,
Andrew Morton <akpm@...ux-foundation.org>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Richard Henderson <rth@...ddle.net>
Subject: Re: [lttng-dev] current_thread_info() not respecting program order
with gcc 4.8.x
Very interesting result:
Here is the asm diff between the problematic function compiled with gcc 4.8.2
vs that same function compiled with gcc 4.8.2 with the "lightly tested patch"
in bug http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58854
--- lttng-ring-buffer-client-overwrite.ko-4.8.2.objdump 2013-11-22 13:53:39.634901143 -0600
+++ lttng-ring-buffer-client-overwrite.ko-4.8.2-fix.objdump 2013-11-22 13:56:28.717746721 -0600
@@ -363,9 +363,9 @@ Disassembly of section .text:
504: e0647008 rsb r7, r4, r8
508: e0874004 add r4, r7, r4
50c: e0650004 rsb r0, r5, r4
- 510: e24bd028 sub sp, fp, #40 ; 0x28
+ 510: e58a6000 str r6, [sl]
514: e6ef0070 uxtb r0, r0
- 518: e58a6000 str r6, [sl]
+ 518: e24bd028 sub sp, fp, #40 ; 0x28
51c: e89daff0 ldm sp, {r4, r5, r6, r7, r8, r9, sl, fp, sp, pc}
...
@@ -1938,8 +1938,8 @@ Disassembly of section .text:
1d74: ebfffffe bl 0 <warn_slowpath_null>
1d78: e51b205c ldr r2, [fp, #-92] ; 0x5c
1d7c: eafffef5 b 1958 <lttng_event_reserve+0xa5c>
- 1d80: e24bd028 sub sp, fp, #40 ; 0x28
- 1d84: e51b0048 ldr r0, [fp, #-72] ; 0x48
+ 1d80: e51b0048 ldr r0, [fp, #-72] ; 0x48
+ 1d84: e24bd028 sub sp, fp, #40 ; 0x28
1d88: e89daff0 ldm sp, {r4, r5, r6, r7, r8, r9, sl, fp, sp, pc}
...
1d98: 0000017e .word 0x0000017e
So far, Nathan has not reproduced the issue with the fixed gcc. He is running the
stress tests for a couple more hours to gain more confidence in the result.
I'm not sure about the reordered store in the first hunk (it goes through the stack
limit pointer "sl", which I'm clueless about), but the last hunk clearly closes a
one-instruction race window in which the stack is accessed below sp. Before the fix:
- 1d80: e24bd028 sub sp, fp, #40 ; 0x28
- 1d84: e51b0048 ldr r0, [fp, #-72] ; 0x48
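sp = fp - 40
load from memory location fp - 72, which is now 32 bytes below sp .... wrong !
Anything below sp is free to be overwritten, e.g. by an interrupt taken between
these two instructions, so r0 can end up loaded with garbage.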
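To make the offset arithmetic concrete, here is a minimal Python sketch of the two instruction orders. It is only a model, not anything taken from the kernel or from gcc: the function name, the tuple encoding of the instructions, and the FP and INITIAL_SP values are all assumptions chosen for illustration. It tracks sp through the epilogue and flags any load whose address lies below the current sp.

```python
# Hypothetical model of a full-descending stack: any address lower than sp
# may be overwritten at any time (e.g. by an interrupt handler's frame).

def find_below_sp_loads(instructions, fp, sp):
    """Return (index, bytes_below_sp) for each load done below the current sp."""
    hazards = []
    for index, (op, offset) in enumerate(instructions):
        if op == "set_sp":        # models: sub sp, fp, #offset
            sp = fp - offset
        elif op == "load":        # models: ldr r0, [fp, #-offset]
            addr = fp - offset
            if addr < sp:
                hazards.append((index, sp - addr))
    return hazards

FP = 0x8000            # arbitrary frame pointer value (assumption)
INITIAL_SP = FP - 96   # assumption: the frame still covers fp - 72 on entry

# Instruction order emitted by the unpatched gcc 4.8.2 (addresses 1d80, 1d84).
before_fix = [("set_sp", 40), ("load", 72)]
# Instruction order emitted with the patch from bug 58854 applied.
after_fix = [("load", 72), ("set_sp", 40)]

print(find_below_sp_loads(before_fix, FP, INITIAL_SP))  # [(1, 32)]
print(find_below_sp_loads(after_fix, FP, INITIAL_SP))   # []
```

With the unpatched ordering the load is flagged as 32 bytes below sp; with the patched ordering the load happens while the frame is still allocated, so nothing is flagged.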
The full objdumps (before and after gcc fix) are attached.
Thoughts?
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
Download attachment "gcc bz 58854 fix objdumps.zip" of type "application/zip" (35107 bytes)