Message-ID: <CACT4Y+Yhy-jucOC37um5xZewEj0sdw8Hjte7oOYxDdxkzOTYoA@mail.gmail.com>
Date: Wed, 28 Jun 2017 12:16:26 +0200
From: Dmitry Vyukov <dvyukov@...gle.com>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: Ingo Molnar <mingo@...hat.com>,
Mark Rutland <mark.rutland@....com>,
Peter Zijlstra <peterz@...radead.org>,
Will Deacon <will.deacon@....com>,
"H. Peter Anvin" <hpa@...or.com>,
Andrey Ryabinin <aryabinin@...tuozzo.com>,
kasan-dev <kasan-dev@...glegroups.com>,
"x86@...nel.org" <x86@...nel.org>,
LKML <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH] locking/atomics: don't alias ____ptr
On Wed, Jun 28, 2017 at 12:02 PM, Sebastian Andrzej Siewior
<bigeasy@...utronix.de> wrote:
> Trying to boot tip/master resulted in:
> |DMAR: dmar0: Using Queued invalidation
> |DMAR: dmar1: Using Queued invalidation
> |DMAR: Setting RMRR:
> |DMAR: Setting identity map for device 0000:00:1a.0 [0xbdcf9000 - 0xbdd1dfff]
> |BUG: unable to handle kernel NULL pointer dereference at (null)
> |IP: __domain_mapping+0x10f/0x3d0
> |PGD 0
> |P4D 0
> |
> |Oops: 0002 [#1] PREEMPT SMP
> |Modules linked in:
> |CPU: 19 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc6-00117-g235a93822a21 #113
> |task: ffff8805271c2c80 task.stack: ffffc90000058000
> |RIP: 0010:__domain_mapping+0x10f/0x3d0
> |RSP: 0000:ffffc9000005bca0 EFLAGS: 00010246
> |RAX: 0000000000000000 RBX: 00000000bdcf9003 RCX: 0000000000000000
> |RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001
> |RBP: ffffc9000005bd00 R08: ffff880a243e9780 R09: ffff8805259e67c8
> |R10: 00000000000bdcf9 R11: 0000000000000000 R12: 0000000000000025
> |R13: 0000000000000025 R14: 0000000000000000 R15: 00000000000bdcf9
> |FS: 0000000000000000(0000) GS:ffff88052acc0000(0000) knlGS:0000000000000000
> |CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> |CR2: 0000000000000000 CR3: 0000000001c0f000 CR4: 00000000000406e0
> |Call Trace:
> | iommu_domain_identity_map+0x5a/0x80
> | domain_prepare_identity_map+0x9f/0x160
> | iommu_prepare_identity_map+0x7e/0x9b
>
> bisect points to commit 235a93822a21 ("locking/atomics, asm-generic: Add KASAN
> instrumentation to atomic operations"), RIP is at
> tmp = cmpxchg64_local(&pte->val, 0ULL, pteval);
> in drivers/iommu/intel-iommu.c. The assembly generated for this call
> is:
> xor %edx,%edx
> xor %eax,%eax
> cmpxchg %rbx,(%rdx)
>
> and as you see edx is set to zero and used later as a pointer via the
> full register. This happens with gcc-6, 5 and 8 (snapshot from last
> week).
> After a longer while of searching and swearing I figured out that this
> bug occurs once cmpxchg64_local() and cmpxchg_local() use the same
> ____ptr variable name and they shadow each other somehow. What I don't
> know is why edx is set to zero.
>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
> ---
> include/asm-generic/atomic-instrumented.h | 12 ++++++------
> 1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/include/asm-generic/atomic-instrumented.h b/include/asm-generic/atomic-instrumented.h
> index a0f5b7525bb2..ac6155362b39 100644
> --- a/include/asm-generic/atomic-instrumented.h
> +++ b/include/asm-generic/atomic-instrumented.h
> @@ -359,16 +359,16 @@ static __always_inline bool atomic64_add_negative(s64 i, atomic64_t *v)
>
> #define cmpxchg64(ptr, old, new) \
> ({ \
> - __typeof__(ptr) ____ptr = (ptr); \
> - kasan_check_write(____ptr, sizeof(*____ptr)); \
> - arch_cmpxchg64(____ptr, (old), (new)); \
> + __typeof__(ptr) ____ptr64 = (ptr); \
> + kasan_check_write(____ptr64, sizeof(*____ptr64));\
> + arch_cmpxchg64(____ptr64, (old), (new)); \
> })
>
> #define cmpxchg64_local(ptr, old, new) \
> ({ \
> - __typeof__(ptr) ____ptr = (ptr); \
> - kasan_check_write(____ptr, sizeof(*____ptr)); \
> - arch_cmpxchg64_local(____ptr, (old), (new)); \
> + __typeof__(ptr) ____ptr64 = (ptr); \
> + kasan_check_write(____ptr64, sizeof(*____ptr64));\
> + arch_cmpxchg64_local(____ptr64, (old), (new)); \
> })
>
> #define cmpxchg_double(p1, p2, o1, o2, n1, n2) \
Doh! Thanks for fixing this. I think I hit a similar crash in a
different place when I was developing the patch.
The problem is that when we do:
__typeof__(ptr) ____ptr = (ptr); \
arch_cmpxchg64_local(____ptr, (old), (new)); \
we don't necessarily pass the value of our just-declared ____ptr to
arch_cmpxchg64_local(). We just pass a symbolic identifier. So if
arch_cmpxchg64_local() declares its own ____ptr and then tries to use
what we passed ("____ptr"), it will actually refer to its own freshly
declared variable rather than to what we wanted to pass in.
In my case I ended up with something like:
__typeof__(foo) __ptr = __ptr;
which the compiler decided to turn into 0.
Thank you, macros.
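
To make the capture concrete, here is a simplified analogue with
well-defined behaviour (the real case reads a variable inside its own
initializer, which is not something worth executing). The names
inner_double, outer_double and shadowed_result are made up for
illustration and are not kernel code. The sketch assumes GCC or Clang,
since it relies on GNU statement expressions just like the kernel macros.

```c
/*
 * Hypothetical macros, not taken from the kernel.
 * Both happen to pick the same local name "tmp".
 */
#define inner_double(x) ({ int tmp = 2; tmp * (x); })
#define outer_double(x) ({ int tmp = (x); inner_double(tmp); })

int shadowed_result(void)
{
	/*
	 * The preprocessor passes the token "tmp", not its value, so this
	 * expands to:
	 *   ({ int tmp = (5); ({ int tmp = 2; tmp * (tmp); }); })
	 * Inside the inner block "tmp" names the inner variable (2),
	 * so the result is 2 * 2 rather than the intended 2 * 5.
	 */
	return outer_double(5);
}
```

Building with -Wshadow should flag the inner declaration, which may be
a cheap way to catch this class of bug.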
We can add more underscores, but the problem can happen again. Should
we prefix the current function/macro name to all local vars?
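
A rough sketch of what that prefixing convention could look like, on two
hypothetical nested macros (inner_double and outer_double are invented
for illustration, as is prefixed_result; none of this is kernel code).
It assumes GCC or Clang for the statement expressions.

```c
/*
 * Hypothetical macros: every local carries the name of the macro that
 * declares it, so nested expansions cannot capture each other's
 * variables.
 */
#define inner_double(x) \
	({ int inner_double_tmp = 2; inner_double_tmp * (x); })
#define outer_double(x) \
	({ int outer_double_tmp = (x); inner_double(outer_double_tmp); })

int prefixed_result(void)
{
	/*
	 * Expands to:
	 *   ({ int outer_double_tmp = (5);
	 *      ({ int inner_double_tmp = 2;
	 *         inner_double_tmp * (outer_double_tmp); }); })
	 * The argument token now refers to the outer variable as intended.
	 */
	return outer_double(5);
}
```

This only protects against collisions between differently named macros.
A caller that happens to use one of the prefixed names as its argument
would still be captured.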