Message-ID: <20160315120147.GA9742@gmail.com>
Date: Tue, 15 Mar 2016 13:01:47 +0100
From: Ingo Molnar <mingo@...nel.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Frédéric Weisbecker <fweisbec@...il.com>,
Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH] atomic: Fix bugs in 'fetch_or()' and rename it to
'xchg_or()'
* Ingo Molnar <mingo@...nel.org> wrote:
> Subject: [PATCH] atomic: Fix bugs in 'fetch_or()' and rename it to 'xchg_or()'
>
> Linus noticed a couple of problems with the fetch_or() implementation introduced
> by 5fd7a09cfb8c ("atomic: Export fetch_or()"):
>
> - Sloppy macro implementation: 'mask' and 'ptr' are evaluated multiple times,
>   which will break if the arguments have side effects.
So this shiny new patch manages to crash the x86 kernel with a NULL pointer
dereference:
[ 0.143027] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 0.144000] IP: [<ffffffff8107c64c>] resched_curr+0x3c/0xc0
GCC manages to turn this:
static bool set_nr_and_not_polling(struct task_struct *p)
{
	struct thread_info *ti = task_thread_info(p);
	return !(xchg_or(&ti->flags, _TIF_NEED_RESCHED) & _TIF_POLLING_NRFLAG);
}
and this:
> /**
> + * xchg_or - perform *ptr |= mask atomically and return old value of *ptr
> + * @ptr: pointer to value (cmpxchg() compatible integer pointer type)
> * @mask: mask to OR on the value
> *
> + * cmpxchg() based, it's a macro so it works for different integer types.
> */
> +#ifndef xchg_or
> +# define xchg_or(ptr, mask) \
> +({ \
> + typeof(ptr) __ptr = (ptr); \
> + typeof(mask) __mask = (mask); \
> + \
> + typeof(*(__ptr)) __old, __val = *__ptr; \
> + \
> for (;;) { \
> + __old = cmpxchg(__ptr, __val, __val | __mask); \
> if (__old == __val) \
> break; \
> __val = __old; \
> } \
> + \
> __old; \
> })
into:
41c1: 89 c2 mov %eax,%edx
41c3: 89 d6 mov %edx,%esi
41c5: 31 c9 xor %ecx,%ecx
41c7: 89 d0 mov %edx,%eax
41c9: 83 ce 08 or $0x8,%esi
41cc: f0 0f b1 31 lock cmpxchg %esi,(%rcx)
Note the RCX zeroing via XOR: the LOCK CMPXCHG that follows then operates on
address 0, hence the NULL pointer dereference.
The original, working sequence is:
41c4: 89 c2 mov %eax,%edx
41c6: 89 d6 mov %edx,%esi
41c8: 89 d0 mov %edx,%eax
41ca: 83 ce 08 or $0x8,%esi
41cd: f0 0f b1 31 lock cmpxchg %esi,(%rcx)
The change that makes the difference is the 'ptr' part of:
> + __old = cmpxchg(__ptr, __val, __val | __mask); \
This variant works:
> + __old = cmpxchg((ptr), __val, __val | __mask); \
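After a lot of staring PeterZ realized that our __ptr aliases with the __ptr
local inside the x86 cmpxchg() macro-jungle: once everything is expanded, the
inner __ptr ends up initialized from itself instead of from our pointer!!
So if I do a s/__ptr/_ptr/ it all works...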
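To make the collision concrete, here is a minimal user-space sketch of it and of the rename workaround. It is not the kernel code: inner_cas(), demo_xchg_or() and demo_set_flag() are made-up names, inner_cas() only stands in for the arch cmpxchg() macro layer, and GCC's __atomic_compare_exchange_n() builtin is assumed as the underlying primitive.

```c
/*
 * Stand-in for the arch-level cmpxchg() macro layer (hypothetical name).
 * Like the real thing, it declares a local called __ptr.
 * Returns the value that was in *ptr before the operation.
 */
#define inner_cas(ptr, old, newval)                                     \
({                                                                      \
	__typeof__(ptr) __ptr = (ptr);                                  \
	__typeof__(*__ptr) __o = (old);                                 \
	__atomic_compare_exchange_n(__ptr, &__o, (newval), 0,           \
				    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST); \
	__o;                                                            \
})

/*
 * Broken variant (deliberately not compiled here): if the wrapper below
 * also called its local __ptr, then inner_cas(__ptr, ...) would expand to
 *
 *	__typeof__(__ptr) __ptr = (__ptr);
 *
 * and the initializer would name the new, still uninitialized inner
 * variable instead of the wrapper's pointer.
 */

/* Workaround from this mail: the wrapper's locals get different names. */
#define demo_xchg_or(ptr, mask)                                         \
({                                                                      \
	__typeof__(ptr) _ptr = (ptr);                                   \
	__typeof__(mask) _mask = (mask);                                \
	__typeof__(*_ptr) _old, _val = *_ptr;                           \
									\
	for (;;) {                                                      \
		_old = inner_cas(_ptr, _val, _val | _mask);             \
		if (_old == _val)                                       \
			break;                                          \
		_val = _old;                                            \
	}                                                               \
	_old;                                                           \
})

/* Hypothetical caller: sets a bit and returns the previous flags word. */
static unsigned int demo_set_flag(unsigned int *flags, unsigned int bit)
{
	return demo_xchg_or(flags, bit);
}
```

The rename only moves the problem: any caller that happens to pass an expression containing _ptr, or any deeper macro that picks the same name, runs into the same trap.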
But IMHO this really highlights a fundamental weakness of all this macro magic:
it's all way too fragile.
Why don't we introduce a boring family of APIs:
	cmpxchg_8()
	cmpxchg_16()
	cmpxchg_32()
	cmpxchg_64()
	xchg_or_32()
	xchg_or_64()
	...
... with none of this pesky auto-typing property and none of the
macro-inside-a-macro crap? We could do clean types and would write them all in
proper C, not fragile CPP.
It's not like we migrate between the types all that frequently - and even if we
do, it's trivial.
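As a rough illustration of what one member of such a family could look like, here is a user-space sketch written as plain typed C functions. The demo_ prefix marks these as hypothetical names, not existing kernel APIs, and GCC's __atomic_compare_exchange_n() builtin is assumed in place of the arch-specific cmpxchg implementation.

```c
#include <stdint.h>

/*
 * Hypothetical fixed-width compare-and-exchange: stores newval in *ptr if
 * *ptr == old, and returns the value that was in *ptr beforehand.
 */
static inline uint32_t demo_cmpxchg_32(uint32_t *ptr, uint32_t old,
				       uint32_t newval)
{
	/* On failure the builtin writes the current value back into 'old'. */
	__atomic_compare_exchange_n(ptr, &old, newval, 0,
				    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
	return old;
}

/*
 * Hypothetical fixed-width xchg_or: performs *ptr |= mask atomically and
 * returns the old value of *ptr.
 */
static inline uint32_t demo_xchg_or_32(uint32_t *ptr, uint32_t mask)
{
	uint32_t old, val = *ptr;

	for (;;) {
		old = demo_cmpxchg_32(ptr, val, val | mask);
		if (old == val)
			break;
		val = old;
	}
	return old;
}
```

Because 'ptr', 'old' and 'val' are ordinary function parameters and locals, each function has its own scope and arguments are evaluated exactly once, so neither the name aliasing nor the multiple-evaluation problem can occur.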
hm?
Thanks,
Ingo