Message-ID: <alpine.DEB.2.21.1809151726290.1650@nanos.tec.linutronix.de>
Date:   Sat, 15 Sep 2018 17:34:24 +0200 (CEST)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Rich Felker <dalias@...c.org>
cc:     LKML <linux-kernel@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...nel.org>
Subject: Re: futex_cmpxchg_enabled breakage

On Thu, 30 Aug 2018, Rich Felker wrote:
> On Thu, Aug 30, 2018 at 11:19:58AM +0200, Thomas Gleixner wrote:
> > On Wed, 29 Aug 2018, Rich Felker wrote:
> > 
> > > I just spent a number of hours helping someone track down a bug that
> > > looks like it's some kind of futex_cmpxchg_enabled detection error on
> > > powerpc64 (still not sure of the root cause; set_robust_list producing
> > > -ENOSYS), and a while back I hit the same problem on sh2 due to lack
> > > of EFAULT on nommu, leading to commit 72cc564f16ca. I think the test
> > > (introduced way back in commit a0c1e9073ef7) is fundamentally buggy;
> > > if anything, it should be checking for !=-ENOSYS, not ==-EFAULT.
> > 
> > Errm? This does a futex_cmpxchg() on NULL which has to return EFAULT if
> > it's available. There is nothing fundamentally buggy about it at all.
> 
> I'll let you know if/when we finish figuring out how this
> happened on powerpc64, but it's an arch that most certainly has a
> working cmpxchg where these syscalls ended up returning -ENOSYS. This
> is an error condition that should not be able to happen. At the very
> very least all the archs that actually have a working cmpxchg
> unconditionally should be updated with:
> 
> 	select HAVE_FUTEX_CMPXCHG if FUTEX
> 
> so that whatever caused this for powerpc64 doesn't happen again.

By doing that you paper over a non-functional fixup which could cause other
really hard to decode runtime failures. If that fails there is a bug
somewhere else and that runtime check is not to blame at all for it.

> > > Presumably it could also fail to produce -EFAULT if mmap_min_addr is 0
> > > and page 0 is mapped (a bad idea, but maybe someone does it...). And
> > > of course other nommu archs are possibly still broken.
> > 
> > If NULL is mapped in the kernel then a lot of other things are broken. The
> > futex thing is then the least of your worries.
> 
> For nommu NULL is always "mapped".

Cool, so you can have a NULL pointer dereference without noticing it.
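
To make the disagreement concrete, here is a minimal userspace model of the
boot-time check. It is only a sketch: the enum, the function names and the
three environments are hypothetical stand-ins, not kernel code. It compares
the existing policy (enable only on -EFAULT) with the proposed one (enable
unless -ENOSYS) for the return values discussed in this thread.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical environments in which the NULL-pointer probe might run. */
enum probe_env {
	ENV_MMU_WORKING,	/* NULL is unmapped, arch cmpxchg faults cleanly */
	ENV_NOMMU_NULL_READABLE,/* NULL is accessible, so the probe "succeeds" */
	ENV_NO_CMPXCHG		/* generic SMP fallback without a real cmpxchg */
};

/* Model of the value a futex cmpxchg on a NULL user pointer would return. */
static int probe_cmpxchg_on_null(enum probe_env env)
{
	switch (env) {
	case ENV_MMU_WORKING:
		return -EFAULT;
	case ENV_NOMMU_NULL_READABLE:
		return 0;
	case ENV_NO_CMPXCHG:
		return -ENOSYS;
	}
	return -ENOSYS;
}

/* Existing policy: only a clean fault proves the operation works. */
static int enabled_if_efault(int ret)
{
	return ret == -EFAULT;
}

/* Proposed policy: anything other than "not implemented" counts. */
static int enabled_unless_enosys(int ret)
{
	return ret != -ENOSYS;
}

/* Print what each policy decides for one environment. */
static void describe_policies(const char *name, enum probe_env env)
{
	int ret = probe_cmpxchg_on_null(env);

	printf("%-8s ret=%d  ==-EFAULT:%d  !=-ENOSYS:%d\n",
	       name, ret, enabled_if_efault(ret), enabled_unless_enosys(ret));
}
```

Calling describe_policies() once per environment shows that the two policies
differ only in the nommu-style case, where the probe returns 0. The model
says nothing about a broken exception fixup, which is the failure the
-EFAULT check is meant to expose.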

> > The availability of the interfaces which depend on futex_cmpxchg_enabled
> > has been runtime detectable forever and it's documented that way. I have no
> > idea why you think it's non-optional. If you made it unconditional in your
> > lib, then it's hardly the kernel's problem.
> 
> Modern glibc also defines:
> 
> #define __ASSUME_SET_ROBUST_LIST        1
> 
> It's then #undef'd for some archs (mips, m68k, sparc, arm), but not
> all archs that lack HAVE_FUTEX_CMPXCHG, so glibc seems to be assuming
> set_robust_list actually works on some where the kernel is treating
> failure as a possibility.

That's hardly a kernel issue, right?
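
For reference, a library can detect this at runtime instead of assuming it.
The sketch below is an assumption-laden illustration, not what glibc or musl
actually do: robust_list_available() is a hypothetical helper, and it relies
on get_robust_list being gated by the same check as set_robust_list, as it
is in kernels of this era. It uses get_robust_list so that the calling
thread's registered list is left untouched.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/futex.h>

/*
 * Hypothetical helper: probe whether the robust list syscalls are usable.
 * Returns 1 if the query succeeds, 0 if the kernel reports ENOSYS,
 * and -1 for any other error (for example a seccomp filter).
 */
static int robust_list_available(void)
{
	struct robust_list_head *head = NULL;
	size_t len = 0;

	/* pid 0 means "the calling thread"; nothing is modified. */
	if (syscall(SYS_get_robust_list, 0, &head, &len) == 0)
		return 1;

	return errno == ENOSYS ? 0 : -1;
}
```

A caller would run this once at startup and fall back to non-robust mutexes
when the result is 0, rather than relying on a compile-time define.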

> > > If there are no archs that support SMP but don't provide their own
> > > asm/futex.h (as opposed to the asm-generic one that does -ENOSYS on
> > > SMP), the detection code should just be removed, and the SMP case in
> > > asm-generic/futex.h should be made into #error.
> > 
> > And why so? Just because?
> 
> To prevent inadvertent introduction of this issue on new archs (if the
> porters don't realize asm-generic/futex.h doesn't actually work for
> SMP).
> 
> > > If there are archs that support SMP but don't provide their own
> > > working asm/futex.h, then asm-generic/futex.h's SMP case should be
> > > enhanced to perform a stop-the-world IPI and then do the same thing as
> > > the non-SMP case (disable preemption[/interrupts?], perform the
> > > cmpxchg non-atomically).
> > > 
> > > Thoughts? Would a patch to do this be acceptable?
> > 
> > No. There is nothing at all in the world which requires that PI futexes and
> > robust list are provided and even if you implement that hack in the kernel
> > then the user space side still does not work because the user space part of
> > those interfaces has a hard dependency on working cmpxchg as well.
> 
> Of course there's a working cmpxchg; this is a hard requirement for
> userspace. You cannot implement pthread primitives without one. On
> archs/ISA-levels that lack an insn it's already provided as a syscall,
> a vdso/khelper, or trap-and-emulate. The problem I'm complaining about
> here is that, despite already needing and having working ways to
> achieve this, the kernel is not using them, and breaking functionality
> that should work.

I kinda agree for the nommu case, but for powerpc64 you are barking up the
wrong tree. If that check fails, something very fundamental is broken and
that breakage is _NOT_ in the futex code. This has to work independent of
the futex code, really.

Thanks,

	tglx

