Message-ID: <20180916131637.GA17995@brightrain.aerifal.cx>
Date:   Sun, 16 Sep 2018 09:16:37 -0400
From:   Rich Felker <dalias@...c.org>
To:     Florian Weimer <fw@...eb.enyo.de>
Cc:     Thomas Gleixner <tglx@...utronix.de>, linux-kernel@...r.kernel.org,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...nel.org>, linux-man@...r.kernel.org,
        "Michael Kerrisk (man-pages)" <mtk.manpages@...il.com>
Subject: Re: futex_cmpxchg_enabled breakage

On Sun, Sep 16, 2018 at 02:16:25PM +0200, Florian Weimer wrote:
> * Rich Felker:
> 
> > I just spent a number of hours helping someone track down a bug that
> > looks like it's some kind of futex_cmpxchg_enabled detection error on
> > powerpc64 (still not sure of the root cause; set_robust_list producing
> > -ENOSYS), and a while back I hit the same problem on sh2 due to lack
> > of EFAULT on nommu, leading to commit 72cc564f16ca. I think the test
> > (introduced way back in commit a0c1e9073ef7) is fundamentally buggy;
> > if anything, it should be checking for !=-ENOSYS, not ==-EFAULT.
> > Presumably it could also fail to produce -EFAULT if mmap_min_addr is 0
> > and page 0 is mapped (a bad idea, but maybe someone does it...). And
> > of course other nommu archs are possibly still broken.
> 
> Maybe it was related to this (“Kernel 4.15 lost set_robust_list
> support on POWER 9”):
> 
> <https://lists.ozlabs.org/pipermail/linuxppc-dev/2018-February/168570.html>

Thanks! This is really helpful; I'll pass it along.

> The Kconfig change you suggest was explicitly rejected as the fix.

Even if it was rejected as a fix for this specific bug, I think it
would be a good change for other reasons. If nothing else it reduces
kernel code size and eliminates a few instructions in some (albeit
long) hot paths.

> I believe the expected userspace interface is that you probe support
> with set_robust_list first, and then start using the relevant futex
> interfaces only if that call succeeded.

In order for it to work, set_robust_list needs to succeed for all
threads, present and future, so there's an implicit contract needed
here that, if it succeeds once, it needs to always succeed. This is
satisfied by the kernel implementation. FWIW I adopted a fix using
get_robust_list to probe just to avoid coupling the internals of
setting up the list with the bland function
pthread_mutexattr_setrobust which is unaware of any implementation
details except the location of the attribute bit.
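
A minimal sketch of such a probe, assuming Linux and the raw syscall
interface. The helper name robust_list_supported is hypothetical, and
this is an illustration of the idea, not the code that was committed:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the kernel appears to implement
 * robust lists, 0 if the syscall reports ENOSYS. Querying with
 * get_robust_list does not modify the calling thread's list. */
int robust_list_supported(void)
{
    void *head;   /* receives the current list head pointer, unused here */
    size_t len;   /* receives the size of the list head, unused here */

    /* pid 0 refers to the calling thread. */
    if (syscall(SYS_get_robust_list, 0, &head, &len) == 0)
        return 1;

    /* Only ENOSYS is treated as "not supported"; any other error
     * still means the syscall itself exists. */
    return errno != ENOSYS;
}
```

A function such as pthread_mutexattr_setrobust could call a helper
like this and return an error when it yields 0, without touching the
code that registers the list with set_robust_list.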

Presumably a similar probing should happen in
pthread_mutexattr_setprotocol for PI mutex support. Does glibc do
this? musl still lacks PI mutex support so I'll save this as a note
for when it's added.

> If you do that, most parts of
> a typical system will work as expected even if the kernel support is
> not there, which is a bit surprising.  It definitely makes the root
> cause harder to spot.

I don't follow here. "most parts of a typical system will work as
expected" seems to be the case whether you do or don't correctly
probe. The only difference is whether a program that carefully checks
for errors will see and report that pthread_mutexattr_setrobust
failed.

Rich
