lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87o8xm65ar.fsf@oldenburg2.str.redhat.com>
Date:   Fri, 08 Nov 2019 11:17:16 +0100
From:   Florian Weimer <fweimer@...hat.com>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...nel.org>,
        Darren Hart <darren@...art.com>,
        Yi Wang <wang.yi59@....com.cn>,
        Yang Tao <yang.tao172@....com.cn>,
        Oleg Nesterov <oleg@...hat.com>,
        Carlos O'Donell <carlos@...hat.com>,
        Alexander Viro <viro@...iv.linux.org.uk>
Subject: Re: [patch 00/12] futex: Cure robust/PI futex exit races

* Thomas Gleixner:

> On Fri, 8 Nov 2019, Florian Weimer wrote:
>> * Florian Weimer:
>> > * Florian Weimer:
>> >> I ran the glibc upstream test suite (which has some robust futex tests)
>> >> against b21be7e942b49168ee15a75cbc49fbfdeb1e6a97 on x86-64, both native
>> >> and 32-bit/i386 compat mode.
>> >>
>> >> compat mode seems broken, nptl/tst-thread-affinity-pthread fails.  This
>> >> is probably *not* due to
>> >> <https://bugzilla.kernel.org/show_bug.cgi?id=154011> because the failure
>> >> is non-sporadic, but reliable fails for thread 253:
>> >>
>> >> info: Detected CPU set size (in bits): 225
>> >> info: Maximum test CPU: 255
>> >> error: pthread_create for thread 253 failed: Resource temporarily unavailable
>> >>
>> >> I'm running this on a large box as root, so ulimits etc. do not apply.
>> >>
>> >> I did not see this failure with the x86-64 test.
>> >>
>> >> You should be able to reproduce with (assuming you've got a multilib gcc):
>> >>
>> >> git clone git://sourceware.org/git/glibc.git git
>> >> mkdir build
>> >> cd build
>> >> ../git/configure --prefix=/usr CC="gcc -m32" CXX="g++ -m32" --build=i686-linux
>> >> make -j`nproc`
>> >> make test t=nptl/tst-thread-affinity-pthread
>> >
>> > Sorry, I realized that I didn't actually verify that this is a
>> > regression caused by your patches.  Maybe I can do that tomorrow.
>> 
>> Confirmed as a regression caused by the patches.  Depending on the
>> nature of the bug, you need a machine which has or pretends to have many
>> CPUs (this one has 256 CPUs).
>
> Sure I can do that, but I completely fail to see how that's a
> regression.
>
> Unpatched 5.4-rc6:
>
> FAIL: nptl/tst-thread-affinity-pthread
> original exit status 1
> info: Detected CPU set size (in bits): 225
> info: Maximum test CPU: 255
> error: pthread_create for thread 253 failed: Resource temporarily unavailable

Huh.  Reverting your patches (at commit 26bc672134241a080a83b2ab9aa8abede8d30e1c)
fixes the test for me.

> TBH, the futex changes have absolutely nothing to do with that resource
> fail.

I suspect that there are some changes to task exit latency, which
triggers the latent resource management bug.

Thanks,
Florian

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ