Message-ID: <c737a604-d441-49c6-a5cd-ef01e9f2a454@kernel.org>
Date: Mon, 15 Jan 2024 13:54:32 +0100
From: Jiri Slaby <jirislaby@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
linux-kernel@...r.kernel.org, boqun.feng@...il.com, bristot@...hat.com,
bsegall@...gle.com, dietmar.eggemann@....com, jstultz@...gle.com,
juri.lelli@...hat.com, longman@...hat.com, mgorman@...e.de,
mingo@...hat.com, rostedt@...dmis.org, swood@...hat.com,
vincent.guittot@...aro.org, vschneid@...hat.com, will@...nel.org
Subject: Re: [PATCH v3 7/7] locking/rtmutex: Acquire the hb lock via trylock
after wait-proxylock.
On 15. 01. 24, 12:52, Jiri Slaby wrote:
> On 15. 01. 24, 12:40, Jiri Slaby wrote:
>> On 15. 09. 23, 17:19, Peter Zijlstra wrote:
>>> On Fri, Sep 15, 2023 at 02:58:35PM +0200, Thomas Gleixner wrote:
>>>
>>>> I spent quite some time to convince myself that this is correct. I was
>>>> not able to poke a hole into it. So that really should be safe to
>>>> do. Famous last words ...
>>>
>>> IKR :-/
>>>
>>> Something like so then...
>>>
>>> ---
>>> Subject: futex/pi: Fix recursive rt_mutex waiter state
>>
>> So this breaks some random test in APR:
>>
>> From
>> https://build.opensuse.org/package/live_build_log/openSUSE:Factory:Staging:G/apr/standard/x86_64:
>> testprocmutex : Line 122: child did not terminate with success
>>
>> The child in fact terminates on
>> https://github.com/apache/apr/blob/trunk/test/testprocmutex.c#L93:
>>     while ((rv = apr_proc_mutex_timedlock(proc_lock, 1))) {
>>         if (!APR_STATUS_IS_TIMEUP(rv))
>>             exit(1); <----- here
>>
>> The test creates 6 children and does some
>> pthread_mutex_timedlock/unlock() repeatedly (200 times) in parallel
>> while sleeping 1 us inside the lock. The timeout is 1 us above. And
>> the test expects all them to fail (to time out). But the time out does
>> not always happen in 6.7 (it's racy, so the failure is semi-random:
>> like 1 of 1000 attempts is bad).
>
> This is not precise as I misinterpreted. The test is: either it succeeds
> or times out.
>
> But since the commit, futex() yields 22/EINVAL, i.e. fails.

A simplified reproducer is attached (in particular, it no longer uses APR).
Build it with -pthread. If you see

  BADx rv=22

that's bad: 22 is EINVAL, the futex() failure described above.

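For anyone reading without the attachment, here is a minimal sketch of the kind of test described above. It is not the attached pokus2.c. The names run_timedlock_test(), child_loop() and deadline_in_usec() are made up for illustration, and the use of a process-shared PTHREAD_PRIO_INHERIT mutex is an assumption based on the failure being in the futex/pi path. Each child treats success and ETIMEDOUT as fine and anything else as bad.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Absolute CLOCK_REALTIME deadline `usec` microseconds from now. */
static struct timespec deadline_in_usec(long usec)
{
    struct timespec ts;

    clock_gettime(CLOCK_REALTIME, &ts);
    ts.tv_nsec += usec * 1000L;
    ts.tv_sec += ts.tv_nsec / 1000000000L;
    ts.tv_nsec %= 1000000000L;
    return ts;
}

/* Child body: lock with a 1 us timeout, retrying on timeout. */
static int child_loop(pthread_mutex_t *m, int iterations)
{
    for (int i = 0; i < iterations; i++) {
        int rv;

        for (;;) {
            struct timespec ts = deadline_in_usec(1);

            rv = pthread_mutex_timedlock(m, &ts);
            if (rv == 0)
                break;
            if (rv != ETIMEDOUT) {
                /* e.g. rv=22 (EINVAL) is the bad case */
                fprintf(stderr, "BAD rv=%d\n", rv);
                return 1;
            }
        }
        usleep(1);              /* hold the lock briefly */
        pthread_mutex_unlock(m);
    }
    return 0;
}

/*
 * Hypothetical helper: fork nchildren processes hammering one shared
 * PI mutex. Returns the number of children that failed, or -1 if the
 * setup itself failed.
 */
int run_timedlock_test(int nchildren, int iterations)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t *m;
    int forked = 0, failed = 0;

    assert(nchildren > 0 && iterations > 0);

    m = mmap(NULL, sizeof(*m), PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (m == MAP_FAILED)
        return -1;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    /* Assumption: a PI mutex, so that glibc takes the futex PI path. */
    pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    if (pthread_mutex_init(m, &attr) != 0) {
        pthread_mutexattr_destroy(&attr);
        munmap(m, sizeof(*m));
        return -1;
    }
    pthread_mutexattr_destroy(&attr);

    for (int i = 0; i < nchildren; i++) {
        pid_t pid = fork();

        if (pid == 0)
            _exit(child_loop(m, iterations));
        if (pid > 0)
            forked++;
    }

    for (int i = 0; i < forked; i++) {
        int status;

        if (wait(&status) > 0 &&
            !(WIFEXITED(status) && WEXITSTATUS(status) == 0))
            failed++;
    }

    pthread_mutex_destroy(m);
    munmap(m, sizeof(*m));
    return failed;
}
```

Calling run_timedlock_test(6, 200) from main() mirrors the 6 children and 200 iterations of the APR test. A non-zero return means at least one child saw something other than success or a timeout. Since the failure is racy, it may need many runs to show up.
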
regards,
--
js
suse labs
View attachment "pokus2.c" of type "text/x-csrc" (1744 bytes)