Date:   Thu, 24 Nov 2016 12:26:57 +0100
From:   Nicolai Hähnle <nhaehnle@...il.com>
To:     Peter Zijlstra <peterz@...radead.org>,
        Nicolai Hähnle <Nicolai.Haehnle@....com>,
        linux-kernel@...r.kernel.org, dri-devel@...ts.freedesktop.org,
        Ingo Molnar <mingo@...hat.com>, stable@...r.kernel.org,
        Maarten Lankhorst <maarten.lankhorst@...onical.com>
Subject: Re: [PATCH 1/4] locking/ww_mutex: Fix a deadlock affecting ww_mutexes

On 23.11.2016 15:25, Daniel Vetter wrote:
> On Wed, Nov 23, 2016 at 03:03:36PM +0100, Peter Zijlstra wrote:
>> On Wed, Nov 23, 2016 at 12:25:22PM +0100, Nicolai Hähnle wrote:
>>> @@ -473,7 +476,14 @@ void __sched ww_mutex_unlock(struct ww_mutex *lock)
>>>  	 */
>>>  	mutex_clear_owner(&lock->base);
>>>  #endif
>>> -	__mutex_fastpath_unlock(&lock->base.count, __mutex_unlock_slowpath);
>>> +	/*
>>> +	 * A previously _not_ waiting task may acquire the lock via the fast
>>> +	 * path during our unlock. In that case, already waiting tasks may have
>>> +	 * to back off to avoid a deadlock. Wake up all waiters so that they
>>> +	 * can check their acquire context stamp against the new owner.
>>> +	 */
>>> +	__mutex_fastpath_unlock(&lock->base.count,
>>> +				__mutex_unlock_slowpath_wakeall);
>>>  }
>>
>> So doing a wake-all has obvious issues with the thundering herd etc. Also,
>> with the new mutex, you'd not be able to do hand-off, which would
>> introduce starvation cases.
>>
>> Ideally we'd iterate the blocked list and pick the waiter with the
>> earliest stamp, or we'd maintain the list in stamp order instead of
>> FIFO, for ww_mutex.
>
> Not sure we'll win that much; at least I think we still need to wake up
> everyone with an earlier stamp than that of the task that just released
> the lock. Otherwise there are deadlocks. So that just cuts the wakeups in
> half, on average.
>
> What we could do is a handoff-hint with the timestamp of the earliest task
> we believe should get the lock. Everyone with a later timestamp that gets
> woken then knows that they definitely have a deadlock situation and need
> to back off (thread 2 in the example).
>
> Thread 1 would get woken and would be able to take the lock, except when
> thread 0 successfully raced it and stole the lock. And anyone else racing
> in with a later timestamp would also immediately back off, ensuring
> fairness.
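
A minimal standalone sketch of that handoff-hint idea (plain userspace C;
all names here are invented for illustration and are not the actual
mutex/ww_mutex code): the unlock path publishes the stamp of the waiter it
expects to hand off to, and any woken waiter with a later stamp backs off
immediately.

#include <stdbool.h>
#include <stdint.h>

struct ww_ctx_sketch {
        uint64_t stamp;                 /* lower stamp == older context */
};

struct ww_mutex_sketch {
        uint64_t hint_stamp;            /* stamp of the waiter we intend to hand the lock to */
};

/* Unlock path: publish the earliest waiter's stamp as the handoff hint. */
static void publish_handoff_hint(struct ww_mutex_sketch *lock,
                                 uint64_t earliest_waiter_stamp)
{
        lock->hint_stamp = earliest_waiter_stamp;
}

/*
 * Woken waiter: a stamp later (larger) than the hint means an older
 * context is supposed to get the lock, so we have to back off to avoid
 * a deadlock; otherwise we may try to acquire.
 */
static bool must_back_off(const struct ww_mutex_sketch *lock,
                          const struct ww_ctx_sketch *ctx)
{
        return ctx->stamp > lock->hint_stamp;
}

In the scenario above, thread 1's stamp matches the hint so it proceeds,
while thread 2's later stamp makes must_back_off() return true.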

I did consider maintaining stamp order in the waiter list and originally 
decided against it because I just wanted a simple and conservative fix 
to avoid adding new regressions.

Now that a different approach is needed for >= 4.9 anyway, mostly due to 
the hand-off logic, I'm reconsidering this.

I do believe we can win a bit by keeping the wait list sorted by stamp, 
if we also make sure that waiters don't add themselves to it in the 
first place when they can see that a deadlock situation cannot be avoided.
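
As a rough illustration of that early check (again standalone userspace C
with invented names, not the real slow path, and with stamp wraparound
handling omitted): before enqueueing, compare our stamp against the current
holder's and bail out with -EDEADLK right away if we are the younger context.

#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct ww_ctx_sketch {
        uint64_t stamp;                         /* lower stamp == older context */
};

struct ww_mutex_sketch {
        struct ww_ctx_sketch *owner_ctx;        /* ctx of the current holder, NULL if none */
};

/*
 * Before adding ourselves to the wait list: if the current owner is an
 * older context (lower stamp) than ours, we are the one that has to back
 * off, so return -EDEADLK immediately instead of enqueueing and being
 * woken only to back off later.
 */
static int check_deadlock_before_wait(const struct ww_mutex_sketch *lock,
                                      const struct ww_ctx_sketch *ctx)
{
        if (lock->owner_ctx && lock->owner_ctx->stamp < ctx->stamp)
                return -EDEADLK;
        return 0;
}

This is the same stamp comparison a woken waiter would otherwise do, just
moved to before the list insertion.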

I will probably want to extend struct mutex_waiter with 
ww_mutex-specific fields to facilitate this (e.g. a ctx pointer, and 
perhaps the stamp as well to reduce pointer chasing). That should be 
fine since the waiter lives on the stack.
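
Sketching what that extension and the sorted insert might look like (same
caveats: standalone userspace C, made-up names, not a patch):

#include <stddef.h>
#include <stdint.h>

struct ww_ctx_sketch {
        uint64_t stamp;
};

struct mutex_waiter_sketch {
        struct mutex_waiter_sketch *next;
        struct ww_ctx_sketch *ww_ctx;   /* NULL for ordinary mutex waiters */
        uint64_t stamp;                 /* cached copy, avoids chasing ww_ctx */
};

struct wait_list_sketch {
        struct mutex_waiter_sketch *head;       /* kept sorted, oldest (lowest stamp) first */
};

/* Insert a waiter so the list stays sorted by ascending stamp. */
static void add_waiter_sorted(struct wait_list_sketch *list,
                              struct mutex_waiter_sketch *waiter)
{
        struct mutex_waiter_sketch **pos = &list->head;

        while (*pos && (*pos)->stamp <= waiter->stamp)
                pos = &(*pos)->next;

        waiter->next = *pos;
        *pos = waiter;
}

Since the waiter is stack-allocated for the duration of the sleep, the
extra pointer and cached stamp only cost a few bytes of stack.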

In the meantime, I'd appreciate it if patch #1 could be accepted as-is 
for stable updates to <= 4.8. It fixes a real (if rare) bug, and the 
stampede inefficiency isn't a problem in practice, at least for GPU 
applications.

Thanks,
Nicolai

>
> Without thinking it through in detail, this is a PI issue, except that we
> replace boosting with wakeup & back-off. Could we perhaps steal something
> from rt mutexes to make it fair & efficient?
> -Daniel
>
