[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Mon, 10 Oct 2022 15:54:44 +0530
From: Mukesh Ojha <quic_mojha@...cinc.com>
To: Waiman Long <longman@...hat.com>
CC: <linux-kernel@...r.kernel.org>, <john.p.donnelly@...cle.com>,
Hillf Danton <hdanton@...a.com>,
Peter Zijlstra <peterz@...radead.org>,
Will Deacon <will@...nel.org>,
Boqun Feng <boqun.feng@...il.com>,
Ingo Molnar <mingo@...hat.com>
Subject: Re: [PATCH] locking/rwsem: Prevent non-first waiter from spinning in
down_write() slowpath
Hi Waiman,
On 9/29/2022 11:36 PM, Waiman Long wrote:
> On 9/29/22 14:04, Waiman Long wrote:
>> A non-first waiter can potentially spin in the for loop of
>> rwsem_down_write_slowpath() without sleeping but fail to acquire the
>> lock even if the rwsem is free if the following sequence happens:
>>
>> Non-first waiter First waiter Lock holder
>> ---------------- ------------ -----------
>> Acquire wait_lock
>> rwsem_try_write_lock():
>> Set handoff bit if RT or
>> wait too long
>> Set waiter->handoff_set
>> Release wait_lock
>> Acquire wait_lock
>> Inherit waiter->handoff_set
>> Release wait_lock
>> Clear owner
>> Release lock
>> if (waiter.handoff_set) {
>> rwsem_spin_on_owner(();
>> if (OWNER_NULL)
>> goto trylock_again;
>> }
>> trylock_again:
>> Acquire wait_lock
>> rwsem_try_write_lock():
>> if (first->handoff_set && (waiter != first))
>> return false;
>> Release wait_lock
>>
>> It is especially problematic if the non-first waiter is an RT task and
>> it is running on the same CPU as the first waiter as this can lead to
>> live lock.
>>
>> Fixes: d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more
>> consistent")
>> Signed-off-by: Waiman Long <longman@...hat.com>
>> ---
>> kernel/locking/rwsem.c | 13 ++++++++++---
>> 1 file changed, 10 insertions(+), 3 deletions(-)
>
> Mukesh, can you test if this patch can fix the RT task lockup problem?
>
Looks like, There is still a window for a race.
There is a chance when a reader who came first added it's BIAS and goes
to slowpath and before it gets added to wait list it got preempted by RT
task which goes to slowpath as well and being the first waiter gets its
hand-off bit set and not able to get the lock due to following condition
in rwsem_try_write_lock()
630 if (count & RWSEM_LOCK_MASK) { ==> reader has
sets its bias
..
...
634
635 new |= RWSEM_FLAG_HANDOFF;
636 } else {
637 new |= RWSEM_WRITER_LOCKED;
---------------------->----------------------->-------------------------
First reader (1) writer(2) RT task Lock holder(3)
It sets
RWSEM_READER_BIAS.
while it is going to
slowpath(as the lock
was held by (3)) and
before it got added
to the waiters list
it got preempted
by (2).
RT task also takes
the slowpath and add release the
itself into waiting list rwsem lock
and since it is the first clear the
it is the next one to get owner.
the lock but it can not
get the lock as (count &
RWSEM_LOCK_MASK) is set
as (1) has added it but
not able to remove its
adjustment.
-Mukesh
> Thanks,
> Longman
>
Powered by blists - more mailing lists