Message-ID: <13d683ed-793c-b502-44ff-f28114d9386b@redhat.com>
Date: Sun, 7 Nov 2021 10:24:31 -0500
From: Waiman Long <longman@...hat.com>
To: Hillf Danton <hdanton@...a.com>
Cc: 马振华 <mazhenhua@...omi.com>,
peterz <peterz@...radead.org>, mingo <mingo@...hat.com>,
will <will@...nel.org>, "boqun.feng" <boqun.feng@...il.com>,
linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [BUG]locking/rwsem: only clean RWSEM_FLAG_HANDOFF when already
set
On 11/7/21 04:01, Hillf Danton wrote:
> On Sat, 6 Nov 2021 23:25:38 -0400 Waiman Long wrote:
>> On 11/6/21 08:39, 马振华 wrote:
>>> Dear longman,
>>>
>>> Recently, I found an issue where the rwsem count becomes a negative
>>> value. It always happens when a task tries to get the lock with
>>> __down_write_killable and is then killed.
>>>
>>> The issue happens like this:
>>>
>>> CPU2                                   CPU4
>>> task A[reader]                         task B[writer]
>>> down_read_killable [locked]
>>> sem->count=0x100
>>>                                        down_write_killable
>>>                                        sem->count=0x102 [wlist not empty]
>>> up_read
>>> count=0x2
>>>                                        sig kill received
>>> down_read_killable
>>> sem->count=0x102 [wlist not empty]
>>>                                        goto branch out_nolock:
>>>                                        list_del(&waiter.list);
>>>                                        wait list is empty
>>>                                        sem->count - RWSEM_FLAG_HANDOFF
>>>                                        sem->count=0xFE
>>>                                        list_empty(&sem->wait_list) is TRUE
>>>                                        sem->count andnot RWSEM_FLAG_WAITERS
>>>                                        sem->count=0xFC
>>> up_read
>>> sem->count -= 0x100
>>> sem->count=0xFFFFFFFFFFFFFFFC
>>> DEBUG_RWSEMS_WARN_ON(tmp < 0, sem);
>>>
>>> So sem->count will be negative after the writer is killed.
>>> I think that if the flag RWSEM_FLAG_HANDOFF is not set, we shouldn't clear it.
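
For clarity, the count arithmetic in the trace above can be reproduced
outside the kernel with a small user-space sketch. The bit values below are
assumptions based on the usual rwsem count layout (RWSEM_FLAG_WAITERS = 0x2,
RWSEM_FLAG_HANDOFF = 0x4, reader bias = 0x100); the sketch only mimics the
count updates, not the locking itself:

#include <stdio.h>

#define RWSEM_FLAG_WAITERS	0x2L
#define RWSEM_FLAG_HANDOFF	0x4L
#define RWSEM_READER_BIAS	0x100L

int main(void)
{
	long count = RWSEM_READER_BIAS;	/* task A: down_read_killable -> 0x100 */

	count += RWSEM_FLAG_WAITERS;	/* task B: down_write_killable queues -> 0x102 */
	count -= RWSEM_READER_BIAS;	/* task A: up_read -> 0x2 */
	count += RWSEM_READER_BIAS;	/* task A: down_read_killable again -> 0x102 */

	/* task B, out_nolock path: HANDOFF was never set, yet it is subtracted */
	count -= RWSEM_FLAG_HANDOFF;	/* 0xFE */
	count &= ~RWSEM_FLAG_WAITERS;	/* wait list now empty -> 0xFC */

	count -= RWSEM_READER_BIAS;	/* task A: up_read -> underflow */
	printf("count = %#lx\n", (unsigned long)count);	/* 0xfffffffffffffffc */
	return 0;
}
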
>> Thanks for reporting this possible race condition.
>>
>> However, I am still trying to figure out how it is possible to set
>> wstate to WRITER_HANDOFF without actually setting the handoff bit as
>> well. The statement sequence should be as follows:
>>
>>     wstate = WRITER_HANDOFF;
>>     raw_spin_lock_irq(&sem->wait_lock);
>>     if (rwsem_try_write_lock(sem, wstate))
>>     raw_spin_unlock_irq(&sem->wait_lock);
>>         :
>>     if (signal_pending_state(state, current))
>>         goto out_nolock
>>
>> The rwsem_try_write_lock() function will make sure that we either
>> acquire the lock and clear handoff or set the handoff bit. This should
>> be done before we actually check for signal. I do think that it is
> Given that WRITER_HANDOFF does not guarantee that RWSEM_FLAG_HANDOFF is set
> in rwsem_try_write_lock(), the flag should be cleared only by its setter to
> avoid count underflow.
>
> Hillf
>
>> probably safer to use atomic_long_andnot to clear the handoff bit
>> instead of using atomic_long_add().
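
To spell out why andnot is safer here: subtracting RWSEM_FLAG_HANDOFF
unconditionally corrupts the reader count when the bit was never set, while
andnot is a no-op in that case. A minimal illustration (the helper names are
made up, the bit values match the trace above):

	/* count = 0x102: one reader plus RWSEM_FLAG_WAITERS, HANDOFF clear */

	long clear_by_add(long count)
	{
		return count - 0x4;	/* 0x102 -> 0xFE: reader count corrupted */
	}

	long clear_by_andnot(long count)
	{
		return count & ~0x4L;	/* 0x102 -> 0x102: harmless no-op */
	}

Both behave identically when the bit really is set; they differ only on the
error path described above.
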
I did have a tentative patch to address this issue which is somewhat
similar to your approach. However, I would like to further investigate
the exact mechanics of the race condition to make sure that I won't miss
a latent bug somewhere else in the rwsem code.
-Longman
diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index c51387a43265..20be967620c0 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -347,7 +347,8 @@ enum rwsem_wake_type {
 enum writer_wait_state {
 	WRITER_NOT_FIRST,	/* Writer is not first in wait list */
 	WRITER_FIRST,		/* Writer is first in wait list     */
-	WRITER_HANDOFF		/* Writer is first & handoff needed */
+	WRITER_NEED_HANDOFF,	/* Writer is first & handoff needed */
+	WRITER_HANDOFF		/* Writer is first & handoff set    */
 };
 
 /*
@@ -532,11 +533,11 @@ static void rwsem_mark_wake(struct rw_semaphore *sem,
  * race conditions between checking the rwsem wait list and setting the
  * sem->count accordingly.
  *
- * If wstate is WRITER_HANDOFF, it will make sure that either the handoff
+ * If wstate is WRITER_NEED_HANDOFF, it will make sure that either the handoff
  * bit is set or the lock is acquired with handoff bit cleared.
  */
 static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
-					enum writer_wait_state wstate)
+					enum writer_wait_state *wstate)
 {
 	long count, new;
 
@@ -546,13 +547,13 @@ static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
 	do {
 		bool has_handoff = !!(count & RWSEM_FLAG_HANDOFF);
 
-		if (has_handoff && wstate == WRITER_NOT_FIRST)
+		if (has_handoff && *wstate == WRITER_NOT_FIRST)
 			return false;
 
 		new = count;
 
 		if (count & RWSEM_LOCK_MASK) {
-			if (has_handoff || (wstate != WRITER_HANDOFF))
+			if (has_handoff || (*wstate != WRITER_NEED_HANDOFF))
 				return false;
 
 			new |= RWSEM_FLAG_HANDOFF;
@@ -569,8 +570,10 @@ static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
 	 * We have either acquired the lock with handoff bit cleared or
 	 * set the handoff bit.
 	 */
-	if (new & RWSEM_FLAG_HANDOFF)
+	if (new & RWSEM_FLAG_HANDOFF) {
+		*wstate = WRITER_HANDOFF;
 		return false;
+	}
 
 	rwsem_set_owner(sem);
 	return true;
@@ -1083,7 +1086,7 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
 	/* wait until we successfully acquire the lock */
 	set_current_state(state);
 	for (;;) {
-		if (rwsem_try_write_lock(sem, wstate)) {
+		if (rwsem_try_write_lock(sem, &wstate)) {
 			/* rwsem_try_write_lock() implies ACQUIRE on success */
 			break;
 		}
@@ -1138,7 +1141,7 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
 		 */
 		if ((wstate == WRITER_FIRST) && (rt_task(current) ||
 		    time_after(jiffies, waiter.timeout))) {
-			wstate = WRITER_HANDOFF;
+			wstate = WRITER_NEED_HANDOFF;
 			lockevent_inc(rwsem_wlock_handoff);
 			break;
 		}
@@ -1159,7 +1162,7 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
 	list_del(&waiter.list);
 
 	if (unlikely(wstate == WRITER_HANDOFF))
-		atomic_long_add(-RWSEM_FLAG_HANDOFF, &sem->count);
+		atomic_long_andnot(RWSEM_FLAG_HANDOFF, &sem->count);
 
 	if (list_empty(&sem->wait_list))
 		atomic_long_andnot(RWSEM_FLAG_WAITERS, &sem->count);
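
For readers following the patch: WRITER_NEED_HANDOFF only records that this
waiter wants the handoff bit, and rwsem_try_write_lock() promotes it to
WRITER_HANDOFF once the bit has really been set in sem->count. The out_nolock
cleanup then only touches a bit it knows it set. A rough sketch of that rule
(the helper name is made up; the logic mirrors the last hunk):

/*
 * Hypothetical helper mirroring the last hunk: on the out_nolock path the
 * HANDOFF bit is cleared only when wstate was promoted to WRITER_HANDOFF,
 * i.e. only when this waiter is known to have set the bit.
 */
static void writer_out_nolock_cleanup(struct rw_semaphore *sem,
				      enum writer_wait_state wstate)
{
	if (wstate == WRITER_HANDOFF)
		atomic_long_andnot(RWSEM_FLAG_HANDOFF, &sem->count);

	if (list_empty(&sem->wait_list))
		atomic_long_andnot(RWSEM_FLAG_WAITERS, &sem->count);
}
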
--