[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <6a65bfcc-661a-5ca5-86d6-0b4614f6aeea@colorfullife.com>
Date: Fri, 27 Sep 2019 06:59:03 +0200
From: Manfred Spraul <manfred@...orfullife.com>
To: Waiman Long <longman@...hat.com>,
Peter Zijlstra <peterz@...radead.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel@...r.kernel.org, Arnd Bergmann <arnd@...db.de>,
"Gustavo A. R. Silva" <gustavo@...eddedor.com>,
Catalin Marinas <catalin.marinas@....com>,
Mathieu Malaterre <malat@...ian.org>,
Davidlohr Bueso <dave@...olabs.net>
Subject: Re: [PATCH] ipc/sem: Fix race between to-be-woken task and waker
Hi,
On 9/26/19 8:12 PM, Waiman Long wrote:
> On 9/26/19 5:34 AM, Peter Zijlstra wrote:
>> On Fri, Sep 20, 2019 at 11:54:02AM -0400, Waiman Long wrote:
>>> While looking at a customr bug report about potential missed wakeup in
>>> the system V semaphore code, I spot a potential problem. The fact that
>>> semaphore waiter stays in TASK_RUNNING state while checking queue status
>>> may lead to missed wakeup if a spurious wakeup happens in the right
>>> moment as try_to_wake_up() will do nothing if the task state isn't right.
>>>
>>> To eliminate this possibility, the task state is now reset to
>>> TASK_INTERRUPTIBLE immediately after wakeup before checking the queue
>>> status. This should eliminate the race condition on the interaction
>>> between the queue status and the task state and fix the potential missed
>>> wakeup problem.
You are obviously right, there is a huge race condition.
>> Bah, this code always makes my head hurt.
>>
>> Yes, AFAICT the pattern it uses has been broken since 0a2b9d4c7967,
>> since that removed doing the actual wakeup from under the sem_lock(),
>> which is what it relies on.
Correct - I've overlooked that.
First, theory:
setting queue->status, reading queue->status, setting
current->state=TASK_INTERRUPTIBLE are all under the correct spinlock.
(there is an opportunistic read of queue->status without locks, but it
is retried when the lock got acquired)
setting current->state=RUNNING is outside of any lock.
So as far as current->state is concerned, the lock doesn't exist. And if
the lock doesn't exist, we must follow the rules applicable for
set_current_state().
I'll try to check the code this week.
And we should check the remaining wake-queue users, the logic is
everywhere identical.
> After having a second look at the code again, I probably misread the
> code the first time around. In the sleeping path, there is a check of
> queue.status and setting of task state both under the sem lock in the
> sleeping path. So as long as setting of queue status is under lock, they
> should synchronize properly.
>
> It looks like queue status setting is under lock, but I can't use
> lockdep to confirm that as the locking can be done by either the array
> lock or in one of the spinlocks in the array. Are you aware of a way of
> doing that?
For testing? Have you considered just always using the global lock?
(untested):
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -370,7 +370,7 @@ static inline int sem_lock(struct sem_array *sma,
struct sembuf *sops,
struct sem *sem;
int idx;
- if (nsops != 1) {
+ if (nsops != 1 || 1) {
/* Complex operation - acquire a full lock */
ipc_lock_object(&sma->sem_perm);
> Anyway, I do think we need to add some comment to clarify the situation
> to avoid future confusion.
Around line 190 is the comment that explains locking & memory ordering.
I have only documented the content of sem_undo and sem_array, but
neither queue nor current->state :-(
--
Manfred
Powered by blists - more mailing lists