Message-ID: <20190626103558.GL3419@hirez.programming.kicks-ass.net>
Date: Wed, 26 Jun 2019 12:35:58 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: minyard@....org, linux-rt-users@...r.kernel.org,
Corey Minyard <cminyard@...sta.com>,
linux-kernel@...r.kernel.org, tglx@...utronix.de,
Steven Rostedt <rostedt@...dmis.org>
Subject: Re: [PATCH RT v2] Fix a lockup in wait_for_completion() and friends
On Thu, May 09, 2019 at 06:19:25PM +0200, Sebastian Andrzej Siewior wrote:
> One question for the upstream completion implementation:
> completion_done() returns true if there are no waiters. It acquires the
> wait.lock to ensure that complete()/complete_all() is done. However,
> once complete releases the lock it is guaranteed that the wake_up() (for
> the waiter) occurred. The waiter task still needs to remove itself
> from the wait-queue before the completion can be removed.
> Do I miss something?
So you mean:
    init_completion(&done);
    wait_for_completion(&done)
      spin_lock()
      __add_wait_queue()
      spin_unlock()
      schedule()
                                    complete()
                                    completion_done()
      spin_lock()
      __remove_wait_queue()
      spin_unlock()
Right?
I think that boils down to this: whenever you have multiple waiters,
someone needs to be in charge of @done's lifetime.
The case that matters is:
    DECLARE_COMPLETION_ONSTACK(done);
    while (!completion_done(&done))
        cpu_relax();
Where there is but a single waiter, and that waiter is
completion_done(). In that case it must not return early.
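To make "must not return early" concrete, here is a minimal userspace sketch of the single-waiter case. It is not kernel code: struct completion_model, complete_begin(), complete_end(), naive_done() and careful_done() are hypothetical names invented for illustration, and wait.lock is reduced to a plain flag. complete() is split into two steps so the window in which it has set ->done but still holds the lock can be observed deterministically.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; not the kernel's struct completion. */
struct completion_model {
	unsigned int done;  /* models x->done */
	bool locked;        /* models x->wait.lock being held */
};

/* First half of complete(): take the lock and bump ->done. */
static void complete_begin(struct completion_model *x)
{
	assert(!x->locked);
	x->locked = true;
	x->done++;
}

/* Second half of complete(): the wakeup is finished, drop the lock. */
static void complete_end(struct completion_model *x)
{
	assert(x->locked);
	x->locked = false;
}

/*
 * Naive check: looks at ->done only, so it can report "done" while
 * complete() is still inside its locked section.
 */
static bool naive_done(const struct completion_model *x)
{
	return x->done != 0;
}

/*
 * Lock-aware check: reports "done" only once the lock has been released.
 * The kernel would spin on the lock here; this model returns false
 * instead and leaves the retry to the polling loop.
 */
static bool careful_done(const struct completion_model *x)
{
	if (x->done == 0)
		return false;
	return !x->locked;
}
```

Between complete_begin() and complete_end(), naive_done() already returns true while careful_done() does not. With an on-stack completion, a poller trusting the naive check could leave the function and invalidate the stack frame while the completer is still touching it.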
Now, I've also seen a ton of code do:
    if (!completion_done(done))
        complete(done);
And that makes me itch... but I've not bothered to look into it hard
enough.
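One possible source of that itch, offered as an assumption and not as something the message itself states: the check and the complete() are two separate steps, so two callers can both pass the check before either one completes. The sketch below replays that interleaving in a single thread. It is a userspace model with hypothetical names (struct completion_model, model_completion_done(), model_complete(), replay_double_complete()), and it models only the ->done counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; only the ->done counter is modelled. */
struct completion_model {
	unsigned int done;
};

static bool model_completion_done(const struct completion_model *x)
{
	return x->done != 0;
}

static void model_complete(struct completion_model *x)
{
	x->done++;  /* each call to complete() bumps ->done */
}

/*
 * Replay one unlucky interleaving of two callers that both run
 *     if (!completion_done(done)) complete(done);
 * Returns the final value of ->done.
 */
static unsigned int replay_double_complete(void)
{
	struct completion_model c = { 0 };

	bool a_sees_pending = !model_completion_done(&c);  /* caller A checks */
	bool b_sees_pending = !model_completion_done(&c);  /* caller B checks */

	/* Both checks ran before either complete(), so both passed. */
	assert(a_sees_pending && b_sees_pending);

	if (a_sees_pending)
		model_complete(&c);  /* A acts on its check */
	if (b_sees_pending)
		model_complete(&c);  /* B acts on a check that is now stale */

	return c.done;
}
```

In this replay ->done ends up at 2, so the guard did not ensure a single complete(). Whether that matters for any given caller depends on how the completion is consumed afterwards.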