lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20190509193320.21105-1-minyard@acm.org>
Date:   Thu,  9 May 2019 14:33:20 -0500
From:   minyard@....org
To:     linux-rt-users@...r.kernel.org
Cc:     minyard@....org, linux-kernel@...r.kernel.org,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>, tglx@...utronix.de,
        Steven Rostedt <rostedt@...dmis.org>,
        Corey Minyard <cminyard@...sta.com>
Subject: [PATCH RT v2] Fix a lockup in wait_for_completion() and friends

From: Corey Minyard <cminyard@...sta.com>

The function call do_wait_for_common() has a race condition that
can result in lockups waiting for completions.  Adding the thread
to (and removing the thread from) the wait queue for the completion
is done outside the do loop in that function.  However, if the thread
is woken up, the swake_up_locked() function will delete the entry
from the wait queue.  If that happens and another thread sneaks
in and decrements the done count in the completion to zero, the
loop will go around again, but the thread will no longer be in the
wait queue, so there is no way to wake it up.

Visually, here's a diagram from Sebastian Andrzej Siewior:
  T0                    T1                       T2
  wait_for_completion()
   do_wait_for_common()
    __prepare_to_swait()
     schedule()
                        complete()
                         x->done++ (0 -> 1)
                         raw_spin_lock_irqsave()
                         swake_up_locked()       wait_for_completion()
                          wake_up_process(T0)
                          list_del_init()
                         raw_spin_unlock_irqrestore()
                                                  raw_spin_lock_irq(&x->wait.lock)
  raw_spin_lock_irq(&x->wait.lock)                x->done != UINT_MAX, 1 -> 0
                                                  raw_spin_unlock_irq(&x->wait.lock)
                                                  return 1
   while (!x->done && timeout),
   continue loop, not enqueued
   on &x->wait

Basically, the problem is that the original wait queues used in
completions did not remove the item from the queue in the wakeup
function, but swake_up_locked() does.

Fix it by adding the thread to the wait queue inside the do loop.
The design of swait detects if it is already in the list and doesn't
do the list add again.

Fixes: a04ff6b4ec4ee7e ("completion: Use simple wait queues")
Signed-off-by: Corey Minyard <cminyard@...sta.com>
---
Changes since v1:
* Only move __prepare_to_swait() into the loop.  __prepare_to_swait()
  handles if called when already in the list, and of course
  __finish_swait() handles if the item is not queued on the
  list.

 kernel/sched/completion.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/completion.c b/kernel/sched/completion.c
index 755a58084978..49c14137988e 100644
--- a/kernel/sched/completion.c
+++ b/kernel/sched/completion.c
@@ -72,12 +72,12 @@ do_wait_for_common(struct completion *x,
 	if (!x->done) {
 		DECLARE_SWAITQUEUE(wait);
 
-		__prepare_to_swait(&x->wait, &wait);
 		do {
 			if (signal_pending_state(state, current)) {
 				timeout = -ERESTARTSYS;
 				break;
 			}
+			__prepare_to_swait(&x->wait, &wait);
 			__set_current_state(state);
 			raw_spin_unlock_irq(&x->wait.lock);
 			timeout = action(timeout);
-- 
2.17.1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ