Date:	Mon, 16 Feb 2015 14:08:21 +0100
From:	Peter Zijlstra <peterz@...radead.org>
To:	Kirill Tkhai <tkhai@...dex.ru>
Cc:	Fengguang Wu <fengguang.wu@...el.com>,
	Ingo Molnar <mingo@...nel.org>, LKP <lkp@...org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	juri.lelli@....com
Subject: Re: [sched/deadline] kernel BUG at kernel/sched/deadline.c:805!

On Mon, Feb 16, 2015 at 03:38:34PM +0300, Kirill Tkhai wrote:
> We shouldn't enqueue migrating tasks. Please, try this one instead ;)

Ha, we should amend that task-rq-lock loop for that. See below.

I've not yet tested; going to try and reconstruct a .config that
triggers the oops.

---
Subject: sched/dl: Prevent enqueue of a sleeping task in dl_task_timer()
From: Kirill Tkhai <tkhai@...dex.ru>
Date: Mon, 16 Feb 2015 15:38:34 +0300

A deadline task may be throttled and dequeued at the same time.
This happens when it becomes throttled in schedule(), which
is called to go to sleep:

current->state = TASK_INTERRUPTIBLE;
schedule()
    deactivate_task()
        dequeue_task_dl()
            update_curr_dl()
                start_dl_timer()
            __dequeue_task_dl()
    prev->on_rq = 0;

Later the timer fires, but the task is still dequeued:

dl_task_timer()
    enqueue_task_dl() /* queues on dl_rq; on_rq remains 0 */

Someone wakes it up:

try_to_wake_up()

    enqueue_dl_entity()
        BUG_ON(on_dl_rq())

The patch fixes this problem by preventing !on_rq tasks from
being queued on dl_rq.

Also teach the rq-lock loop about TASK_ON_RQ_MIGRATING as per
cca26e8009d1 ("sched: Teach scheduler to understand
TASK_ON_RQ_MIGRATING state").

Fixes: 1019a359d3dc ("sched/deadline: Fix stale yield state")
Cc: Ingo Molnar <mingo@...nel.org>
Cc: Juri Lelli <juri.lelli@....com>
Reported-by: Fengguang Wu <fengguang.wu@...el.com>
Signed-off-by: Kirill Tkhai <ktkhai@...allels.com>
[peterz: Wrote comment; fixed task-rq-lock loop]
Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
Link: http://lkml.kernel.org/r/1374601424090314@web4j.yandex.ru
---
 kernel/sched/deadline.c |   25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -515,9 +515,8 @@ static enum hrtimer_restart dl_task_time
 again:
 	rq = task_rq(p);
 	raw_spin_lock(&rq->lock);
-
-	if (rq != task_rq(p)) {
-		/* Task was moved, retrying. */
+	if (rq != task_rq(p) || task_on_rq_migrating(p)) {
+		/* Task was move{d,ing}, retry */
 		raw_spin_unlock(&rq->lock);
 		goto again;
 	}
@@ -541,6 +540,26 @@ static enum hrtimer_restart dl_task_time
 
 	sched_clock_tick();
 	update_rq_clock(rq);
+
+	/*
+	 * If the throttle happened during sched-out; like:
+	 *
+	 *   schedule()
+	 *     deactivate_task()
+	 *       dequeue_task_dl()
+	 *         update_curr_dl()
+	 *           start_dl_timer()
+	 *         __dequeue_task_dl()
+	 *     prev->on_rq = 0;
+	 *
+	 * We can be both throttled and !queued. Replenish the counter
+	 * but do not enqueue -- wait for our wakeup to do that.
+	 */
+	if (!task_on_rq_queued(p)) {
+		replenish_dl_entity(dl_se, dl_se);
+		goto unlock;
+	}
+
 	enqueue_task_dl(rq, p, ENQUEUE_REPLENISH);
 	if (dl_task(rq->curr))
 		check_preempt_curr_dl(rq, p, 0);
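
For readers who want to poke at the sequence from the changelog without
booting a kernel, here is a rough userspace sketch of it. Nothing in it
is kernel code: struct toy_task, its fields and every toy_*() function
are hypothetical names made up for illustration, and the real
replenish/enqueue paths do considerably more. It only models the two
flags involved (on_rq and "is on dl_rq") and shows why an unconditional
enqueue from the timer makes the later wakeup trip, while the
!task_on_rq_queued() check avoids it.

```c
#include <stdbool.h>

/* Toy model only; names mirror the kernel's but are assumptions. */
struct toy_task {
	bool on_rq;	/* stands in for task_on_rq_queued(p) */
	bool on_dl_rq;	/* stands in for on_dl_rq(&p->dl) */
	long runtime;	/* remaining budget */
	long budget;	/* full budget restored on replenish */
};

/* schedule() for a task that goes to sleep with its budget exhausted. */
void toy_sleep_throttled(struct toy_task *t)
{
	t->runtime = 0;		/* update_curr_dl() would arm the timer here */
	t->on_dl_rq = false;	/* __dequeue_task_dl() */
	t->on_rq = false;	/* prev->on_rq = 0 */
}

/* Simplified stand-in for replenish_dl_entity(). */
void toy_replenish(struct toy_task *t)
{
	t->runtime = t->budget;
}

/* Timer handler before the patch: enqueues unconditionally. */
void toy_timer_unpatched(struct toy_task *t)
{
	toy_replenish(t);
	t->on_dl_rq = true;	/* queued on dl_rq; on_rq remains false */
}

/* Timer handler with the patch: replenish, but leave a sleeper alone. */
void toy_timer_patched(struct toy_task *t)
{
	toy_replenish(t);
	if (!t->on_rq)
		return;		/* wait for the wakeup to enqueue */
	t->on_dl_rq = true;
}

/*
 * Wakeup path. Returns false in the situation where the kernel would
 * hit BUG_ON(on_dl_rq()), true if the task was enqueued normally.
 */
bool toy_wake_up(struct toy_task *t)
{
	if (t->on_dl_rq)
		return false;
	t->on_dl_rq = true;
	t->on_rq = true;
	return true;
}
```

Calling toy_sleep_throttled(), then one of the two timer handlers, then
toy_wake_up() on a fresh task reproduces both orderings: with the
unpatched handler the wakeup reports the double enqueue, with the
patched one it succeeds and the budget has already been refilled.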