linux-kernel - [patchlet] Re: Scheduler bug related to rq->skip_clock


Message-Id: <1291624329.9457.38.camel@marge.simson.net>
Date:	Mon, 06 Dec 2010 09:32:09 +0100
From:	Mike Galbraith <efault@....de>
To:	Yong Zhang <yong.zhang0@...il.com>
Cc:	"Bjoern B. Brandenburg" <bbb.lst@...il.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...e.hu>,
	Andrea Bastoni <bastoni@...g.uniroma2.it>,
	"James H. Anderson" <anderson@...unc.edu>,
	linux-kernel@...r.kernel.org
Subject: [patchlet] Re: Scheduler bug related to rq->skip_clock_update?

On Mon, 2010-12-06 at 15:59 +0800, Yong Zhang wrote:
> On Mon, Dec 6, 2010 at 1:33 PM, Mike Galbraith <efault@....de> wrote:
> > On Sun, 2010-12-05 at 13:28 +0800, Yong Zhang wrote:
> >
> >> When we init the idle task, we don't mark it on_rq.
> >> My test shows the concern is smoothed out by the patch below.
> >
> > Close :)
> >
> > The skip_clock_update flag should only be set if rq->curr is on_rq,
> > because it is _that_ clock update during dequeue, and the subsequent
> > microscopic vruntime update it causes, that we're trying to avoid.
> >
> > I think the below fixes it up properly.
> 
> Yep. Now it's running well on my machine.
> 
> If you want, you can add my tested-by. :)

Done.

Sched: fix skip_clock_update optimization

idle_balance() drops/retakes rq->lock, leaving the previous task
vulnerable to set_tsk_need_resched().  Clear need_resched after we
return from balancing instead, and in setup_thread_stack() as well, so
that no successfully descheduled or never scheduled task has it set.
 
need_resched confused the skip_clock_update logic, which assumes
that the next call to update_rq_clock() will come nearly immediately
after the flag is set.  Make the optimization robust against the case
of waking a sleeper before it successfully deschedules by checking that
the current task has not been dequeued before setting the flag,
since it is that useless clock update we're trying to save, and
clear the flag in update_rq_clock() to ensure that only ONE call may
be skipped.

Signed-off-by: Mike Galbraith <efault@....de>
Cc: Ingo Molnar <mingo@...e.hu>
Cc: Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc: Bjoern B. Brandenburg <bbb.lst@...il.com>
Reported-by: Bjoern B. Brandenburg <bbb.lst@...il.com>
Tested-by: Yong Zhang <yong.zhang0@...il.com>

---
 kernel/fork.c  |    1 +
 kernel/sched.c |    6 +++---
 2 files changed, 4 insertions(+), 3 deletions(-)
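
Not part of the patch: as a rough illustration of the changelog above,
here is a minimal user-space sketch of the flag handling after this
change.  The model_* names and both structs are hypothetical stand-ins,
not kernel code, and the sketch assumes that update_rq_clock() skips its
work only while skip_clock_update is set.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the bits of task_struct used here. */
struct model_task {
	bool on_rq;		/* models se.on_rq */
	bool need_resched;	/* models TIF_NEED_RESCHED */
};

/* Hypothetical stand-in for the bits of struct rq used here. */
struct model_rq {
	struct model_task *curr;
	bool skip_clock_update;
	unsigned long clock;		/* last value the clock was set to */
	unsigned long clock_updates;	/* number of updates actually done */
};

/*
 * Models update_rq_clock(): the update is skipped while the flag is
 * set, and the flag is always cleared on the way out, so at most ONE
 * call can be skipped per time the flag is set.
 */
void model_update_rq_clock(struct model_rq *rq, unsigned long now)
{
	if (!rq->skip_clock_update) {
		rq->clock = now;
		rq->clock_updates++;
	}
	rq->skip_clock_update = false;
}

/*
 * Models the tail of check_preempt_curr(): only arm the skip if the
 * current task is still queued, because the update being saved is the
 * one that would happen when that task is dequeued or put.
 */
void model_check_preempt_curr(struct model_rq *rq)
{
	if (rq->curr->on_rq && rq->curr->need_resched)
		rq->skip_clock_update = true;
}
```

With a current task that has already been dequeued,
model_check_preempt_curr() leaves the flag clear.  Once the flag is set,
exactly one model_update_rq_clock() call is skipped, and the call after
that updates the clock again.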

Index: linux-2.6.37.git/kernel/sched.c
===================================================================
--- linux-2.6.37.git.orig/kernel/sched.c
+++ linux-2.6.37.git/kernel/sched.c
@@ -660,6 +660,7 @@ inline void update_rq_clock(struct rq *r
 
 		sched_irq_time_avg_update(rq, irq_time);
 	}
+	rq->skip_clock_update = 0;
 }
 
 /*
@@ -2138,7 +2139,7 @@ static void check_preempt_curr(struct rq
 	 * A queue event has occurred, and we're going to schedule.  In
 	 * this case, we can save a useless back to back clock update.
 	 */
-	if (test_tsk_need_resched(rq->curr))
+	if (rq->curr->se.on_rq && test_tsk_need_resched(rq->curr))
 		rq->skip_clock_update = 1;
 }
 
@@ -3854,7 +3855,6 @@ static void put_prev_task(struct rq *rq,
 {
 	if (prev->se.on_rq)
 		update_rq_clock(rq);
-	rq->skip_clock_update = 0;
 	prev->sched_class->put_prev_task(rq, prev);
 }
 
@@ -3912,7 +3912,6 @@ need_resched_nonpreemptible:
 		hrtick_clear(rq);
 
 	raw_spin_lock_irq(&rq->lock);
-	clear_tsk_need_resched(prev);
 
 	switch_count = &prev->nivcsw;
 	if (prev->state && !(preempt_count() & PREEMPT_ACTIVE)) {
@@ -3942,6 +3941,7 @@ need_resched_nonpreemptible:
 	if (unlikely(!rq->nr_running))
 		idle_balance(cpu, rq);
 
+	clear_tsk_need_resched(prev);
 	put_prev_task(rq, prev);
 	next = pick_next_task(rq);
 
Index: linux-2.6.37.git/kernel/fork.c
===================================================================
--- linux-2.6.37.git.orig/kernel/fork.c
+++ linux-2.6.37.git/kernel/fork.c
@@ -275,6 +275,7 @@ static struct task_struct *dup_task_stru
 
 	setup_thread_stack(tsk, orig);
 	clear_user_return_notifier(tsk);
+	clear_tsk_need_resched(tsk);
 	stackend = end_of_stack(tsk);
 	*stackend = STACK_END_MAGIC;	/* for overflow detection */
 


