Message-ID: <AANLkTimH0YN0fd4zOk3A=_Q7waqM=6V4pw4N+0g0-k=7@mail.gmail.com>
Date:	Mon, 6 Dec 2010 15:59:36 +0800
From:	Yong Zhang <yong.zhang0@...il.com>
To:	Mike Galbraith <efault@....de>
Cc:	"Bjoern B. Brandenburg" <bbb.lst@...il.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...e.hu>,
	Andrea Bastoni <bastoni@...g.uniroma2.it>,
	"James H. Anderson" <anderson@...unc.edu>,
	linux-kernel@...r.kernel.org
Subject: Re: Scheduler bug related to rq->skip_clock_update?

On Mon, Dec 6, 2010 at 1:33 PM, Mike Galbraith <efault@....de> wrote:
> On Sun, 2010-12-05 at 13:28 +0800, Yong Zhang wrote:
>
>> When we init the idle task, we don't mark it on_rq.
>> My test shows the concern is addressed by the patch below.
>
> Close :)
>
> The skip_clock_update flag should only be set if rq->curr is on_rq,
> because it is _that_ clock update during dequeue, and the subsequent
> microscopic vruntime update it causes, that we're trying to avoid.
>
> I think the below fixes it up properly.

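For readers outside the thread, a hedged reconstruction of the sequence
described above, using 2.6.37-era scheduler names (an illustration, not
code from either mail):

	/*
	 * Pre-patch schedule(), roughly:
	 *
	 *   raw_spin_lock_irq(&rq->lock);
	 *   clear_tsk_need_resched(prev);     // cleared before the lock gap
	 *   deactivate_task(rq, prev, ...);   // prev->se.on_rq = 0
	 *   if (unlikely(!rq->nr_running))
	 *           idle_balance(cpu, rq);    // drops, then retakes rq->lock
	 *                                     // <-- window: another CPU can
	 *                                     //     set_tsk_need_resched(prev)
	 *   put_prev_task(rq, prev);
	 *   ...                               // prev deschedules with a stale
	 *                                     // TIF_NEED_RESCHED still set
	 *
	 * The next time such a task is rq->curr (the idle task is the
	 * reported case, since init never marks it on_rq),
	 * check_preempt_curr() sees the stale flag and sets
	 * rq->skip_clock_update even though no dequeue-time clock update
	 * is imminent, so a real update_rq_clock() gets skipped.
	 * Testing rq->curr->se.on_rq first closes that hole.
	 */
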
Yep. Now it's running well on my machine.

If you want, you can add my tested-by. :)

Thanks,
Yong

>
> Sched: fix skip_clock_update optimization
>
> idle_balance() drops/retakes rq->lock, leaving the previous task
> vulnerable to set_tsk_need_resched().  Clear it after we return
> from balancing instead, and in setup_thread_stack() as well, so
> no successfully descheduled or never scheduled task has it set.
>
> A stale need_resched confused the skip_clock_update logic, which
> assumes that the next call to update_rq_clock() will come nearly
> immediately after being set.  Make the optimization robust against
> the case where a sleeper is woken before it successfully deschedules
> by checking that the current task has not been dequeued before
> setting the flag, since it is that useless clock update we're trying
> to save, and clear it in update_rq_clock() to ensure that at most
> ONE call may be skipped.
>
> Signed-off-by: Mike Galbraith <efault@....de>
> Cc: Ingo Molnar <mingo@...e.hu>
> Cc: Peter Zijlstra <a.p.zijlstra@...llo.nl>
> Cc: Bjoern B. Brandenburg <bbb.lst@...il.com>
> Reported-by: Bjoern B. Brandenburg <bbb.lst@...il.com>
>
> ---
>  kernel/fork.c  |    1 +
>  kernel/sched.c |    6 +++---
>  2 files changed, 4 insertions(+), 3 deletions(-)
>
> Index: linux-2.6.37.git/kernel/sched.c
> ===================================================================
> --- linux-2.6.37.git.orig/kernel/sched.c
> +++ linux-2.6.37.git/kernel/sched.c
> @@ -660,6 +660,7 @@ inline void update_rq_clock(struct rq *r
>
>                sched_irq_time_avg_update(rq, irq_time);
>        }
> +       rq->skip_clock_update = 0;
>  }
>
>  /*
> @@ -2138,7 +2139,7 @@ static void check_preempt_curr(struct rq
>         * A queue event has occurred, and we're going to schedule.  In
>         * this case, we can save a useless back to back clock update.
>         */
> -       if (test_tsk_need_resched(rq->curr))
> +       if (rq->curr->se.on_rq && test_tsk_need_resched(rq->curr))
>                rq->skip_clock_update = 1;
>  }
>
> @@ -3854,7 +3855,6 @@ static void put_prev_task(struct rq *rq,
>  {
>        if (prev->se.on_rq)
>                update_rq_clock(rq);
> -       rq->skip_clock_update = 0;
>        prev->sched_class->put_prev_task(rq, prev);
>  }
>
> @@ -3912,7 +3912,6 @@ need_resched_nonpreemptible:
>                hrtick_clear(rq);
>
>        raw_spin_lock_irq(&rq->lock);
> -       clear_tsk_need_resched(prev);
>
>        switch_count = &prev->nivcsw;
>        if (prev->state && !(preempt_count() & PREEMPT_ACTIVE)) {
> @@ -3942,6 +3941,7 @@ need_resched_nonpreemptible:
>        if (unlikely(!rq->nr_running))
>                idle_balance(cpu, rq);
>
> +       clear_tsk_need_resched(prev);
>        put_prev_task(rq, prev);
>        next = pick_next_task(rq);
>
> Index: linux-2.6.37.git/kernel/fork.c
> ===================================================================
> --- linux-2.6.37.git.orig/kernel/fork.c
> +++ linux-2.6.37.git/kernel/fork.c
> @@ -275,6 +275,7 @@ static struct task_struct *dup_task_stru
>
>        setup_thread_stack(tsk, orig);
>        clear_user_return_notifier(tsk);
> +       clear_tsk_need_resched(tsk);
>        stackend = end_of_stack(tsk);
>        *stackend = STACK_END_MAGIC;    /* for overflow detection */
>
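Taken together, the hunks bound the optimization at its source; a hedged
sketch of the post-patch update_rq_clock() (illustration only, not part
of the patch):

	inline void update_rq_clock(struct rq *rq)
	{
		if (!rq->skip_clock_update) {
			/* ... advance rq->clock and rq->clock_task ... */
		}
		/*
		 * Cleared unconditionally, so a set flag -- even one
		 * left behind in error -- skips at most ONE update.
		 */
		rq->skip_clock_update = 0;
	}

With clear_tsk_need_resched(prev) moved to after idle_balance() and the
flag cleared for freshly copied tasks in dup_task_struct(), no
descheduled or newly forked task should carry a stale TIF_NEED_RESCHED
into check_preempt_curr().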