Date:	Thu, 02 Jun 2011 17:48:31 +0200
From:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
To:	Yong Zhang <yong.zhang0@...il.com>
Cc:	Borislav Petkov <bp@...64.org>, Borislav Petkov <bp@...en8.de>,
	"mingo@...hat.com" <mingo@...hat.com>,
	"hpa@...or.com" <hpa@...or.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"markus@...ppelsdorf.de" <markus@...ppelsdorf.de>,
	"tglx@...utronix.de" <tglx@...utronix.de>,
	"mingo@...e.hu" <mingo@...e.hu>,
	"linux-tip-commits@...r.kernel.org" 
	<linux-tip-commits@...r.kernel.org>
Subject: Re: [tip:sched/urgent] sched: Fix cross-cpu clock sync on remote
 wakeups

On Thu, 2011-06-02 at 22:23 +0800, Yong Zhang wrote:
> On Thu, Jun 02, 2011 at 03:04:26PM +0200, Peter Zijlstra wrote:
> > On Thu, 2011-06-02 at 15:52 +0800, Yong Zhang wrote:
> > > In sched_clock_local(), clock is calculated around ->tick_gtod even if
> > > that ->tick_gtod has been stale for a long time because we stay in idle
> > > state. You know ->tick_gtod is only updated in sched_clock_tick();
> > 
> > (well, no, there's idle callbacks as you said below)
> > 
> > > IOW, when a cpu goes out of idle, sched_clock_tick() is called from
> > > tick_nohz_stop_idle(), which is later than the interrupt.
> > 
> > > Gah, that would be awful and mean wakeups from interrupts were already
> > borken. /me goes look at code.
> > 
> > irq_enter() -> tick_check_idle() -> tick_check_nohz() ->
> > tick_nohz_stop_idle() -> sched_clock_idle_wakeup_event()
> > 
> > should update the thing before we run any isrs, right?
> 
> Hmmm, you are right.
> 
> But smp_reschedule_interrupt() doesn't call irq_enter()/irq_exit(),
> is that correct?

Crap.. you're right. And I bet other archs don't do that either. With
NO_HZ you really need irq_enter() for pretty much all interrupts, so I
was assuming the resched IPI had it, but it's been special and never
really needed it. If it woke an idle cpu, the idle loop exit would
deal with it; if it interrupted userspace, the thing was running and
NO_HZ wasn't relevant.

Damn. 
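
To make the failure mode concrete, here is a toy user-space model of it.
This is only a sketch, not kernel code: every toy_* name is hypothetical,
and the only thing taken from the thread is the idea that ->tick_gtod gets
refreshed on the irq_enter() path and stays stale when a handler skips it.

```c
#include <stdint.h>

/* Toy stand-in for the per-cpu sched_clock data; hypothetical layout. */
struct toy_cpu_clock {
	uint64_t tick_gtod;	/* time base last sampled from the global clock */
	int idle_active;	/* nonzero while the cpu sits in a NO_HZ idle period */
};

/*
 * Models irq_enter() -> ... -> sched_clock_idle_wakeup_event():
 * when leaving idle, refresh the time base before any handler runs.
 */
static void toy_irq_enter(struct toy_cpu_clock *c, uint64_t now)
{
	if (c->idle_active) {
		c->tick_gtod = now;
		c->idle_active = 0;
	}
}

/* A handler that goes through irq_enter() sees a fresh base. */
static uint64_t toy_device_irq(struct toy_cpu_clock *c, uint64_t now)
{
	toy_irq_enter(c, now);
	return c->tick_gtod;
}

/* A handler that skips irq_enter(), like the resched IPI discussed above. */
static uint64_t toy_resched_ipi(const struct toy_cpu_clock *c, uint64_t now)
{
	(void)now;		/* nothing on this path updates the base */
	return c->tick_gtod;
}

/* How far behind "now" the base is, as seen by each kind of handler. */
static uint64_t toy_staleness(int via_irq_enter)
{
	struct toy_cpu_clock c = { .tick_gtod = 1000, .idle_active = 1 };
	const uint64_t now = 1000 + 5000000;	/* woken after a long idle period */
	uint64_t seen = via_irq_enter ? toy_device_irq(&c, now)
				      : toy_resched_ipi(&c, now);

	return now - seen;
}
```

In this model toy_staleness(1) is zero, while toy_staleness(0) equals the
whole idle period, which is the error a remote wakeup would then work with.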

And yes, the only reason I didn't see this on my dev box was because we
do indeed set that sched_clock_stable thing on wsm (Westmere). And I never
noticed on my desktop because firefox/X/etc. consuming heaps of CPU isn't
weird at all.

Adding it to all resched int handlers is of course a possibility but
would slow down the thing, although with the new code, most users are
now indeed wakeups (excepting weird and wonderful users like KVM).

We could of course add it in sched.c since the irq_enter()/irq_exit()
logic recurses just fine.. it's not pretty though.. :/

Thoughts?

---
 kernel/sched.c |   18 +++++++++++++++++-
 1 files changed, 17 insertions(+), 1 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 2fe98ed..365ed6b 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2554,7 +2554,23 @@ static void sched_ttwu_pending(void)
 
 void scheduler_ipi(void)
 {
-	sched_ttwu_pending();
+	struct rq *rq = this_rq();
+	struct task_struct *list = xchg(&rq->wake_list, NULL);
+
+	if (!list)
+		return;
+
+	irq_enter();
+	raw_spin_lock(&rq->lock);
+
+	while (list) {
+		struct task_struct *p = list;
+		list = list->wake_entry;
+		ttwu_do_activate(rq, p, 0);
+	}
+
+	raw_spin_unlock(&rq->lock);
+	irq_exit();
 }
 
 static void ttwu_queue_remote(struct task_struct *p, int cpu)
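
The list handling in the patch above (grab the whole wake_list with one
xchg(), return early if it is empty, then walk it) can be tried out in user
space with C11 atomics. The following is a minimal sketch under stated
assumptions: the toy_* types and functions are hypothetical stand-ins, the
push side is not part of the patch and is assumed here to be a
compare-and-swap loop, and the irq_enter()/irq_exit() and rq->lock
bracketing is only marked by a comment.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's task_struct or rq. */
struct toy_task {
	int id;
	struct toy_task *wake_entry;	/* next pending task, as in the patch */
};

struct toy_rq {
	_Atomic(struct toy_task *) wake_list;	/* head of the pending list */
};

/* Remote side (assumed): push one task with a compare-and-swap loop. */
static void toy_queue_remote(struct toy_rq *rq, struct toy_task *p)
{
	struct toy_task *old = atomic_load(&rq->wake_list);

	do {
		p->wake_entry = old;	/* old is reloaded when the CAS fails */
	} while (!atomic_compare_exchange_weak(&rq->wake_list, &old, p));
}

/* Local side: detach the whole list at once, then count what was on it. */
static int toy_scheduler_ipi(struct toy_rq *rq)
{
	struct toy_task *list =
		atomic_exchange(&rq->wake_list, (struct toy_task *)NULL);
	int n = 0;

	if (!list)
		return 0;	/* nothing pending: skip the expensive part */

	/* The patch wraps this loop in irq_enter()/irq_exit() and rq->lock. */
	while (list) {
		struct toy_task *p = list;

		list = list->wake_entry;	/* read next before handing p off */
		(void)p;			/* the patch activates p here */
		n++;
	}
	return n;
}

/* Usage: queue n tasks (at most 8), drain twice, return the total drained. */
static int toy_demo_drain(int n)
{
	static struct toy_task tasks[8];
	struct toy_rq rq;
	int i, first;

	atomic_init(&rq.wake_list, (struct toy_task *)NULL);
	if (n > 8)
		n = 8;
	for (i = 0; i < n; i++) {
		tasks[i].id = i;
		toy_queue_remote(&rq, &tasks[i]);
	}
	first = toy_scheduler_ipi(&rq);
	/* The second call finds an empty list and takes the early return. */
	return first + toy_scheduler_ipi(&rq);
}
```

The early return is what keeps the common "no pending wakeups" case cheap:
the irq_enter()/irq_exit() cost is only paid when there is a list to walk.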

