Message-Id: <1282265478.10905.1.camel@sbsiddha-MOBL3.sc.intel.com>
Date: Thu, 19 Aug 2010 17:51:18 -0700
From: Suresh Siddha <suresh.b.siddha@...el.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: "mingo@...e.hu" <mingo@...e.hu>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"chris@...stnet.net" <chris@...stnet.net>,
"debian00@...ceadsl.fr" <debian00@...ceadsl.fr>,
"hpa@...or.com" <hpa@...or.com>,
"jonathan.protzenko@...il.com" <jonathan.protzenko@...il.com>,
"mans@...sr.com" <mans@...sr.com>,
"psastudio@...l.ru" <psastudio@...l.ru>,
"rjw@...k.pl" <rjw@...k.pl>,
"stephan.eicher@....de" <stephan.eicher@....de>,
"sxxe@....de" <sxxe@....de>,
"thomas@...hlinux.org" <thomas@...hlinux.org>,
"venki@...gle.com" <venki@...gle.com>,
"wonghow@...il.com" <wonghow@...il.com>
Subject: Re: [patch 3/3] sched: move sched_avg_update() to update_cpu_load()
On Mon, 2010-08-16 at 12:31 -0700, Peter Zijlstra wrote:
> On Mon, 2010-08-16 at 10:46 -0700, Suresh Siddha wrote:
>
> > There is no guarantee that the original cpu won't be doing this in
> > parallel with the nohz idle load balancing cpu.
>
> Hmm, true.. bugger.
>
> > > > Fix it by moving the sched_avg_update() to the more appropriate
> > > > update_cpu_load(), where the CFS load gets updated as well.
> > >
> > > Right, except it breaks things a bit, at the very least you really need
> > > that update right before reading it, otherwise you can end up with >100%
> > > fractions, which are odd indeed ;-)
> >
> > with the patch, the update always happens before reading it, doesn't it?
> >
> > update now happens during the scheduler tick (or during nohz load
> > balancing tick). And the load balancer gets triggered with the tick.
> > So the update (at the tick) should happen before reading it (used by
> > load balancing triggered by the tick). Am I missing something?
>
> We run the load-balancer in softirq context; on -rt that's a task, and
> we could have run other (more important) RT tasks between the hardirq
> and the softirq running, which would increase the rt_avg and could thus
> result in >100%.
>
> But I think we can simply retain the sched_avg_update(rq) in
> sched_rt_avg_update(), which runs with rq->lock held and should be
> enough to avoid that case.
>
> We can retain the other bit of your patch, moving sched_avg_update() from
> scale_rt_power() to update_cpu_load(), since that is only concerned with
> lowering the average when there is no actual activity.
Ok. Updated patch appended. Thanks.
---
From: Suresh Siddha <suresh.b.siddha@...el.com>
Subject: sched: move sched_avg_update() to update_cpu_load()
Currently sched_avg_update() (which updates the rt_avg stats in the rq) is
called from scale_rt_power() (in the load-balance context), which doesn't
take rq->lock.

Fix it by moving the sched_avg_update() to the more appropriate
update_cpu_load(), where the CFS load gets updated as well.
Signed-off-by: Suresh Siddha <suresh.b.siddha@...el.com>
---
kernel/sched.c | 3 ++-
kernel/sched_fair.c | 2 --
2 files changed, 2 insertions(+), 3 deletions(-)
Index: tree/kernel/sched_fair.c
===================================================================
--- tree.orig/kernel/sched_fair.c
+++ tree/kernel/sched_fair.c
@@ -2268,8 +2268,6 @@ unsigned long scale_rt_power(int cpu)
 	struct rq *rq = cpu_rq(cpu);
 	u64 total, available;
 
-	sched_avg_update(rq);
-
 	total = sched_avg_period() + (rq->clock - rq->age_stamp);
 	available = total - rq->rt_avg;
Index: tree/kernel/sched.c
===================================================================
--- tree.orig/kernel/sched.c
+++ tree/kernel/sched.c
@@ -3182,6 +3182,8 @@ static void update_cpu_load(struct rq *t
 		this_rq->cpu_load[i] = (old_load * (scale - 1) + new_load) >> i;
 	}
+
+	sched_avg_update(this_rq);
 }
 
 static void update_cpu_load_active(struct rq *this_rq)
--