Message-Id: <1282265478.10905.1.camel@sbsiddha-MOBL3.sc.intel.com>
Date: Thu, 19 Aug 2010 17:51:18 -0700
From: Suresh Siddha <suresh.b.siddha@...el.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: "mingo@...e.hu" <mingo@...e.hu>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"chris@...stnet.net" <chris@...stnet.net>,
"debian00@...ceadsl.fr" <debian00@...ceadsl.fr>,
"hpa@...or.com" <hpa@...or.com>,
"jonathan.protzenko@...il.com" <jonathan.protzenko@...il.com>,
"mans@...sr.com" <mans@...sr.com>,
"psastudio@...l.ru" <psastudio@...l.ru>,
"rjw@...k.pl" <rjw@...k.pl>,
"stephan.eicher@....de" <stephan.eicher@....de>,
"sxxe@....de" <sxxe@....de>,
"thomas@...hlinux.org" <thomas@...hlinux.org>,
"venki@...gle.com" <venki@...gle.com>,
"wonghow@...il.com" <wonghow@...il.com>
Subject: Re: [patch 3/3] sched: move sched_avg_update() to update_cpu_load()
On Mon, 2010-08-16 at 12:31 -0700, Peter Zijlstra wrote:
> On Mon, 2010-08-16 at 10:46 -0700, Suresh Siddha wrote:
>
> > There is no guarantee that the original cpu won't be doing this in
> > parallel with the nohz idle load balancing cpu.
>
> Hmm, true.. bugger.
>
> > > > Fix it by moving the sched_avg_update() to the more appropriate
> > > > update_cpu_load(), where the CFS load gets updated as well.
> > >
> > > Right, except it breaks things a bit, at the very least you really need
> > > that update right before reading it, otherwise you can end up with >100%
> > > fractions, which are odd indeed ;-)
> >
> > with the patch, the update always happens before reading it, doesn't it?
> >
> > update now happens during the scheduler tick (or during nohz load
> > balancing tick). And the load balancer gets triggered with the tick.
> > So the update (at the tick) should happen before reading it (used by
> > load balancing triggered by the tick). Am I missing something?
>
> We run the load-balancer in softirq context; on -rt that's a task, and
> we could have run other (more important) RT tasks between the hardirq
> and the softirq running, which would increase the rt_avg and could thus
> result in >100%.
>
> But I think we can simply retain the sched_avg_update(rq) in
> sched_rt_avg_update(), which runs with rq->lock held and should be
> enough to avoid that case.
>
> We can retain the other bit of your patch, moving sched_avg_update() from
> scale_rt_power() to update_cpu_load(), since that is only concerned with
> lowering the average when there is no actual activity.
Ok. Updated patch appended. Thanks.
---
From: Suresh Siddha <suresh.b.siddha@...el.com>
Subject: sched: move sched_avg_update() to update_cpu_load()
Currently sched_avg_update() (which updates the rt_avg stats in the rq) is
called from scale_rt_power() (in the load-balance context), which doesn't
take rq->lock.

Fix it by moving the sched_avg_update() to the more appropriate
update_cpu_load(), where the CFS load gets updated as well.
Signed-off-by: Suresh Siddha <suresh.b.siddha@...el.com>
---
kernel/sched.c | 3 ++-
kernel/sched_fair.c | 2 --
2 files changed, 2 insertions(+), 3 deletions(-)
Index: tree/kernel/sched_fair.c
===================================================================
--- tree.orig/kernel/sched_fair.c
+++ tree/kernel/sched_fair.c
@@ -2268,8 +2268,6 @@ unsigned long scale_rt_power(int cpu)
 	struct rq *rq = cpu_rq(cpu);
 	u64 total, available;
 
-	sched_avg_update(rq);
-
 	total = sched_avg_period() + (rq->clock - rq->age_stamp);
 	available = total - rq->rt_avg;
Index: tree/kernel/sched.c
===================================================================
--- tree.orig/kernel/sched.c
+++ tree/kernel/sched.c
@@ -3182,6 +3182,8 @@ static void update_cpu_load(struct rq *t
 		this_rq->cpu_load[i] = (old_load * (scale - 1) + new_load) >> i;
 	}
+
+	sched_avg_update(this_rq);
 }
 
 static void update_cpu_load_active(struct rq *this_rq)
--