Message-ID: <20070812155242.GA1977@elte.hu>
Date: Sun, 12 Aug 2007 17:52:44 +0200
From: Ingo Molnar <mingo@...e.hu>
To: Al Boldi <a1426z@...ab.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
Mike Galbraith <efault@....de>,
Roman Zippel <zippel@...ux-m68k.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel@...r.kernel.org
Subject: Re: CFS review

* Al Boldi <a1426z@...ab.com> wrote:
> > so could you please re-check chew jitter behavior with the latest
> > kernel? (i've attached the standalone patch below, it will apply
> > cleanly to rc2 too.)
>
> That fixes it, but by reducing granularity ctx is up 4-fold.
ok, great! (the context-switch rate is obviously up.)
> Mind you, it does have an enormous effect on responsiveness, as
> negative nice with small granularity can't hijack the system any more.
ok. i'm glad you like the result :-) This makes reniced X (or any
reniced app) more usable.
> The thing is, this unpredictability seems to exist even at nice level
> 0, but the smaller granularity covers it all up. It occasionally
> exhibits itself as hick-ups during transient heavy workload flux. But
> it's not easily reproducible.
In general, 'hiccups' can be due to many, many reasons. If a task indeed
got delayed by scheduling jitter, that is provable (even if the behavior
is hard to reproduce) by enabling CONFIG_SCHED_DEBUG=y and
CONFIG_SCHEDSTATS=y in your kernel. First clear all the stats:

  for N in /proc/*/task/*/sched; do echo 0 > $N; done
then wait for the 'hiccup' to happen, and once it happens capture the
system state via this script:

  http://people.redhat.com/mingo/cfs-scheduler/tools/cfs-debug-info.sh
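
to quickly spot which task accumulated the largest scheduling delay,
something like this can help (a rough sketch, assuming CONFIG_SCHEDSTATS
exposes an se.wait_max field in /proc/<pid>/sched; the exact field names
vary between kernel versions):

  grep -H 'se.wait_max' /proc/*/task/*/sched 2>/dev/null | \
          sort -t: -k3 -rn | head

this lists the tasks with the highest maximum wait time observed since
the stats were cleared.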
Then tell me which specific task exhibited the 'hiccup' and send me the
debug output. Also, could you try the patch below as well? Thanks,
Ingo
-------------------------------->
Subject: sched: fix sleeper bonus
From: Ingo Molnar <mingo@...e.hu>
Peter Zijlstra noticed that the sleeper bonus deduction code
was not properly rate-limited: a task that scheduled more
frequently would get a disproportionately large deduction.
So limit the deduction to delta_exec.
Signed-off-by: Ingo Molnar <mingo@...e.hu>
---
kernel/sched_fair.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
Index: linux/kernel/sched_fair.c
===================================================================
--- linux.orig/kernel/sched_fair.c
+++ linux/kernel/sched_fair.c
@@ -75,7 +75,7 @@ enum {
 
 unsigned int sysctl_sched_features __read_mostly =
                SCHED_FEAT_FAIR_SLEEPERS        *1 |
-               SCHED_FEAT_SLEEPER_AVG          *1 |
+               SCHED_FEAT_SLEEPER_AVG          *0 |
                SCHED_FEAT_SLEEPER_LOAD_AVG     *1 |
                SCHED_FEAT_PRECISE_CPU_LOAD     *1 |
                SCHED_FEAT_START_DEBIT          *1 |
@@ -304,11 +304,9 @@ __update_curr(struct cfs_rq *cfs_rq, str
        delta_mine = calc_delta_mine(delta_exec, curr->load.weight, lw);
 
        if (cfs_rq->sleeper_bonus > sysctl_sched_granularity) {
-               delta = calc_delta_mine(cfs_rq->sleeper_bonus,
-                                       curr->load.weight, lw);
-               if (unlikely(delta > cfs_rq->sleeper_bonus))
-                       delta = cfs_rq->sleeper_bonus;
-
+               delta = min(cfs_rq->sleeper_bonus, (u64)delta_exec);
+               delta = calc_delta_mine(delta, curr->load.weight, lw);
+               delta = min((u64)delta, cfs_rq->sleeper_bonus);
                cfs_rq->sleeper_bonus -= delta;
                delta_mine -= delta;
        }
@@ -521,6 +519,8 @@ static void __enqueue_sleeper(struct cfs
         * Track the amount of bonus we've given to sleepers:
         */
        cfs_rq->sleeper_bonus += delta_fair;
+       if (unlikely(cfs_rq->sleeper_bonus > sysctl_sched_runtime_limit))
+               cfs_rq->sleeper_bonus = sysctl_sched_runtime_limit;
 
        schedstat_add(cfs_rq, wait_runtime, se->wait_runtime);
 }
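
for illustration only, not part of the patch: a minimal user-space
sketch of the old vs. new deduction logic. calc_delta_mine() is reduced
here to a plain weight scaling and the numbers are made up; the real
kernel function is more involved:

  #include <stdio.h>
  #include <stdint.h>

  typedef uint64_t u64;

  /* simplified stand-in for the kernel's calc_delta_mine(): scale
   * 'delta' by this task's weight relative to the total queue weight */
  static u64 calc_delta_mine(u64 delta, unsigned long weight,
                             unsigned long total_weight)
  {
          return delta * weight / total_weight;
  }

  static u64 min_u64(u64 a, u64 b) { return a < b ? a : b; }

  int main(void)
  {
          u64 sleeper_bonus = 20000000ULL; /* 20 msec accumulated bonus */
          u64 delta_exec    =  1000000ULL; /* the task ran only 1 msec  */
          unsigned long weight = 1024, total_weight = 2048;

          /* old code: the deduction scales with the whole bonus, so a
           * task that schedules frequently pays a big chunk each time */
          u64 old = calc_delta_mine(sleeper_bonus, weight, total_weight);
          old = min_u64(old, sleeper_bonus);

          /* new code: the deduction is limited to delta_exec up front */
          u64 new = min_u64(sleeper_bonus, delta_exec);
          new = calc_delta_mine(new, weight, total_weight);
          new = min_u64(new, sleeper_bonus);

          printf("old deduction: %llu nsec\n", (unsigned long long)old);
          printf("new deduction: %llu nsec\n", (unsigned long long)new);
          return 0;
  }

with these numbers the old code deducts 10 msec worth of bonus for
1 msec of execution, while the new code deducts only 0.5 msec.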