Message-Id: <1256654138.17752.7.camel@marge.simson.net>
Date: Tue, 27 Oct 2009 15:35:38 +0100
From: Mike Galbraith <efault@....de>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Arjan van de Ven <arjan@...radead.org>, mingo@...e.hu,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/3] sched: Disable affine wakeups by default
On Mon, 2009-10-26 at 02:53 +0100, Peter Zijlstra wrote:
> On Sun, 2009-10-25 at 23:04 +0100, Mike Galbraith wrote:
> > if (want_affine && (tmp->flags & SD_WAKE_AFFINE) &&
> > - cpumask_test_cpu(prev_cpu, sched_domain_span(tmp))) {
> > + (level == SD_LV_SIBLING || level == SD_LV_MC)) {
>
> quick comment without actually having looked at the patch, we should
> really get rid of sd->level and encode properties of the sched domains
> in sd->flags.
I used SD_PREFER_SIBLING in the patch below. Did I break anything?
(I wonder what it does for pgsql+oltp on a beefy box with siblings.)
tip v2.6.32-rc5-1724-g77a088c
mysql+oltp
clients          1         2         4         8        16        32        64       128       256
tip        9999.77  18472.11  34931.60  34412.09  33006.76  32104.36  30700.47  28111.31  25535.09
          10082.75  18625.12  34928.17  34476.91  33088.70  32002.36  30695.77  28173.94  25551.05
           9949.05  18466.54  34942.66  34420.74  33092.45  32041.10  30666.43  28090.90  25467.63
tip avg   10010.52  18521.25  34934.14  34436.58  33062.63  32049.27  30687.55  28125.38  25517.92
tip+       9622.23  18297.65  34496.12  34230.85  32704.20  31796.54  30480.45  27740.20  25394.12
          10207.79  18275.83  34622.39  34222.47  32996.69  31936.48  30551.29  28144.48  25616.62
          10225.32  18515.02  34538.41  34278.06  33014.14  31965.31  30363.90  28089.41  25531.81
tip+ avg  10018.44  18362.83  34552.30  34243.79  32905.01  31899.44  30465.21  27991.36  25514.18
vs tip       1.000      .991      .989      .994      .995      .995      .992      .995      .999
pgsql+oltp
clients          1         2         4         8        16        32        64       128       256
tip       13945.42  26973.91  52504.18  52613.32  51310.82  50442.61  49826.52  48760.62  45570.45
          13921.41  27021.48  52722.64  52565.16  51483.19  50638.83  49499.51  48621.31  46115.77
          13924.94  26961.02  52624.45  52365.49  51384.91  50499.44  49622.83  48065.03  45743.14
tip avg   13930.59  26985.47  52617.09  52514.65  51392.97  50526.96  49649.62  48482.32  45809.78
tip+      15259.79  29162.31  52609.01  52562.16  51578.48  50631.90  49537.41  48376.23  46058.95
          15156.54  29114.10  52760.02  52524.86  51412.94  50656.30  48774.34  47968.77  45905.02
          15118.64  29190.73  52929.34  52503.58  51574.34  50232.27  49599.15  48283.42  45766.74
tip+ avg  15178.32  29155.71  52766.12  52530.20  51521.92  50506.82  49303.63  48209.47  45910.23
vs tip       1.089     1.080     1.002     1.000     1.002      .999      .993      .994     1.002
sched: check for an idle shared cache in select_task_rq_fair()
When waking affine, check for an idle shared cache, and if found, wake to
that CPU/sibling instead of the waker's CPU. This improves pgsql+oltp
ramp up by roughly 8%. Possibly more for other loads, depending on overlap.
The trade-off is a roughly 1% peak downturn if tasks are truly synchronous.
Signed-off-by: Mike Galbraith <efault@....de>
Cc: Ingo Molnar <mingo@...e.hu>
Cc: Peter Zijlstra <a.p.zijlstra@...llo.nl>
LKML-Reference: <new-submission>
---
kernel/sched_fair.c | 33 +++++++++++++++++++++++++++++----
1 file changed, 29 insertions(+), 4 deletions(-)
Index: linux-2.6/kernel/sched_fair.c
===================================================================
--- linux-2.6.orig/kernel/sched_fair.c
+++ linux-2.6/kernel/sched_fair.c
@@ -1398,11 +1398,36 @@ static int select_task_rq_fair(struct ta
 				want_sd = 0;
 		}
-		if (want_affine && (tmp->flags & SD_WAKE_AFFINE) &&
-		    cpumask_test_cpu(prev_cpu, sched_domain_span(tmp))) {
+		if (want_affine && (tmp->flags & SD_WAKE_AFFINE)) {
+			int candidate = -1, i;
-			affine_sd = tmp;
-			want_affine = 0;
+			if (cpumask_test_cpu(prev_cpu, sched_domain_span(tmp)))
+				candidate = cpu;
+
+			/*
+			 * Check for an idle shared cache.
+			 */
+			if (tmp->flags & SD_PREFER_SIBLING) {
+				if (candidate == cpu) {
+					if (!cpu_rq(prev_cpu)->cfs.nr_running)
+						candidate = prev_cpu;
+				}
+
+				if (candidate == -1 || candidate == cpu) {
+					for_each_cpu(i, sched_domain_span(tmp)) {
+						if (!cpu_rq(i)->cfs.nr_running) {
+							candidate = i;
+							break;
+						}
+					}
+				}
+			}
+
+			if (candidate >= 0) {
+				affine_sd = tmp;
+				want_affine = 0;
+				cpu = candidate;
+			}
 		}
 		if (!want_sd && !want_affine)