Message-Id: <1262592958.22471.104.camel@minggr.sh.intel.com>
Date: Mon, 04 Jan 2010 16:15:58 +0800
From: Lin Ming <ming.m.lin@...el.com>
To: Mike Galbraith <efault@....de>,
Peter Zijlstra <peterz@...radead.org>
Cc: lkml <linux-kernel@...r.kernel.org>,
"Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>
Subject: volano ~30% regression with 2.6.33-rc1 & -rc2
Mike & Peter,
Compared with 2.6.32, volano shows a ~30% regression with 2.6.33-rc1 & -rc2.
Test machine: Tigerton Xeon, 16 CPUs (4P/4Core), 16G memory
Bisection points to the commit below:
commit a1f84a3ab8e002159498814eaa7e48c33752b04b
Author: Mike Galbraith <efault@....de>
Date:   Tue Oct 27 15:35:38 2009 +0100

    sched: Check for an idle shared cache in select_task_rq_fair()

    When waking affine, check for an idle shared cache, and if
    found, wake to that CPU/sibling instead of the waker's CPU.

    This improves pgsql+oltp ramp up by roughly 8%. Possibly more
    for other loads, depending on overlap. The trade-off is a
    roughly 1% peak downturn if tasks are truly synchronous.

    Signed-off-by: Mike Galbraith <efault@....de>
    Cc: Arjan van de Ven <arjan@...radead.org>
    Cc: Peter Zijlstra <peterz@...radead.org>
    Cc: <stable@...nel.org>
    LKML-Reference: <1256654138.17752.7.camel@...ge.simson.net>
    Signed-off-by: Ingo Molnar <mingo@...e.hu>
This commit cannot be reverted on its own because of conflicts, so I
reverted the 4 commits below, all related to idle-shared-cache, in
2.6.33-rc2. With them reverted, performance returned to the 2.6.32 level.
fe3bcfe (sched: More generic WAKE_AFFINE vs select_idle_sibling())
a50bde5 (sched: Cleanup select_task_rq_fair())
fd21073 (sched: Fix affinity logic in select_task_rq_fair())
a1f84a3 (sched: Check for an idle shared cache in select_task_rq_fair())
This regression seems to be caused by cache misses when accessing per-cpu
data (see the perf top cache-misses data below for details).
select_idle_sibling(...)
{
	....
	for_each_cpu_and(i, sched_domain_span(sd), &p->cpus_allowed) {
		if (!cpu_rq(i)->cfs.nr_running) {
			target = i;
			break;
		}
	}
	....
}
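To illustrate why this loop can be expensive, here is a minimal user-space
sketch of the scan, not the kernel code itself. All names in it
(MODEL_NR_CPUS, struct model_rq, struct scan_result,
model_select_idle_sibling) are hypothetical stand-ins; the bool arrays only
model the two cpumasks. It counts how many other CPUs' runqueues one wakeup
reads. In the kernel each such read may touch a cache line owned by another
CPU, which is the suspected source of the extra cache misses.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 16	/* assumption: matches the 16-CPU test machine */

/* Hypothetical stand-in for a per-cpu runqueue. */
struct model_rq {
	unsigned int nr_running;	/* models cpu_rq(i)->cfs.nr_running */
};

struct scan_result {
	int target;		/* CPU chosen for the wakeup */
	int remote_reads;	/* other CPUs' runqueues read during the scan */
};

/*
 * Model of the idle-sibling scan: visit every CPU that is both in the
 * domain span and in the task's allowed mask, and stop at the first one
 * with no runnable tasks. If none is idle, the original target is kept.
 */
static struct scan_result
model_select_idle_sibling(const struct model_rq *rqs, const bool *span,
			  const bool *allowed, int this_cpu, int target)
{
	struct scan_result res = { .target = target, .remote_reads = 0 };

	assert(this_cpu >= 0 && this_cpu < MODEL_NR_CPUS);

	for (int i = 0; i < MODEL_NR_CPUS; i++) {
		if (!span[i] || !allowed[i])
			continue;	/* models for_each_cpu_and() */

		if (i != this_cpu)
			res.remote_reads++;	/* read of another CPU's data */

		if (rqs[i].nr_running == 0) {
			res.target = i;
			break;
		}
	}
	return res;
}
```

In this model, when every CPU in the span is busy the scan reads all other
runqueues and still returns the original target, so the worst-case cost is
paid exactly when no idle CPU is found.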
Performance is also restored to the 2.6.32 level if SD_PREFER_SIBLING
is not set, so that select_idle_sibling() is not called.
The perf top data follows.
2.6.33-rc1 cache-misses data (note 11.8% select_task_rq_fair)
------------------------------------------------------------------------------------
   PerfTop:   12262 irqs/sec  kernel:90.6% [1000Hz cache-misses],  (all, 16 CPUs)
------------------------------------------------------------------------------------

     samples   pcnt  function                       DSO
     _______  _____  _____________________________  ________________

    18272.00  11.8%  select_task_rq_fair            [kernel.kallsyms]
    15499.00  10.0%  schedule                       [kernel.kallsyms]
     9447.00   6.1%  update_curr                    [kernel.kallsyms]
     9255.00   6.0%  _raw_spin_lock                 [kernel.kallsyms]
     5161.00   3.3%  tcp_sendmsg                    [kernel.kallsyms]

2.6.32 cache-misses data
--------------------------------------------------------------------------------------
   PerfTop:   11749 irqs/sec  kernel:88.2% [1000Hz cache-misses],  (all, 16 CPUs)
--------------------------------------------------------------------------------------

     samples   pcnt  function                       DSO
     _______  _____  _____________________________  _________________

    11974.00  11.5%  schedule                       [kernel.kallsyms]
     6656.00   6.4%  _spin_lock                     [kernel.kallsyms]
     5852.00   5.6%  update_curr                    [kernel.kallsyms]
     3140.00   3.0%  enqueue_entity                 [kernel.kallsyms]
     2846.00   2.7%  tcp_sendmsg                    [kernel.kallsyms]

2.6.33-rc1 cycles data (note 6.5% select_task_rq_fair)
-------------------------------------------------------------------------------
   PerfTop:   11106 irqs/sec  kernel:99.7% [1000Hz cycles],  (all, 16 CPUs)
-------------------------------------------------------------------------------

     samples   pcnt  function                   DSO
     _______  _____  _________________________  _________________

    11658.00  10.0%  schedule                   [kernel.kallsyms]
    10870.00   9.4%  _raw_spin_lock             [kernel.kallsyms]
     7576.00   6.5%  select_task_rq_fair        [kernel.kallsyms]
     3696.00   3.2%  tcp_sendmsg                [kernel.kallsyms]
     3000.00   2.6%  update_curr                [kernel.kallsyms]

2.6.32 cycles data
------------------------------------------------------------------------------------
   PerfTop:   10462 irqs/sec  kernel:99.8% [1000Hz cycles],  (all, 16 CPUs)
------------------------------------------------------------------------------------

     samples   pcnt  function                   DSO
     _______  _____  _________________________  _________________

    13364.00   9.9%  schedule                   [kernel.kallsyms]
    13140.00   9.8%  _spin_lock                 [kernel.kallsyms]
     4903.00   3.6%  tcp_sendmsg                [kernel.kallsyms]
     4017.00   3.0%  update_curr                [kernel.kallsyms]
     3395.00   2.5%  _spin_lock_bh              [kernel.kallsyms]

Lin Ming