Message-ID: <483C8BA6.8090309@qualcomm.com>
Date: Tue, 27 May 2008 15:31:02 -0700
From: Max Krasnyanskiy <maxk@...lcomm.com>
To: mingo@...e.hu
CC: pj@....com, a.p.zijlstra@...llo.nl, linux-kernel@...r.kernel.org,
menage@...gle.com, rostedt@...dmis.org
Subject: Re: [PATCH] [sched] Fixed CPU hotplug and sched domain handling
Max Krasnyansky wrote:
> First issue is that we're leaking doms_cur. It's allocated in
> arch_init_sched_domains() which is called for every hotplug event.
> So we just keep reallocating doms_cur without freeing it.
> I introduced a free_sched_domains() function that cleans things up.
>
> The second issue is that sched domains created by the cpusets are
> completely destroyed by CPU hotplug events. On every CPU hotplug
> event the scheduler attaches all CPUs to the NULL domain and then puts
> them all into a single domain, thereby destroying the domains created
> by the cpusets (partition_sched_domains).
> The solution is simple: when cpusets are enabled, the scheduler should
> not create the default domain and should instead let cpusets do that,
> which is exactly what the patch does.
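The behaviour described in the quoted text can be illustrated with a toy model. This is not kernel code: the function name domains_after_hotplug and the representation of domains as Python sets of CPU numbers are assumptions made purely for illustration.

```python
def domains_after_hotplug(online_cpus, cpuset_partitions, cpusets_enabled):
    """Toy model: return the sched domains (as CPU sets) after a hotplug event."""
    online = set(online_cpus)
    if not cpusets_enabled:
        # No cpusets: one default domain spanning every online CPU.
        return [online]
    # With cpusets: keep each cpuset-defined partition, trimmed to the
    # CPUs that are still online, and drop partitions that became empty.
    return [set(p) & online for p in cpuset_partitions if set(p) & online]


# cpus 0-3 balanced by a cpuset, cpu7 has just gone offline
print(domains_after_hotplug(range(7), [{0, 1, 2, 3}], cpusets_enabled=True))
```

In this sketch the cpuset partition survives the hotplug event, whereas the default path collapses everything into one domain.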
Here is more info on this, with debug logs.
Here is the initial cpuset setup:
cpus 0-3 balanced, cpus 4-7 non-balanced
cd /dev/cgroup
echo 0 > cpuset.sched_load_balance
mkdir boot
echo 0-3 > boot/cpuset.cpus
echo 1 > boot/cpuset.sched_load_balance
...
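The CPU list written to the cpuset ("0-3") shows up in the logs as a hexadecimal span mask. A minimal sketch of that mapping, assuming one bit per CPU with CPU0 as the lowest bit; the helper name cpulist_to_mask is hypothetical and not part of any kernel interface:

```python
def cpulist_to_mask(cpulist):
    """Convert a CPU list such as "0-3" or "0,2,4-5" to an integer bitmask."""
    mask = 0
    for part in cpulist.split(","):
        part = part.strip()
        if not part:
            continue
        # "4-5" is a range, "2" is a single CPU.
        first, _, last = part.partition("-")
        lo = int(first)
        hi = int(last) if last else lo
        for cpu in range(lo, hi + 1):
            mask |= 1 << cpu
    return mask


print(format(cpulist_to_mask("0-3"), "02x"))  # prints 0f
```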
-----
CPU0 attaching NULL sched-domain.
CPU1 attaching NULL sched-domain.
CPU2 attaching NULL sched-domain.
CPU3 attaching NULL sched-domain.
CPU4 attaching NULL sched-domain.
CPU5 attaching NULL sched-domain.
CPU6 attaching NULL sched-domain.
CPU7 attaching NULL sched-domain.
CPU0 attaching sched-domain:
domain 0: span 0f
groups: 01 02 04 08
CPU1 attaching sched-domain:
domain 0: span 0f
groups: 02 04 08 01
CPU2 attaching sched-domain:
domain 0: span 0f
groups: 04 08 01 02
CPU3 attaching sched-domain:
domain 0: span 0f
groups: 08 01 02 04
-----
Looks good so far.
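To read the span and group values in these logs, each hex value can be decoded into the CPUs it covers. A small sketch, assuming bit N stands for CPU N; the helper name mask_to_cpus is made up for this example:

```python
def mask_to_cpus(hex_mask):
    """Decode a hex mask from the log (e.g. "0f") into a list of CPU numbers."""
    mask = int(hex_mask, 16)
    return [cpu for cpu in range(mask.bit_length()) if mask >> cpu & 1]


# Span and groups as printed for CPU0 in the log above.
print("span", mask_to_cpus("0f"))
for group in "01 02 04 08".split():
    print("group", group, mask_to_cpus(group))
```

Decoded this way, span 0f covers cpus 0-3 and each group holds exactly one cpu.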
Now let's bring cpu7 offline (echo 0 > /sys/devices/system/cpu/cpu7/online)
-----
CPU0 attaching NULL sched-domain.
CPU1 attaching NULL sched-domain.
CPU2 attaching NULL sched-domain.
CPU3 attaching NULL sched-domain.
CPU4 attaching NULL sched-domain.
CPU5 attaching NULL sched-domain.
CPU6 attaching NULL sched-domain.
CPU7 attaching NULL sched-domain.
CPU 7 is now offline
CPU0 attaching sched-domain:
domain 0: span 11
groups: 01 10
domain 1: span 7f
groups: 11 22 44 08
CPU1 attaching sched-domain:
domain 0: span 22
groups: 02 20
domain 1: span 7f
groups: 22 44 08 11
CPU2 attaching sched-domain:
domain 0: span 44
groups: 04 40
domain 1: span 7f
groups: 44 08 11 22
CPU3 attaching sched-domain:
domain 0: span 7f
groups: 08 11 22 44
CPU4 attaching sched-domain:
domain 0: span 11
groups: 10 01
domain 1: span 7f
groups: 11 22 44 08
CPU5 attaching sched-domain:
domain 0: span 22
groups: 20 02
domain 1: span 7f
groups: 22 44 08 11
CPU6 attaching sched-domain:
domain 0: span 44
groups: 40 04
domain 1: span 7f
groups: 44 08 11 22
----
All cpus are now in a single domain.
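This kind of breakage can also be spotted mechanically. The sketch below parses log text in the format shown here and reports CPUs whose domains span more than the cpuset setup allows. The function name widest_span_per_cpu and the "balanced" mask are assumptions for illustration, and the parser only handles the line shapes that appear in this message.

```python
import re


def widest_span_per_cpu(log_text):
    """Return {cpu: OR of all domain spans} for CPUs with a domain attached."""
    spans = {}
    cpu = None
    for line in log_text.splitlines():
        line = line.strip()
        m = re.match(r"CPU(\d+) attaching (NULL )?sched-domain", line)
        if m:
            cpu = int(m.group(1))
            if m.group(2):
                spans.pop(cpu, None)  # NULL domain: CPU is detached
            else:
                spans[cpu] = 0        # fresh attach: start from an empty span
            continue
        m = re.match(r"domain \d+: span ([0-9a-fA-F]+)", line)
        if m and cpu in spans:
            spans[cpu] |= int(m.group(1), 16)
    return spans


# Excerpt for CPU4 from the log above (cpu7 offline, no patch).
sample = """\
CPU4 attaching NULL sched-domain.
CPU4 attaching sched-domain:
domain 0: span 11
groups: 10 01
domain 1: span 7f
groups: 11 22 44 08
"""

balanced = 0x0F  # cpus 0-3, per the cpuset setup at the top of this message
for cpu, span in sorted(widest_span_per_cpu(sample).items()):
    allowed = balanced if balanced >> cpu & 1 else 0
    if span & ~allowed:
        print(f"CPU{cpu}: span {span:02x} exceeds allowed {allowed:02x}")
```

CPU4 was meant to be non-balanced, so any attached domain at all is flagged.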
Same thing happens when cpu7 comes back online.
----
CPU0 attaching NULL sched-domain.
CPU1 attaching NULL sched-domain.
CPU2 attaching NULL sched-domain.
CPU3 attaching NULL sched-domain.
CPU4 attaching NULL sched-domain.
CPU5 attaching NULL sched-domain.
CPU6 attaching NULL sched-domain.
Booting processor 7/8 APIC 0x7
Initializing CPU#7
Calibrating delay using timer specific routine.. 4655.39 BogoMIPS (lpj=9310785)
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 6144K
CPU: Physical Processor ID: 1
CPU: Processor Core ID: 3
Intel(R) Xeon(R) CPU E5410 @ 2.33GHz stepping 06
checking TSC synchronization [CPU#3 -> CPU#7]: passed.
CPU0 attaching sched-domain:
domain 0: span 11
groups: 01 10
domain 1: span ff
groups: 11 22 44 88
CPU1 attaching sched-domain:
domain 0: span 22
groups: 02 20
domain 1: span ff
groups: 22 44 88 11
CPU2 attaching sched-domain:
domain 0: span 44
groups: 04 40
domain 1: span ff
groups: 44 88 11 22
CPU3 attaching sched-domain:
domain 0: span 88
groups: 08 80
domain 1: span ff
groups: 88 11 22 44
CPU4 attaching sched-domain:
domain 0: span 11
groups: 10 01
domain 1: span ff
groups: 11 22 44 88
CPU5 attaching sched-domain:
domain 0: span 22
groups: 20 02
domain 1: span ff
groups: 22 44 88 11
CPU6 attaching sched-domain:
domain 0: span 44
groups: 40 04
domain 1: span ff
groups: 44 88 11 22
CPU7 attaching sched-domain:
domain 0: span 88
groups: 80 08
domain 1: span ff
groups: 88 11 22 44
----
It is as if cpusets did not exist :).
With the patch we now do the right thing when cpus go off/online.
----
CPU0 attaching NULL sched-domain.
CPU1 attaching NULL sched-domain.
CPU2 attaching NULL sched-domain.
CPU3 attaching NULL sched-domain.
CPU4 attaching NULL sched-domain.
CPU5 attaching NULL sched-domain.
CPU6 attaching NULL sched-domain.
CPU7 attaching NULL sched-domain.
CPU0 attaching sched-domain:
domain 0: span 0f
groups: 01 02 04 08
CPU1 attaching sched-domain:
domain 0: span 0f
groups: 02 04 08 01
CPU2 attaching sched-domain:
domain 0: span 0f
groups: 04 08 01 02
CPU3 attaching sched-domain:
domain 0: span 0f
groups: 08 01 02 04
CPU0 attaching NULL sched-domain.
CPU1 attaching NULL sched-domain.
CPU2 attaching NULL sched-domain.
CPU3 attaching NULL sched-domain.
CPU4 attaching NULL sched-domain.
CPU5 attaching NULL sched-domain.
CPU6 attaching NULL sched-domain.
CPU7 attaching NULL sched-domain.
CPU0 attaching sched-domain:
domain 0: span 0f
groups: 01 02 04 08
CPU1 attaching sched-domain:
domain 0: span 0f
groups: 02 04 08 01
CPU2 attaching sched-domain:
domain 0: span 0f
groups: 04 08 01 02
CPU3 attaching sched-domain:
domain 0: span 0f
groups: 08 01 02 04
CPU 7 is now offline
CPU0 attaching NULL sched-domain.
CPU1 attaching NULL sched-domain.
CPU2 attaching NULL sched-domain.
CPU3 attaching NULL sched-domain.
CPU4 attaching NULL sched-domain.
CPU5 attaching NULL sched-domain.
CPU6 attaching NULL sched-domain.
CPU0 attaching sched-domain:
domain 0: span 0f
groups: 01 02 04 08
CPU1 attaching sched-domain:
domain 0: span 0f
groups: 02 04 08 01
CPU2 attaching sched-domain:
domain 0: span 0f
groups: 04 08 01 02
CPU3 attaching sched-domain:
domain 0: span 0f
groups: 08 01 02 04
Booting processor 7/8 APIC 0x7
Initializing CPU#7
Calibrating delay using timer specific routine.. 4655.37 BogoMIPS (lpj=9310749)
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 6144K
CPU: Physical Processor ID: 1
CPU: Processor Core ID: 3
Intel(R) Xeon(R) CPU E5410 @ 2.33GHz stepping 06
checking TSC synchronization [CPU#3 -> CPU#7]: passed.
Max