Message-ID: <4F32174E.2050207@linux.vnet.ibm.com>
Date: Wed, 08 Feb 2012 12:03:50 +0530
From: "Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>
To: Peter Zijlstra <a.p.zijlstra@...llo.nl>
CC: paul@...lmenage.org, mingo@...e.hu, rjw@...k.pl, tj@...nel.org,
frank.rowand@...sony.com, pjt@...gle.com, tglx@...utronix.de,
lizf@...fujitsu.com, prashanth@...ux.vnet.ibm.com,
paulmck@...ux.vnet.ibm.com, vatsa@...ux.vnet.ibm.com,
linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>
Subject: Re: [PATCH 0/4] CPU hotplug, cpusets: Fix CPU online handling related
to cpusets
On 02/08/2012 08:52 AM, Peter Zijlstra wrote:
> On Wed, 2012-02-08 at 00:25 +0530, Srivatsa S. Bhat wrote:
>> There is a very long standing issue related to how cpusets handle CPU
>> hotplug events. The problem is that when a CPU goes offline, it is removed
>> from all cpusets. However, when that CPU comes back online, it is added
>> *only* to the root cpuset. Which means, any task attached to a cpuset lower
>> in the hierarchy will have one CPU less in its cpuset, though it had this
>> CPU in its cpuset before the CPU went offline.
>
> Yeah so? That's known behaviour..
This might be known behaviour, but surely it is not the behaviour we
want, right? I understand that when you take a CPU offline, we have no
choice but to remove it from all cpusets. But if the same CPU comes back
online, and userspace did not request any change to the cpusets in between
those events (offline-online), then isn't it wrong to silently keep that
CPU out of the cpuset even after it comes back online?
IOW, consider:

cpuset A has 0-10
 - Take CPU 10 offline
   [We are forced to remove CPU 10 from cpuset A, which becomes 0-9 now]
   <Userspace didn't request any change to cpuset A>
 - Bring back CPU 10 online

Now cpuset A is still 0-9! IMO, it should have been 0-10.
That is, why should a totally unrelated operation like CPU Hotplug alter
the cpuset silently under the hood? Or, put another way, if the kernel
is intelligent enough to restore the root cpuset on CPU hotplug events,
why should it not restore the rest of the cpusets?
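
The sequence above can be reproduced with a small toy model. This is only a
sketch in Python, not kernel code: the names Cpuset, cpu_offline and
cpu_online are made up for illustration, and the model assumes nothing
beyond the behaviour described in this mail (offline removes the CPU from
every cpuset, online adds it back to the root cpuset only).

```python
# Toy model of the behaviour described in this thread; not kernel code.
# Cpuset, cpu_offline and cpu_online are hypothetical names.

class Cpuset:
    def __init__(self, name, cpus, is_root=False):
        self.name = name
        self.cpus = set(cpus)
        self.is_root = is_root


def cpu_offline(cpusets, cpu):
    # A CPU that goes offline is removed from every cpuset.
    for cs in cpusets:
        cs.cpus.discard(cpu)


def cpu_online(cpusets, cpu):
    # A CPU that comes back online is added to the root cpuset only.
    for cs in cpusets:
        if cs.is_root:
            cs.cpus.add(cpu)


root = Cpuset("root", range(0, 11), is_root=True)
a = Cpuset("A", range(0, 11))
cpusets = [root, a]

cpu_offline(cpusets, 10)   # cpuset A becomes 0-9
cpu_online(cpusets, 10)    # root gets CPU 10 back, A does not

print("root:", sorted(root.cpus))
print("A:   ", sorted(a.cpus))
```

In this model the root cpuset ends up with 0-10 again, while cpuset A stays
at 0-9 even though userspace never asked for any change.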
>
>> The issue gets enormously aggravated in the case of suspend/resume.
>
> Why does suspend/resume do this anyway? hotunplug is terribly
> expensive, surely not doing it would make suspend ever so much faster?
>
Well, the point I am trying to make is not about speeding up suspend/resume
itself. I am trying to say that there is a bug (or at least an "undesirable
behaviour", if you feel "bug" is too strong a word) in the CPU hotplug
handling in cpusets, which gets magnified during suspend/resume
(agreed, because suspend/resume relies on CPU hotplug at the moment).
[One of the promises of suspend/resume is to restore the system to its
original state to the best extent it can, and cpusets clearly break this
promise. The good news is that this falls under our "things that we *can*
restore after resume" list, and this patchset achieves that.]
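
To make the "restore after resume" expectation concrete, here is a rough
Python sketch of one way a cpuset could come back to its original state.
It is an assumption made purely for illustration and is NOT taken from this
patchset: the model keeps the mask userspace asked for (requested) apart
from what is usable right now (requested & online). The names Cpuset,
requested, effective and BOOT_CPU are all hypothetical.

```python
# Hypothetical model of "restore after resume"; not the patchset's code.

class Cpuset:
    def __init__(self, name, requested):
        self.name = name
        # What userspace asked for; hotplug never modifies this.
        self.requested = frozenset(requested)

    def effective(self, online):
        # CPUs the attached tasks can actually run on right now.
        return self.requested & online


BOOT_CPU = 0
all_cpus = frozenset(range(4))
guest = Cpuset("guest", {0, 1, 2, 3})

online = set(all_cpus)
before = guest.effective(online)

# Suspend: every non-boot CPU is taken offline.
online = {BOOT_CPU}
during = guest.effective(online)

# Resume: all CPUs are brought back online.
online = set(all_cpus)
after = guest.effective(online)

print("before:", sorted(before))
print("during:", sorted(during))
print("after: ", sorted(after))
```

With this split, the cpuset shrinks to the boot CPU while the other CPUs
are offline and returns to its original set after resume, without
userspace having to reconfigure anything.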
>> During
>> suspend, all non-boot CPUs are taken offline. Which means, all those CPUs
>> get removed from all the cpusets. When the system resumes, all CPUs are
>> brought back online; however, the newly onlined CPUs get added only to the
>> root cpuset - and all other cpusets have cpuset.cpus = 0 (boot cpu alone)!
>> This means, (as is obvious), all those tasks attached to non-root cpusets
>> will be constrained to run only on one single cpu!
>>
>> So, imagine the amount of performance degradation after suspend/resume!!
>>
>> In particular, libvirt is one of the active users of cpusets. And apparently,
>> people hit this problem long ago:
>> https://bugzilla.redhat.com/show_bug.cgi?id=714271
>>
>> But unfortunately this never got resolved since people probably thought that
>> the bug was in libvirt... and all this time the kernel was the culprit!
>
> /me boggles, why do you use cpusets on a system small enough to suspend,
> and I'm so not going to ask about libvirt because I know I'll just get
> sad.
>
Regards,
Srivatsa S. Bhat