Message-ID: <alpine.DEB.2.00.1205151444260.1656@chino.kir.corp.google.com>
Date: Tue, 15 May 2012 14:49:47 -0700 (PDT)
From: David Rientjes <rientjes@...gle.com>
To: "Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>
cc: Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Nishanth Aravamudan <nacc@...ux.vnet.ibm.com>,
mingo@...nel.org, pjt@...gle.com, paul@...lmenage.org,
akpm@...ux-foundation.org, rjw@...k.pl, nacc@...ibm.com,
paulmck@...ux.vnet.ibm.com, tglx@...utronix.de,
seto.hidetoshi@...fujitsu.com, tj@...nel.org, mschmidt@...hat.com,
berrange@...hat.com, nikunj@...ux.vnet.ibm.com,
vatsa@...ux.vnet.ibm.com, liuj97@...il.com,
linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org
Subject: Re: [PATCH v3 5/5] cpusets, suspend: Save and restore cpusets during suspend/resume

On Wed, 16 May 2012, Srivatsa S. Bhat wrote:
> What you are suggesting was precisely the v1 of this patchset, which went
> upstream as commit 8f2f748b06562 (CPU hotplug, cpusets, suspend: Don't touch
> cpusets during suspend/resume).
>
> It got reverted due to a nasty suspend hang in some corner case, where the
> sched domains not being up-to-date got the scheduler confused.
> Here is the thread with that discussion:
> http://thread.gmane.org/gmane.linux.kernel/1262802/focus=1286289
>
> As Peter suggested, I'll try to fix the issues at the 2 places that I found
> where the scheduler gets confused despite the cpu_active mask being up-to-date.
>
> But, I really want to avoid that scheduler fix and this cpuset fix from
> being tied together, for the fear that until we root-cause and fix all
> scheduler bugs related to cpu_active mask, we can never get cpusets fixed
> once and for all for suspend/resume. So, this patchset does an explicit
> save and restore to be sure, and so that we don't depend on some other/unknown
> factors to make this work reliably.
>
Ok, so it seems like this is papering over an existing cpusets issue or an
interaction with the scheduler that is buggy. There's no reason why a
cpuset.cpus that is a superset of cpu_active_mask should cause an issue
since that's exactly what the root cpuset has. I know root is special-cased
all over the cpuset code, but I think the real fix here is to figure out why
cpuset.cpus can't be left as a superset; then we end up doing nothing for
suspend/resume.

I don't have a preference on whether cpuset.cpus = 1-3 remains 1-3 when
cpu 2 is offlined through cpu hotplug; I think it could be argued both ways.
But I disagree with saving the cpumask, removing all suspended cpus, and
then reinstating it for no reason.
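
To make the "leave it as a superset" idea concrete, here is a minimal
userspace sketch. It is not kernel code: mask_t, struct model,
effective_cpus(), model_suspend() and model_resume() are hypothetical names
invented for illustration, and a plain 64-bit integer stands in for a
cpumask. It only shows that if the configured mask is never modified and the
usable set is derived by intersecting with the active mask, nothing needs to
be saved or restored across suspend/resume. It does not model what happens to
tasks while the intersection is empty.

```c
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit n set means CPU n is in the set. */
typedef uint64_t mask_t;

/* Illustrative model only; none of these names are kernel APIs. */
struct model {
    mask_t cpuset_cpus; /* what the user configured, e.g. 1-3 */
    mask_t active;      /* stand-in for cpu_active_mask */
};

/* CPUs usable right now: the configured set limited to the active CPUs. */
static mask_t effective_cpus(const struct model *m)
{
    return m->cpuset_cpus & m->active;
}

/* Suspend takes every CPU except the boot CPU (CPU 0) offline.
 * The configured mask is deliberately left alone. */
static void model_suspend(struct model *m)
{
    m->active = (mask_t)1;
}

/* Resume brings the CPUs in 'online' back; cpuset_cpus is still untouched,
 * so the usable set recovers on its own. */
static void model_resume(struct model *m, mask_t online)
{
    m->active = online;
}
```

With cpuset_cpus = 0xE (cpus 1-3) and four cpus online, the usable set is 0xE
before suspend and again after resume, while cpuset_cpus itself never changes.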