Message-Id: <487EFB71.BA47.005A.0@novell.com>
Date: Thu, 17 Jul 2008 05:57:37 -0600
From: "Gregory Haskins" <ghaskins@...ell.com>
To: "Max Krasnyansky" <maxk@...lcomm.com>
Cc: <a.p.zijlstra@...llo.nl>, <mingo@...e.hu>,
<dmitry.adamushko@...il.com>, <torvalds@...ux-foundation.org>,
<pj@....com>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] cpu hotplug, sched: Introduce
cpu_active_map and redo sched domain managment (take 2)
>>> On Thu, Jul 17, 2008 at 3:16 AM, in message <487EF1E9.2040101@...lcomm.com>,
Max Krasnyansky <maxk@...lcomm.com> wrote:
>
> Gregory Haskins wrote:
>> Well, admittedly I am not entirely clear on what problem is being solved as
>> I was not part of the original thread with Linus. My impression of what you
>> were trying to solve was to eliminate the need to rebuild the domains for a
>> hotplug event (which I think is a good problem to solve), thus eliminating
>> some complexity and (iiuc) races there.
>>
>> However, based on what you just said, I am not sure I've got that entirely
>> right anymore. Can you clarify the intent (or point me at the original thread)
>> so we are on the same page?
> Here is the link to the original thread
> http://lkml.org/lkml/2008/7/11/328
> And here is where Linus explained the idea
> http://lkml.org/lkml/2008/7/12/137
>
> I'll reply to the rest of your email tomorrow (can't keep my eyes open any
> longer :)).
>
> Max
Hi Max,
Thanks for the pointers. I see that I did indeed misunderstand the intent of the patch.
It seems you already solved the rebuild problem, and were just trying to solve the
"migrate to a dead cpu" problem, for which Linus suggested cpu_active_map as the solution.
In that case, note that rq->rd->online already fits the bill, I believe. In a nutshell,
rq->rd->span contains all the cpus within your disjoint cpuset, and rq->rd->online
contains the subset of rq->rd->span that are online. The online bit is cleared at the
earliest point in cpu hotplug removal (DYING), and it is set at the very latest point on
insertion (ONLINE). That makes the cpu_active_map concept redundant with it.
I think the simplest solution is to make sure that we cpus_and against rq->rd->online
before allowing a migration. This is how I intended the mask to be used, anyway. It's
what the RT scheduler does. It sounds like we just need to touch up the few places
on the CFS side that were causing those oopses.
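To illustrate the idea, here is a rough user-space sketch of the masking step. This is
not kernel code and not taken from any patch: rd_model and the model_* helpers are
names made up for this example, and a plain 64-bit word stands in for cpumask_t. It only
models the behavior described above, where the online bit is cleared early on removal,
set late on insertion, and ANDed into the candidate mask before a target cpu is picked.

```c
#include <assert.h>
#include <stdint.h>

/* User-space model only: one bit per cpu, at most 64 cpus. */
typedef uint64_t cpumask_model_t;
#define MODEL_NR_CPUS 64

/* Hypothetical stand-in for the two root-domain masks discussed above. */
struct rd_model {
	cpumask_model_t span;   /* every cpu in the disjoint cpuset */
	cpumask_model_t online; /* subset of span that is currently online */
};

static cpumask_model_t model_cpu_bit(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	return (cpumask_model_t)1 << cpu;
}

/* Hotplug removal: clear the bit at the earliest point (DYING). */
static void model_cpu_dying(struct rd_model *rd, int cpu)
{
	rd->online &= ~model_cpu_bit(cpu);
}

/* Hotplug insertion: set the bit at the latest point (ONLINE). */
static void model_cpu_online(struct rd_model *rd, int cpu)
{
	/* Only cpus that belong to this domain's span may come online in it. */
	if (rd->span & model_cpu_bit(cpu))
		rd->online |= model_cpu_bit(cpu);
}

/*
 * Pick a migration target: AND the task's allowed mask against rd->online
 * first, then take the lowest remaining cpu. Returns -1 if nothing is left.
 */
static int model_pick_target(const struct rd_model *rd, cpumask_model_t allowed)
{
	cpumask_model_t candidates = allowed & rd->online;
	int cpu;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (candidates & model_cpu_bit(cpu))
			return cpu;
	return -1;
}
```

With span = online = cpus 0-3 and a task allowed on cpus 1-2, the sketch picks cpu 1;
once cpu 1 goes through model_cpu_dying() it picks cpu 2 instead, so a dying cpu is never
returned as a target.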
Thoughts?
-Greg