Message-ID: <488635B3.2050606@qualcomm.com>
Date: Tue, 22 Jul 2008 12:32:03 -0700
From: Max Krasnyansky <maxk@...lcomm.com>
To: Gregory Haskins <ghaskins@...ell.com>
CC: Peter Zijlstra <a.p.zijlstra@...llo.nl>, mingo@...e.hu,
dmitry.adamushko@...il.com, torvalds@...ux-foundation.org,
pj@....com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] cpu hotplug, sched: Introduce cpu_active_map and redo sched domain managment (take 2)
Gregory Haskins wrote:
> Max Krasnyansky wrote:
>> Greg, correct me if I'm wrong but we seem to have the exact same issue
>> with the rq->rd->online map. Let's take "cpu going down" for
>> example. We're clearing the rq->rd->online bit on the DYING event, but
>> nothing AFAICS prevents another cpu calling
>> rebuild_sched_domains()->partition_sched_domains() in the middle of
>> the hotplug sequence. partition_sched_domains() will happily reset
>> the rq->rd->online mask and things will fail. I'm talking about this
>> path
>>
>> __build_sched_domains() -> cpu_attach_domain() -> rq_attach_root()
>> ...
>> cpu_set(rq->cpu, rd->span);
>> if (cpu_isset(rq->cpu, cpu_online_map))
>> set_rq_online(rq);
>> ...
>>
>>
>
> I think you are right, but wouldn't s/online/active above fix that as
> well? The active_map didn't exist at the time that code went in
> initially ;)
Actually after a bit more thinking :) I realized that the scenario I
explained above cannot happen because partition_sched_domains() must be
called under get_online_cpus() and the set_rq_online() happens in the
hotplug writer's path (i.e. under cpu_hotplug.lock). Since I unified all
the other domain rebuild paths (arch_reinit_sched_domains, etc) we
should be safe. But it again means we'd rely on those intricate
dependencies that we wanted to avoid with the cpu_active_map. Also
cpusets might still need to rebuild the domains in the hotplug writer's
path.
So it's better to fix it once and for all :)
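
To make the s/online/active substitution concrete, here is a rough
user-space sketch of the rq_attach_root() fragment quoted above. It is
a toy model, not the kernel code: the maps are plain bitmasks, the
struct layouts are cut down to the fields mentioned in this thread, and
rebuild_while_cpu1_going_down() is a hypothetical helper that exists
only for illustration.

```c
#include <assert.h>

#define NR_CPUS 8

typedef unsigned long cpumask_t;        /* toy mask: bit n = CPU n */

struct root_domain { cpumask_t span, online; };
struct rq { int cpu; struct root_domain *rd; };

static cpumask_t cpu_online_map, cpu_active_map;

static int cpu_isset(int cpu, cpumask_t map)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return (int)((map >> cpu) & 1UL);
}

/* Simplified stand-in for set_rq_online(): only sets the rd->online bit. */
static void set_rq_online(struct rq *rq)
{
	rq->rd->online |= 1UL << rq->cpu;
}

/*
 * Simplified rq_attach_root(): only the bookkeeping quoted above, with
 * the membership test switched from cpu_online_map to cpu_active_map.
 */
static void rq_attach_root(struct rq *rq, struct root_domain *rd)
{
	rq->rd = rd;
	rd->span |= 1UL << rq->cpu;
	if (cpu_isset(rq->cpu, cpu_active_map))
		set_rq_online(rq);
}

/*
 * Hypothetical demo: CPU 1 is mid-teardown (still online, no longer
 * active) when a domain rebuild re-attaches both runqueues.
 * Returns the resulting rd->online mask.
 */
static cpumask_t rebuild_while_cpu1_going_down(void)
{
	struct root_domain rd = { 0, 0 };
	struct rq rq0 = { 0, 0 }, rq1 = { 1, 0 };

	cpu_online_map = 0x3;   /* CPUs 0 and 1 are online */
	cpu_active_map = 0x1;   /* CPU 1 already marked inactive */
	assert(cpu_isset(1, cpu_online_map));

	rq_attach_root(&rq0, &rd);
	rq_attach_root(&rq1, &rd);
	return rd.online;
}
```

With the active_map check, the rebuild leaves CPU 1 out of rd->online
even though it is still set in cpu_online_map. With the original
online_map check, the same rebuild would set the bit again.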
>> --
>>
>> btw, why didn't we convert sched*.c to use rq->rd->online when it was
>> introduced? I.e. instead of using cpu_online_map directly.
>>
> I think things were converted where they made sense to convert. But we
> also had a different goal at that time, so perhaps something was
> missed. If you think something else should be converted, please point
> it out.
Ok. I'll keep an eye on it.
> In the meantime, I would suggest we consider this patch on top of yours
> (applies to tip/sched/devel):
>
> ----------------------
>
> sched: Fully integrate cpus_active_map and root-domain code
> Signed-off-by: Gregory Haskins <ghaskins@...ell.com>
Ack.
The only thing I'm a bit unsure of is the error scenarios in the cpu
hotplug event sequence. online_map is not cleared when something in the
notifier chain fails, but active_map is.
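
A small user-space model of that error scenario, to show what I mean.
Everything here is an assumption made for illustration:
cpu_down_model(), run_down_notifiers() and reset_maps() are made-up
names, the maps are plain bitmasks, and the rollback at the end is just
one possible way to handle it, not necessarily what the patch does.

```c
#include <assert.h>

typedef unsigned long cpumask_t;        /* toy mask: bit n = CPU n */

static cpumask_t cpu_online_map, cpu_active_map;

/* Hypothetical helper: start with the same CPUs online and active. */
static void reset_maps(cpumask_t mask)
{
	cpu_online_map = mask;
	cpu_active_map = mask;
}

/* Hypothetical stand-in for the notifier chain; nonzero means a veto. */
static int run_down_notifiers(int fail)
{
	return fail ? -1 : 0;
}

/*
 * Toy model of taking a CPU down. active_map is cleared up front,
 * online_map only once the notifier chain has succeeded.
 */
static int cpu_down_model(int cpu, int notifier_fails)
{
	cpumask_t bit = 1UL << cpu;
	int err;

	assert(cpu >= 0 && cpu < (int)(8 * sizeof(cpumask_t)));

	cpu_active_map &= ~bit;
	err = run_down_notifiers(notifier_fails);
	if (!err)
		cpu_online_map &= ~bit;

	/*
	 * Assumed rollback: if the CPU is still online after the attempt,
	 * mark it active again so the two maps agree.
	 */
	if (cpu_online_map & bit)
		cpu_active_map |= bit;

	return err;
}
```

Without the last step a failed offline attempt would leave the cpu
online but inactive, which is the inconsistency I'm worried about.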
Max