Message-ID: <b647ffbd0807141700s20e54fbewafb3d3e296e57f53@mail.gmail.com>
Date: Tue, 15 Jul 2008 02:00:32 +0200
From: "Dmitry Adamushko" <dmitry.adamushko@...il.com>
To: "Linus Torvalds" <torvalds@...ux-foundation.org>
Cc: "Vegard Nossum" <vegard.nossum@...il.com>,
"Paul Menage" <menage@...gle.com>,
"Max Krasnyansky" <maxk@...lcomm.com>, "Paul Jackson" <pj@....com>,
"Peter Zijlstra" <a.p.zijlstra@...llo.nl>, miaox@...fujitsu.com,
rostedt@...dmis.org, "Thomas Gleixner" <tglx@...utronix.de>,
"Ingo Molnar" <mingo@...e.hu>,
"Linux Kernel" <linux-kernel@...r.kernel.org>
Subject: Re: current linux-2.6.git: cpusets completely broken
2008/7/15 Linus Torvalds <torvalds@...ux-foundation.org>:
>
> On Tue, 15 Jul 2008, Dmitry Adamushko wrote:
>>
>> cpu_clear(cpu, cpu_active_map); _alone_ does not guarantee that after
>> its completion, no new tasks can appear on (be migrated to) 'cpu'.
>
> But I think we should make it do that.
>
> I do realize that we "queue" processes, but that's part of the whole
> complexity. More importantly, the people who do that kind of asynchronous
> queueing don't even really care - *if* they cared about the process
> _having_ to show up on the destination core, they'd be waiting
> synchronously and re-trying (which they do).
>
> So by doing the test for cpu_active_map not at queuing time, but at the
> time when we actually try to do the migration, we can now also make
> that cpu_active_map be totally serialized.
>
> (Of course, anybody who clears the bit does need to take the runqueue lock
> of that CPU too, but cpu_down() will have to do that as it does the
> "migrate away live tasks" anyway, so that's not a problem)
The 'synchronization' point occurs even earlier - when cpu_down() ->
__stop_machine_run() gets called (as I described in my previous mail).

My point was this: if it's ok to have a _delayed_ synchronization
point - not immediately after cpu_clear(cpu, cpu_active_map), but when
the runqueue lock is taken a bit later (as you pointed out above) or
when __stop_machine_run() gets executed (which is a sync point,
scheduling-wise) - then we can implement the proper synchronization
(hotplugging vs. task-migration) with cpu_online_map alone, and there
is no need for cpu_active_map at all.
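I.e., something along these lines (sketch only):

    /* hotplug side: __cpu_disable() runs inside __stop_machine_run(),
     * i.e. with every other cpu spinning outside of the scheduler -
     * that _is_ the (delayed) sync point */
    cpu_clear(cpu, cpu_online_map);

    /* migration side, with the runqueue lock held: */
    if (cpu_offline(dest_cpu))
            goto out;   /* the cpu has gone away, don't move the task */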
Note that, currently, _not_ all places in the scheduler where an
actual migration takes place (as opposed to the mere queuing of
requests) do the test for cpu_offline(). Instead, they (blindly) rely
on the assumption that if a cpu is reachable via the sched-domains,
then it's guaranteed to be online (and can be migrated to).
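For illustration (pick_cpu_from_domain() is a made-up placeholder for
whatever domain-walking logic selects the target cpu):

    int cpu = pick_cpu_from_domain(sd);  /* hypothetical helper */

    /* the missing test: the domain may still list a cpu that is
     * already on its way down */
    if (cpu_offline(cpu))
            cpu = task_cpu(p);           /* fall back to the old cpu */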
Provided all those places (additionally) had the cpu_offline() test in
place, the bug discussed in this thread would _not_ happen and,
moreover, we would _not_ need all the fancy "attach NULL domains"
sched-domain manipulations (which depend on DOWN_PREPARE, DOWN and
other hotplugging events). We would only need to rebuild the domains
once upon a successful CPU_DOWN.
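IOW, the hotplug callback on the domains side could shrink to roughly
this (sketch only; rebuild_sched_domains() stands for whatever
rebuilds the domains from the cpusets side):

    static int domains_hotplug_cb(struct notifier_block *nfb,
                                  unsigned long action, void *hcpu)
    {
            switch (action) {
            /* no CPU_DOWN_PREPARE / "attach NULL domains" dance */
            case CPU_DEAD:                  /* CPU_DOWN succeeded */
                    rebuild_sched_domains();
                    break;
            }
            return NOTIFY_OK;
    }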
p.s. hope my point is more understandable now (or it's clear that I'm
missing something at this late hour :^)
>
> Linus
>
--
Best regards,
Dmitry Adamushko