Message-ID: <alpine.LFD.1.10.0807121556290.2959@woody.linux-foundation.org>
Date: Sat, 12 Jul 2008 16:01:26 -0700 (PDT)
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Max Krasnyansky <maxk@...lcomm.com>
cc: Dmitry Adamushko <dmitry.adamushko@...il.com>,
Vegard Nossum <vegard.nossum@...il.com>,
Paul Menage <menage@...gle.com>, Paul Jackson <pj@....com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>, miaox@...fujitsu.com,
rostedt@...dmis.org, Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...e.hu>,
Linux Kernel <linux-kernel@...r.kernel.org>
Subject: Re: current linux-2.6.git: cpusets completely broken
On Sat, 12 Jul 2008, Max Krasnyansky wrote:
>
> My vote goes for Dmitry's patch. The one with the full switch() statement.
> Your simplified version with if() is correct (I think) but the switch() is
> more explicit about what events are being processed.
Well, I still haven't seen a combined patch+signoff+good explanation, so I
can't really commit it.
> The cpu_active_map thing seems like an overkill. In a sense that we should not
> try to add a new map for every such case. Granted this migration case may be
> special enough to warrant the new map but in general I think it's not the
> right way to go.
Note how cpu_active_map has nothing to do with cpusets per se, and
everything to do with the fact that CPU migration currently seems to be
fundamentally flawed in the presence of a CPU hotunplug.
Can somebody tell me why some _other_ random wakeup cannot cause the same
kind of migration at an inopportune time?
The fact is, Dmitry's patch fixed _one_ particular wakeup from happening
(that just happened to be *guaranteed* to happen when it shouldn't!), but
as far as I can tell, it's a totally generic problem, with any
try_to_wake_up() -> load-balancer
chain being able to trigger it by causing a migration to a CPU that we
are in the process of turning off.
IOW, I don't think that my patch is overkill at all. I think it fixes the
real bug there.
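
To make the argument concrete, here is a small user-space sketch of the idea. It is a toy model and not the kernel patch: the struct, the bitmask layout and every function name below are made up for illustration, and it simply assumes that a CPU is dropped from the "active" set at the start of hot-unplug while it remains in the "online" set until teardown completes. Under that assumption, any wakeup path that picks its target from the active set cannot land on a CPU that is on its way out, no matter which wakeup triggered the balancing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 8   /* toy limit, not the kernel's NR_CPUS */

/* Hypothetical stand-in for the kernel's CPU masks: bit n set = CPU n is in the set. */
struct cpu_state {
    uint32_t online;         /* CPU is still up and running its current tasks */
    uint32_t active;         /* CPU may be chosen as a migration target */
    unsigned load[NR_CPUS];  /* made-up per-CPU load figure */
};

bool cpu_in(uint32_t mask, int cpu)
{
    return (mask >> cpu) & 1u;
}

/* Start of hot-unplug (assumed ordering): leave 'online' alone, clear 'active'. */
void begin_cpu_down(struct cpu_state *s, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    s->active &= ~(UINT32_C(1) << cpu);
}

/* End of hot-unplug: the CPU finally leaves the online set too. */
void finish_cpu_down(struct cpu_state *s, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    s->online &= ~(UINT32_C(1) << cpu);
}

/* Least-loaded CPU within 'mask', or -1 if the mask is empty. */
int pick_least_loaded(const struct cpu_state *s, uint32_t mask)
{
    int best = -1;
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        if (!cpu_in(mask, cpu))
            continue;
        if (best < 0 || s->load[cpu] < s->load[best])
            best = cpu;
    }
    return best;
}

/* Balancing against 'online' can still select a CPU that is going down. */
int pick_by_online(const struct cpu_state *s)
{
    return pick_least_loaded(s, s->online);
}

/* Balancing against 'active' skips it, whichever wakeup got us here. */
int pick_by_active(const struct cpu_state *s)
{
    return pick_least_loaded(s, s->active);
}
```

With four CPUs online, loads of 5, 3, 0 and 4, and CPU 2 in the middle of going down, pick_by_online() still returns the idle CPU 2, while pick_by_active() returns CPU 1.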
(It's also true that the cpusets code calls rebuild_sched_domains() way
too much, but that's a _stupidity_ issue, not the cause of the bug per se,
if I follow the code!)
Linus