Message-ID: <20061007103559.GC30034@elf.ucw.cz>
Date: Sat, 7 Oct 2006 12:35:59 +0200
From: Pavel Machek <pavel@....cz>
To: Bryce Harrington <bryce@...l.org>
Cc: Andrew Morton <akpm@...l.org>,
kernel list <linux-kernel@...r.kernel.org>
Subject: Re: Status on CPU hotplug issues
On Fri 2006-10-06 17:00:31, Bryce Harrington wrote:
> On Fri, Oct 06, 2006 at 04:29:24PM -0700, Andrew Morton wrote:
> > Can you describe the nature of the cpu-hotplug tests you're running? I'd
> > be fairly staggered if the kernel was able to survive a full-on cpu-hotplug
> > stress test for more than one second, frankly. There's a lot of code in
> > there which is non-hotplug-aware. Running a non-preemptible kernel would
> > make things appear more stable, perhaps.
>
> Certainly, the testsuite is one the OSDL Hotplug SIG put together last
> summer, and consists of several test cases:
>
> http://developer.osdl.org/dev/HOTPLUG/planning/hotplug_cpu_test_plan_status.html
The page actually lists only tests 1-6.
> hotplug01: Check IRQ behavior during cpu hotplug events
> hotplug02: Check process migration during cpu hotplug events
> hotplug03: Verify tasks get scheduled on newly onlined cpu's
> hotplug04: Verify disallowing offlining all CPU's
> hotplug05: (Unimplemented)
> hotplug06: Check userspace tools (sar, top) during cpu hotplug events
> hotplug07: Stress case doing kernel compile while cpu's are
> hotplugged on and off repeatedly
Well, while that is nice for showing "it basically works", it will not
stress the hotplug subsystem too badly.
If you want some really nasty tests:
hotplug_locking: create 10 threads and have them try to online/offline
random cpus, all in parallel. (This is what I was doing on a smaller
scale.) You'll get some expected errors (like cpu already up), but
the system should survive.
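A rough sketch of what hotplug_locking could look like, assuming the usual
sysfs interface (/sys/devices/system/cpu/cpuN/online) and root privileges.
The function names, the cpu list and the iteration count are made up for
illustration; failed writes are only recorded, since some of them are
expected.

```python
import os
import random
import threading


def toggle_cpus(cpu_dir, cpus, iterations, errors):
    """Write random 0/1 values to cpuN/online; record failures instead of raising."""
    for _ in range(iterations):
        cpu = random.choice(cpus)
        value = random.choice("01")
        path = os.path.join(cpu_dir, "cpu%d" % cpu, "online")
        try:
            with open(path, "w") as f:
                f.write(value)
        except OSError as e:
            # Expected failures (the kernel refusing a request) land here.
            errors.append((cpu, value, e.errno))


def hotplug_locking(cpu_dir="/sys/devices/system/cpu", cpus=(1, 2, 3),
                    nthreads=10, iterations=100):
    """Run nthreads togglers in parallel and return the collected errors."""
    errors = []
    threads = [
        threading.Thread(target=toggle_cpus,
                         args=(cpu_dir, list(cpus), iterations, errors))
        for _ in range(nthreads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors
```

The interesting result is not the error list but whether the machine is
still alive (and dmesg is clean) once all threads have finished.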
cpufreq: change cpufreq parameters on a cpu (toggling min/max
frequency?) while trying to online/offline that cpu from another
thread.
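Something along these lines might do for the cpufreq case. It is only a
sketch: it assumes the cpufreq sysfs file scaling_max_freq under
cpuN/cpufreq/, the two frequencies (in kHz) have to be values your hardware
supports, and all function names are hypothetical. Writes that fail while
the cpu is offline are recorded rather than treated as fatal.

```python
import os
import threading


def flip_freq_limit(cpufreq_dir, low_khz, high_khz, iterations, errors):
    """Alternate scaling_max_freq between two values."""
    path = os.path.join(cpufreq_dir, "scaling_max_freq")
    for i in range(iterations):
        value = low_khz if i % 2 == 0 else high_khz
        try:
            with open(path, "w") as f:
                f.write(str(value))
        except OSError as e:
            # The file may be missing or unwritable while the cpu is offline.
            errors.append(("cpufreq", e.errno))


def flip_online(online_path, iterations, errors):
    """Alternately offline and online one cpu."""
    for i in range(iterations):
        try:
            with open(online_path, "w") as f:
                f.write("0" if i % 2 == 0 else "1")
        except OSError as e:
            errors.append(("online", e.errno))


def cpufreq_vs_hotplug(cpu_path, low_khz, high_khz, iterations=100):
    """Race cpufreq changes against hotplug of the same cpu; return errors."""
    errors = []
    freq = threading.Thread(
        target=flip_freq_limit,
        args=(os.path.join(cpu_path, "cpufreq"), low_khz, high_khz,
              iterations, errors))
    plug = threading.Thread(
        target=flip_online,
        args=(os.path.join(cpu_path, "online"), iterations, errors))
    freq.start()
    plug.start()
    freq.join()
    plug.join()
    return errors
```

Called as, say, cpufreq_vs_hotplug("/sys/devices/system/cpu/cpu1", 800000,
1600000), with the frequencies adjusted to the machine at hand.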
suspend: swapoff -a, then proceed as in hotplug_locking, while
trying to suspend the machine to disk (it will immediately wake up
because no swap is available). This should be useful for pointing out
bugs in the suspend code. (It is quite tricky to set up, though, so you
may or may not want to do this one.)
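The suspend half of that test could be sketched as below, to be run
alongside the parallel online/offline threads. It assumes suspend-to-disk
is requested by writing "disk" to /sys/power/state and that swapoff is in
the PATH; the function name and its parameters are hypothetical. Only try
it on a machine you can afford to lose state on: if any swap is still
active, it really will suspend.

```python
import subprocess


def try_suspend(state_path="/sys/power/state", attempts=10, disable_swap=True):
    """Repeatedly request suspend-to-disk; return how many requests failed."""
    if disable_swap:
        # Without swap there is nowhere to put the image, so each attempt
        # should bounce straight back.
        subprocess.run(["swapoff", "-a"], check=True)
    failures = 0
    for _ in range(attempts):
        try:
            with open(state_path, "w") as f:
                f.write("disk")
        except OSError:
            # A refused suspend is the expected outcome here.
            failures += 1
    return failures
```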
> We've been running this testsuite fairly continuously for several
> months, and irregularly for about a year before that. We find that on
> some platforms like PPC64 it's quite robust, and on others there are
> issues, but the developers tend to be quick to provide fixes as the
> issues are found. I'm glad to see that the results are finally showing
> green for ia64.
Hmm, perhaps you should add ppc64 to the hotplug_report.html, so that
some green can be seen :-).
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html