Message-ID: <481678F5.7080504@jp.fujitsu.com>
Date: Tue, 29 Apr 2008 10:25:09 +0900
From: Hidetoshi Seto <seto.hidetoshi@...fujitsu.com>
To: Rusty Russell <rusty@...tcorp.com.au>, linux-kernel@...r.kernel.org
Subject: [PATCH 0/3] patches for stop_machine
Hi Rusty and all,
This is a proposal for minor improvements to kernel/stop_machine.c:
[PATCH 1/3] stop_machine: short exit path for if we cannot create enough threads
[PATCH 2/3] stop_machine: add timeout for child thread deployment
[PATCH 3/3] stop_machine: add stopmachine_timeout sysctl entry
The main topic is "how about adding a timeout to stop_machine?"
I think it would act as a safety net.
For example (a silly situation), the system can hang in the following way:
# ./silly.sh
run an evil loop task on AP
pid 6138's current affinity mask: ff
pid 6138's new affinity mask: fe
to pretend lock up, chrt -f -p 99 6138
loop[6138] is on CPU #4
to do stopmachine, try to off #7
echo 0 > /sys/devices/system/cpu/cpu7/online
(never return)
After applying this patch set, the hang is prevented:
# ./silly.sh
:
echo 0 > /sys/devices/system/cpu/cpu7/online
stopmachine: Failed to stop machine in time(5s). Are there any CPUs on file?
./silly.sh: line 22: echo: write error: Device or resource busy
offline is failed
OK, kill evil loop[6138]
try to off #7 again
echo 0 > /sys/devices/system/cpu/cpu7/online
CPU #7 is now offline
done!
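
To illustrate the idea outside the kernel, here is a minimal user-space sketch of "wait for every worker to check in, but give up after a deadline". It is not the patch code. The names wait_for_acks, worker_ack, now_ms and acked are hypothetical and chosen only for this example. Returning -EBUSY mirrors the "Device or resource busy" error shown in the output above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <time.h>

/* Hypothetical names throughout; this is NOT the kernel implementation. */

/* Number of workers that have reached the rendezvous point so far. */
static atomic_int acked;

/* A worker calls this once it is ready to be stopped. */
void worker_ack(void)
{
    atomic_fetch_add(&acked, 1);
}

/* Monotonic clock in milliseconds, unaffected by wall-clock changes. */
long long now_ms(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Wait until `expected` workers have acked, polling every millisecond.
 * Returns 0 if all workers checked in, or -EBUSY if timeout_ms elapsed
 * first (for example because one worker never got to run).
 */
int wait_for_acks(int expected, long timeout_ms)
{
    long long deadline = now_ms() + timeout_ms;
    struct timespec nap = { 0, 1000000 }; /* 1 ms */

    assert(expected > 0);
    while (atomic_load(&acked) < expected) {
        if (now_ms() >= deadline)
            return -EBUSY; /* give up instead of waiting forever */
        nanosleep(&nap, NULL);
    }
    return 0;
}
```

In this sketch the caller would undo any partial setup when it sees -EBUSY and report the failure, rather than spinning forever on a worker that is starved by a higher-priority task.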
Please refer to the description of each patch for details.
All comments are welcome.
Thanks,
H.Seto