Message-ID: <87veezeodr.wl%takeuchi_satoru@jp.fujitsu.com>
Date: Fri, 11 May 2007 17:49:20 +0900
From: Satoru Takeuchi <takeuchi_satoru@...fujitsu.com>
To: Rusty Russell <rusty@...tcorp.com.au>
Cc: Satoru Takeuchi <takeuchi_satoru@...fujitsu.com>,
Linux Kernel <linux-kernel@...r.kernel.org>,
Srivatsa Vaddagiri <vatsa@...ibm.com>,
Zwane Mwaikambo <zwane@....linux.org.uk>,
Nathan Lynch <nathanl@...tin.ibm.com>,
Joel Schopp <jschopp@...tin.ibm.com>,
Ashok Raj <ashok.raj@...el.com>,
Heiko Carstens <heiko.carstens@...ibm.com>,
Gautham R Shenoy <ego@...ibm.com>
Subject: [PATCH 1/2] Fix stop_machine_run problem with naughty real time process
Hi,
I wrote patches which fix the problem regarding stop_machine_run() and
CPU hotplug.
stop_machine_run() can't accomplish its work if there is a real-time process
on the CPU on which the "kstopmachine" kernel thread is running. For more details,
please refer to the following thread:
http://lkml.org/lkml/2007/5/7/41
TEST RESULT:
I did the following test on my ia64 box. It works fine:
-------------------------------------------------------------------------------
# cat loop.sh
while true ; do
:
done
-------------------------------------------------------------------------------
# cat test_stop_machine_run_with_rt_proc.sh
#!/bin/sh
taskset 0x2 chrt -f 98 ./loop.sh &
PID=${!}
echo 0 >/sys/devices/system/cpu/cpu1/online
kill ${PID}
echo 1 >/sys/devices/system/cpu/cpu1/online
-------------------------------------------------------------------------------
To do the test, just issue the following command.
# ./test_stop_machine_run_with_rt_proc.sh
#
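The mask 0x2 passed to taskset in the script above has only bit 1 set, so the
busy loop is pinned to CPU1, the same CPU the script then takes offline. To
repeat the test against a different CPU, the mask can be derived from the CPU
number. This is only a minimal sketch; cpu_mask is a hypothetical helper that
is not part of the original test scripts:

```shell
#!/bin/sh
# cpu_mask (hypothetical helper): print the taskset affinity mask that
# selects exactly one CPU, i.e. 1 shifted left by the CPU number.
cpu_mask() {
    printf '0x%x\n' $((1 << $1))
}

cpu_mask 1    # prints 0x2, the mask used in the test script
```

With it, the taskset line of the test could read
taskset "$(cpu_mask 1)" chrt -f 98 ./loop.sh &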
TODO list
=========
Some more work is needed:
- If there is a SCHED_FIFO process with the maximum priority, stop_machine_run()
doesn't work because kstopmachine is never scheduled.
-> I'm trying to fix this problem; see the following:
http://lkml.org/lkml/2007/5/8/620
I will submit RFC patches within a week.
- On CPU hot removal, if that RT process is migrated to the CPU on which
stop_machine_run() is running, stop_machine_run can't continue to run.
-> I'm trying to fix this problem.
- Other problems involving stop_machine_run() and SCHED_FIFO processes might exist.
-> I have not yet investigated the other subsystems that use stop_machine_run().
# FYI, I'll be offline for 2 days.
Thanks,
Satoru
---
Fix stop_machine_run() problem with naughty real time process
stop_machine_run() does its work on the "kstopmachine" thread, which has the
maximum priority. However, that thread gets this priority only after it is
woken up. Therefore, in the following case ...
- "kstopmachine" tries to run on CPU1
- There is a real-time process on CPU1 which doesn't relinquish CPU time voluntarily
... "kstopmachine" can't start to run, and the CPU on which stop_machine_run() is running
hangs. To fix this problem, call sched_setscheduler() before waking up that thread.
Signed-off-by: Satoru Takeuchi <takeuchi_satoru@...fujitsu.com>
Index: linux-2.6.21/kernel/stop_machine.c
===================================================================
--- linux-2.6.21.orig/kernel/stop_machine.c 2007-05-11 13:45:34.000000000 +0900
+++ linux-2.6.21/kernel/stop_machine.c 2007-05-11 14:49:17.000000000 +0900
@@ -89,10 +89,6 @@ static void stopmachine_set_state(enum s
static int stop_machine(void)
{
int i, ret = 0;
- struct sched_param param = { .sched_priority = MAX_RT_PRIO-1 };
-
- /* One high-prio thread per cpu. We'll do this one. */
- sched_setscheduler(current, SCHED_FIFO, &param);
atomic_set(&stopmachine_thread_ack, 0);
stopmachine_num_threads = 0;
@@ -184,6 +180,10 @@ struct task_struct *__stop_machine_run(i
p = kthread_create(do_stop, &smdata, "kstopmachine");
if (!IS_ERR(p)) {
+ struct sched_param param = { .sched_priority = MAX_RT_PRIO-1 };
+
+ /* One high-prio thread per cpu. We'll do this one. */
+ sched_setscheduler(p, SCHED_FIFO, &param);
kthread_bind(p, cpu);
wake_up_process(p);
wait_for_completion(&smdata.done);