lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87veezeodr.wl%takeuchi_satoru@jp.fujitsu.com>
Date:	Fri, 11 May 2007 17:49:20 +0900
From:	Satoru Takeuchi <takeuchi_satoru@...fujitsu.com>
To:	Rusty Russell <rusty@...tcorp.com.au>
Cc:	Satoru Takeuchi <takeuchi_satoru@...fujitsu.com>,
	Linux Kernel <linux-kernel@...r.kernel.org>,
	Srivatsa Vaddagiri <vatsa@...ibm.com>,
	Zwane Mwaikambo <zwane@....linux.org.uk>,
	Nathan Lynch <nathanl@...tin.ibm.com>,
	Joel Schopp <jschopp@...tin.ibm.com>,
	Ashok Raj <ashok.raj@...el.com>,
	Heiko Carstens <heiko.carstens@...ibm.com>,
	Gautham R Shenoy <ego@...ibm.com>
Subject: [PATCH 1/2] Fix stop_machine_run problem with naughty real time process

Hi,

I wrote patches which fixes the problem regarding stop_machine_run() and
cpu hotplug.

stop_machine_run() can't accomplish its work if there is a real time process
on the CPU on which "kstopmachine" kernel thread is running. For more details,
please refer to the following thread:

  http://lkml.org/lkml/2007/5/7/41

TEST RESULT:

I did the following test on my ia64 box. It works fine:

-------------------------------------------------------------------------------
# cat loop.sh
while true ; do
	:
done
-------------------------------------------------------------------------------
# cat test_stop_machine_run_with_rt_proc.sh
#!/bin/sh

taskset 0x2 chrt -f 98 ./loop.sh &
PID=${!}
echo 0 >/sys/devices/system/cpu/cpu1/online
kill ${PID}
echo 1 >/sys/devices/system/cpu/cpu1/online
-------------------------------------------------------------------------------

To do the test, just issue the following command.

# ./test_stop_machine_run_with_rt_proc.sh
# 

TODO list
=========

Some more works are needed. See the TODO list.

 - If there is a SCHED_FIFO process having max priority, stop_machine_run doesn't
   work because kstopmachine doesn't be scheduled.

     -> I'm trying to fix this problem, see the followings:

        http://lkml.org/lkml/2007/5/8/620

        I would submit RFC patches in 1 weeks.

 - On CPU hot removal, if that RT process is migrated to the CPU on which
   stop_machine_run() is running, stop_machine_run can't continue to run.

     -> I'm trying to fix this problem.

 - Other `stop_machine_run() with FIFO` problem might exist.

     -> I've not research other subsystem using stop_machine_run yet.


# FYI, I'll be offline for 2 days.

Thanks,

Satoru

---
Fix stop_machine_run() problem with naughty real time process

stop_machine_run() does its work on "kstopmachine" thread having max priority.
However that thread get such priority after woken up. Therefore, in the
following case ...

  - "kstopmachine" try to run on CPU1
  - There is a real time process which doesn't relinquish CPU time voluntary on CPU1

... "kstopmachine" can't start to run and the CPU on which stop_machine_run() is runing
hangs up. To fix this problem, call sched_setscheduler() before waking up that thread.

Signed-off-by: Satoru Takeuchi <takeuchi_satoru@...fujitsu.com>

Index: linux-2.6.21/kernel/stop_machine.c
===================================================================
--- linux-2.6.21.orig/kernel/stop_machine.c	2007-05-11 13:45:34.000000000 +0900
+++ linux-2.6.21/kernel/stop_machine.c	2007-05-11 14:49:17.000000000 +0900
@@ -89,10 +89,6 @@ static void stopmachine_set_state(enum s
 static int stop_machine(void)
 {
 	int i, ret = 0;
-	struct sched_param param = { .sched_priority = MAX_RT_PRIO-1 };
-
-	/* One high-prio thread per cpu.  We'll do this one. */
-	sched_setscheduler(current, SCHED_FIFO, &param);
 
 	atomic_set(&stopmachine_thread_ack, 0);
 	stopmachine_num_threads = 0;
@@ -184,6 +180,10 @@ struct task_struct *__stop_machine_run(i
 
 	p = kthread_create(do_stop, &smdata, "kstopmachine");
 	if (!IS_ERR(p)) {
+		struct sched_param param = { .sched_priority = MAX_RT_PRIO-1 };
+		
+		/* One high-prio thread per cpu.  We'll do this one. */
+		sched_setscheduler(p, SCHED_FIFO, &param);
 		kthread_bind(p, cpu);
 		wake_up_process(p);
 		wait_for_completion(&smdata.done);
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ