Message-ID: <4997672B.1000301@fatooh.org>
Date: Sat, 14 Feb 2009 16:51:55 -0800
From: Corey Hickey <bugfood-ml@...ooh.org>
To: linux-kernel@...r.kernel.org
Subject: RT scheduling and a way to make a process hang, unkillable
Hello,
I've encountered a bit of a problem in recent kernels that include
"Group scheduling for SCHED_RR/FIFO": it is possible for a process run
by root to hang itself and become unkillable--even by a 'kill -9'.
The following kernel options must be set:

    CONFIG_GROUP_SCHED=y
    CONFIG_RT_GROUP_SCHED=y
    CONFIG_USER_SCHED=y

The procedure is for a program to:
1. run as root
2. set SCHED_FIFO
3. change UID to a user with no realtime CPU share allocated
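
The three steps above can be sketched in C roughly as follows. This is
only an illustration written for this description, not the test program
attached to this message; the function names fifo_priority_valid() and
fifo_then_drop_and_exec() are hypothetical, and the caller is assumed to
already be running as root (step 1).

```c
#include <errno.h>
#include <sched.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if prio is a legal SCHED_FIFO priority on this system. */
int fifo_priority_valid(int prio)
{
    int lo = sched_get_priority_min(SCHED_FIFO);
    int hi = sched_get_priority_max(SCHED_FIFO);

    return lo != -1 && hi != -1 && prio >= lo && prio <= hi;
}

/*
 * Hypothetical helper: performs steps 2 and 3, then execs `path`.
 * Assumes the caller is root (step 1). Returns -1 with errno set if
 * any step fails; on success execl() does not return. On an affected
 * kernel the process may stop making progress once the UID changes.
 */
int fifo_then_drop_and_exec(int prio, uid_t target_uid, const char *path)
{
    struct sched_param sp = { .sched_priority = prio };

    if (!fifo_priority_valid(prio)) {
        errno = EINVAL;
        return -1;
    }

    /* Step 2: switch the calling process to SCHED_FIFO. */
    if (sched_setscheduler(0, SCHED_FIFO, &sp) == -1)
        return -1;

    /* Step 3: as root, setuid() changes real, effective and saved UID. */
    if (setuid(target_uid) == -1)
        return -1;

    execl(path, path, (char *)NULL);
    return -1; /* reached only if execl() failed */
}
```

A caller would use something like fifo_then_drop_and_exec(1, 65534,
"/bin/bash"), where 65534 is assumed to be the UID of "nobody".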
I'm attaching a test program that does exactly this. Run it with no
arguments or examine the print_usage() function to see detailed
information. Briefly, though, it should be run as root with the path to
a program to exec, like:
# ./hangme /bin/bash
The program hangs in a "running" state, like this:
nobody 4357 0.0 0.0 904 16 pts/1 R+ 16:09 0:00 /bin/bash
The only way to kill the program is to allocate some realtime CPU share
to the corresponding user:
echo 10000 > /sys/kernel/uids/65534/cpu_rt_runtime
This may or may not actually be a bug, but I think it's at least
confusing and unexpected. I had a difficult time narrowing this down
from a problem I was having with Debian's slmodemd package. I think it
would be much nicer for setuid() to return an error if the process is
realtime and the target user doesn't have any CPU share allocated (if
that's feasible).
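
Until something like that exists in the kernel, a program could make a
similar check itself in user space before calling setuid(). The
following is a minimal sketch under the assumption that the per-user
sysfs file shown above is present and holds a single integer; the
helper names are hypothetical, and a value that cannot be read is
reported as "unknown" rather than guessed at.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Builds "/sys/kernel/uids/<uid>/cpu_rt_runtime" into buf.
 * Returns the path length, or -1 if buf is too small.
 */
int rt_runtime_path(uid_t uid, char *buf, size_t len)
{
    int n = snprintf(buf, len, "/sys/kernel/uids/%lu/cpu_rt_runtime",
                     (unsigned long)uid);

    return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Parses the file contents; returns 0 on success, -1 on bad input. */
int parse_rt_runtime(const char *text, long *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(text, &end, 10);
    if (errno != 0 || end == text)
        return -1;
    *out = v;
    return 0;
}

/*
 * Returns 1 if the user has a positive realtime runtime, 0 if it is
 * zero or negative (treated conservatively as "no share"), and -1 if
 * the value could not be read.
 */
int uid_has_rt_share(uid_t uid)
{
    char path[64], line[32];
    long runtime;
    FILE *f;

    if (rt_runtime_path(uid, path, sizeof path) == -1)
        return -1;
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fgets(line, sizeof line, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    if (parse_rt_runtime(line, &runtime) == -1)
        return -1;
    return runtime > 0;
}
```

A realtime process could call uid_has_rt_share() on the target UID and,
if the result is 0, either refuse to change UID or switch back to
SCHED_OTHER first. The check is inherently racy, since the share can
change between the read and the setuid() call.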
This problem is similar in principle to a bug reported by Rafael J.
Wysocki on 2008-02-01, which was subsequently fixed:
http://lkml.org/lkml/2008/1/31/490
http://lkml.org/lkml/2008/2/4/332
If I understand correctly, that was a case in which a program would hang
by doing the following:
1. run setuid-root
2. set SCHED_FIFO
3. change effective UID to match real UID
The difference in my case is that the program is running with root's
real UID as well as effective UID, so, at the time SCHED_FIFO is set,
there's no reason to deny realtime priority. My program changes real UID
_after_ setting SCHED_FIFO, and that's what causes the hang.
I've run my test program, with the same results, on the following kernels:
2.6.26
2.6.28
2.6.29-rc5
Warning! Under 2.6.28 it is impossible to allocate CPU share to users,
so the program cannot be killed at all:
http://lkml.org/lkml/2009/1/14/113
I'm also attaching my kernel configuration. Please let me know if you'd
like more information or for me to test a patch.
Thank you,
Corey
View attachment "hangme.c" of type "text/x-csrc" (2715 bytes)
Download attachment "config-2.6.29-rc5.bz2" of type "application/octet-stream" (6606 bytes)