Message-Id: <1197932426.18713.129.camel@perihelion>
Date: Mon, 17 Dec 2007 18:00:26 -0500
From: Jon Masters <jcm@...hat.com>
To: Michal Schmidt <mschmidt@...hat.com>
Cc: linux-kernel@...r.kernel.org,
"Eric W. Biederman" <ebiederm@...ssion.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Satoru Takeuchi <takeuchi_satoru@...fujitsu.com>
Subject: Re: [PATCH] kthread: run kthreadd with max priority SCHED_FIFO
On Mon, 2007-12-17 at 23:43 +0100, Michal Schmidt wrote:
> kthreadd, the creator of other kernel threads, runs as a normal
> priority task. This is a potential for priority inversion when a task
> wants to spawn a high-priority kernel thread. A middle priority
> SCHED_FIFO task can block kthreadd's execution indefinitely and thus
> prevent the timely creation of the high-priority kernel thread.
>
> This causes a practical problem. When a runaway real-time task is
> eating 100% CPU and we attempt to put the CPU offline, sometimes we
> block while waiting for the creation of the highest-priority
> "kstopmachine" thread.
>
> The fix is to run kthreadd with the highest possible SCHED_FIFO
> priority. Its children must still run as slightly negatively reniced
> SCHED_NORMAL tasks.
>
> Signed-off-by: Michal Schmidt <mschmidt@...hat.com>
>
> diff --git a/kernel/kthread.c b/kernel/kthread.c
> index dcfe724..a7ce932 100644
> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -94,10 +94,17 @@ static void create_kthread(struct kthread_create_info *create)
>  	if (pid < 0) {
>  		create->result = ERR_PTR(pid);
>  	} else {
> +		struct sched_param param = { .sched_priority = 0 };
>  		wait_for_completion(&create->started);
>  		read_lock(&tasklist_lock);
>  		create->result = find_task_by_pid(pid);
>  		read_unlock(&tasklist_lock);
> +		/*
> +		 * We (kthreadd) run with SCHED_FIFO, but we don't want
> +		 * the kthreads we create to have it too by default.
> +		 */
> +		sched_setscheduler(create->result, SCHED_NORMAL, &param);
> +		set_user_nice(create->result, -5);
>  	}
>  	complete(&create->done);
>  }
> @@ -217,11 +224,12 @@ EXPORT_SYMBOL(kthread_stop);
>  int kthreadd(void *unused)
>  {
>  	struct task_struct *tsk = current;
> +	struct sched_param param = { .sched_priority = MAX_RT_PRIO - 1 };
>
>  	/* Setup a clean context for our children to inherit. */
>  	set_task_comm(tsk, "kthreadd");
>  	ignore_signals(tsk);
> -	set_user_nice(tsk, -5);
> +	sched_setscheduler(tsk, SCHED_FIFO, &param);
>  	set_cpus_allowed(tsk, CPU_MASK_ALL);
>
>  	current->flags |= PF_NOFREEZE;
I looked at this internally over the weekend.
Acked-by: Jon Masters <jcm@...hat.com>
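
For readers following the thread, the inversion described in the quoted
changelog can be illustrated with a toy single-CPU model. This is only a
sketch and not kernel code: every toy_* name is hypothetical, the hog's
priority of 50 is an arbitrary assumption, and the pick rule is a deliberate
simplification (any runnable FIFO task beats every normal task, and among
FIFO tasks the strictly higher priority wins).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model, not the kernel scheduler. */
enum toy_policy { TOY_NORMAL, TOY_FIFO };

struct toy_task {
	enum toy_policy policy;
	int rt_prio;	/* only meaningful for TOY_FIFO */
	int runnable;
};

/* Return 1 if task a should displace task b as the pick for the CPU. */
static int toy_beats(const struct toy_task *a, const struct toy_task *b)
{
	if (a->policy != b->policy)
		return a->policy == TOY_FIFO;
	/* Equal-priority FIFO tasks do not displace each other. */
	return a->policy == TOY_FIFO && a->rt_prio > b->rt_prio;
}

/* Pick the task that would run next on a single CPU. */
static const struct toy_task *toy_pick_next(const struct toy_task *tasks,
					    size_t n)
{
	const struct toy_task *best = NULL;
	size_t i;

	assert(tasks != NULL);
	for (i = 0; i < n; i++) {
		if (!tasks[i].runnable)
			continue;
		if (best == NULL || toy_beats(&tasks[i], best))
			best = &tasks[i];
	}
	return best;
}

/*
 * Model a runaway SCHED_FIFO hog next to a kthreadd that has a pending
 * thread-creation request. Returns 1 if kthreadd would get the CPU.
 */
int toy_kthreadd_gets_cpu(enum toy_policy policy, int rt_prio)
{
	struct toy_task tasks[2] = {
		{ TOY_FIFO, 50, 1 },	/* runaway hog; priority 50 is assumed */
		{ policy, rt_prio, 1 },	/* kthreadd */
	};

	return toy_pick_next(tasks, 2) == &tasks[1];
}
```

In this model toy_kthreadd_gets_cpu(TOY_NORMAL, 0) returns 0, which is the
stall the changelog describes, while toy_kthreadd_gets_cpu(TOY_FIFO, 99)
returns 1. The value 99 stands in for MAX_RT_PRIO - 1 on the assumption that
MAX_RT_PRIO is 100.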