[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <46411A36.2050609@yahoo.com.au>
Date: Wed, 09 May 2007 10:47:50 +1000
From: Nick Piggin <nickpiggin@...oo.com.au>
To: Satoru Takeuchi <takeuchi_satoru@...fujitsu.com>
CC: vatsa@...ibm.com, Rusty Russell <rusty@...tcorp.com.au>,
Linux Kernel <linux-kernel@...r.kernel.org>,
Zwane Mwaikambo <zwane@....linux.org.uk>,
Nathan Lynch <nathanl@...tin.ibm.com>,
Joel Schopp <jschopp@...tin.ibm.com>,
Ashok Raj <ashok.raj@...el.com>,
Heiko Carstens <heiko.carstens@...ibm.com>,
Gautham R Shenoy <ego@...ibm.com>,
Ingo Molnar <mingo@...e.hu>, paulmck@...ibm.com
Subject: Re: [BUG] cpu-hotplug: Can't offline the CPU with naughty realtime
processes
Satoru Takeuchi wrote:
> At Tue, 8 May 2007 22:18:50 +0530,
> Srivatsa Vaddagiri wrote:
>
>>On Tue, May 08, 2007 at 04:16:06PM +0900, Satoru Takeuchi wrote:
>>
>>>Sometimes I wonder at prio_array. It has 140 entries(from 0 to 139),
>>>and the meaning of each entry is as follows, I think.
>>>
>>>+-----------+-----------------------------------------------+
>>>| index | usage |
>>>+-----------+-----------------------------------------------+
>>>| 0 - 98 | RT processes are here. They are in the entry |
>>>| | whose index is 99 - sched_priority. |
>>
>>>>From sched.h:
>>
>>/*
>> * Priority of a process goes from 0..MAX_PRIO-1, valid RT
>> * priority is 0..MAX_RT_PRIO-1, and SCHED_NORMAL/SCHED_BATCH
>> * tasks are in the range MAX_RT_PRIO..MAX_PRIO-1.
>>
>>so shouldn't the index for RT processes be 0 - 99, given that
>>MAX_RT_PRIO = 100?
>
>
> However `man sched_priority' says...
>
>
> Processes scheduled with SCHED_OTHER or SCHED_BATCH must
> be assigned the static priority 0. Processes scheduled
> under SCHED_FIFO or SCHED_RR can have a static priority
> in the range 1 to 99. The system calls
> sched_get_priority_min() and sched_get_priority_max() can
> be used to find out the valid priority range for a
> scheduling policy in a portable way on all POSIX.1-2001
> conforming systems.
>
>
> and see the kernel/sched.c ...
>
>
> int sched_setscheduler(struct task_struct *p, int policy,
> struct sched_param *param)
> {
> ...
> /*
> * Valid priorities for SCHED_FIFO and SCHED_RR are
> * 1..MAX_USER_RT_PRIO-1, valid priority for SCHED_NORMAL and
> * SCHED_BATCH is 0.
> */
> if (param->sched_priority < 0 ||
> (p->mm && param->sched_priority > MAX_USER_RT_PRIO-1) ||
> (!p->mm && param->sched_priority > MAX_RT_PRIO-1))
> return -EINVAL;
> if (is_rt_policy(policy) != (param->sched_priority != 0))
> return -EINVAL;
> ...
> }
>
>
> So, if I want to set the rt_prio of a kernel_thread, we can't use this
> entry unless set t->prio to 99 directly. I don't know whether we are
> allowed to write such code bipassing sched_setscheduler(). In addition,
> even if kernel_thread can use this index , I can't understand it's usage.
> It can only be used by kernel, but its priority is LOWER than any real
> time thread.
>
> If the rule can be changed to the following...
>
> +-----------+-----------------------------------------------+
> | index | usage |
> +-----------+-----------------------------------------------+
> | 0 | RT processes are here. Only kernel can use |
> | | this entry. |
> +-----------+-----------------------------------------------+
> | 1 - 99 | RT processes are here. They are in the entry |
> | | whose index is 99 - sched_priority. |
> +-----------+-----------------------------------------------+
> | 100 - 139 | Ordinally processes are here. They are in the |
> | | entry whose index is (nice+120) +/- 5 |
> +-----------+-----------------------------------------------+
>
> ... there will be an entry only used by kernel and its priority is HIGHER
> than any user process, and I'll get happy :-)
We've seen the same problem with other stop_machine_run sites in the kernel.
module remove was one.
Reserving the top priority slot for stop machine (and migration thread, I
guess) isn't a bad idea.
--
SUSE Labs, Novell Inc.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists