Date:	Mon, 09 Nov 2009 12:35:49 -0700
From:	"Gregory Haskins" <ghaskins@...ell.com>
To:	"Ingo Molnar" <mingo@...e.hu>, <Lucas.de.marchi@...il.com>,
	"Steven Rostedt" <rostedt@...dmis.org>
Cc:	"Peter Zijlstra" <peterz@...radead.org>,
	<linux-kernel@...r.kernel.org>
Subject: Re: [lucas.de.marchi@...il.com: Bug when changing cpus_allowed
	 of RT tasks?]

Hi Lucas,

Ingo asked me to take a look at the problem you are reporting.  Is this a bug you are seeing in the
wild, or was it found by code inspection?  I took a look, and it looks to me like the original code
is correct, but I will keep an open mind.

In the meantime, I will try to provide some details about what is going on here.  Comments inline.

>>> On 11/8/2009 at 7:16 AM, in message <20091108121650.GB11372@...e.hu>, Ingo Molnar <mingo@...e.hu> wrote:

> FYI. If you reply then please do so on-list.
> 
> Thanks,
> 
> 	Ingo
> 
> ----- Forwarded message from Lucas De Marchi <lucas.de.marchi@...il.com> -----
> 
> Date: Mon, 26 Oct 2009 14:11:13 -0200
> From: Lucas De Marchi <lucas.de.marchi@...il.com>
> To: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...e.hu>
> Cc: linux-kernel@...r.kernel.org
> Subject: Bug when changing cpus_allowed of RT tasks?
> 
> I think there's a problem when balancing RT tasks: both find_lowest_rq_rt and
> select_task_rq_rt check for p->rt.nr_cpus_allowed being greater than 1 to
> wake a task in another cpu:
> 
> 	static int select_task_rq_rt(struct task_struct *p, int sd_flag, int flags)
> 	{
> 		[...]
> 		if (unlikely(rt_task(rq->curr)) &&
> 		    (p->rt.nr_cpus_allowed > 1)) {
> 			int cpu = find_lowest_rq(p);
> 
> 			return (cpu == -1) ? task_cpu(p) : cpu;
> 		}
> 		[...]
> 	}

So the intent of this logic is to say: if the CPU's current task is RT and the waking task is allowed to
run on more than one CPU, see if it can go elsewhere.  Otherwise, we do not try to move it at all.

> 
> 	static int find_lowest_rq(struct task_struct *task)
> 	{
> 		struct sched_domain *sd;
> 		struct cpumask *lowest_mask = __get_cpu_var(local_cpu_mask);
> 		int this_cpu = smp_processor_id();
> 		int cpu      = task_cpu(task);
> 		cpumask_var_t domain_mask;
> 
> 		if (task->rt.nr_cpus_allowed == 1)
> 			return -1; /* No other targets possible */
> 		[...]
> 	}

And the intent here is to say that if the task isn't allowed to move, we should not try to compute
another location, as task_cpu() is the only valid home.

> 
> 
> What happens if rt.nr_cpus_allowed continues to be 1, but the allowed cpu is
> not the one in which task is being woken up?

The one case that I can think of that would fall into this category is a sched_setaffinity() call.

If so, what happens is that we ultimately call set_cpus_allowed() -> set_cpus_allowed_ptr().  Inside
this function, the new mask is validated independently of the nr_cpus_allowed value, and we will
migrate the task away if it is no longer on a valid core.

Are there other cases?

> Since set_cpus_allowed_rt just changed the mask and updated the value of
> nr_cpus_allowed_rt, as far as I can see, it remains on the same CPU even if
> the new mask does not allow it.
> 
> A possible fix below, and in case it is right:
> 	Signed-off-by: Lucas De Marchi <lucas.de.marchi@...il.com>
> 
>

Until further evidence is presented, I have to respectfully NAK the patch: I do not think it's doing the
right thing, nor do I think the current code is actually broken.

I will keep an open mind, though, so if you think there is really a problem, let's discuss.
 
> 
> Lucas De Marchi
> 
> 
> --
> 
> diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c
> index a4d790c..531c946 100644
> --- a/kernel/sched_rt.c
> +++ b/kernel/sched_rt.c
> @@ -962,8 +962,7 @@ static int select_task_rq_rt(struct task_struct *p, int sd_flag, int flags)
>  	 * that is just being woken and probably will have
>  	 * cold cache anyway.
>  	 */
> -	if (unlikely(rt_task(rq->curr)) &&
> -	    (p->rt.nr_cpus_allowed > 1)) {
> +	if (unlikely(rt_task(rq->curr))) {
>  		int cpu = find_lowest_rq(p);
> 
>  		return (cpu == -1) ? task_cpu(p) : cpu;
> @@ -1177,7 +1176,8 @@ static int find_lowest_rq(struct task_struct *task)
>  	int cpu      = task_cpu(task);
>  	cpumask_var_t domain_mask;
> 
> -	if (task->rt.nr_cpus_allowed == 1)
> +	if ((task->rt.nr_cpus_allowed == 1) &&
> +			likely(cpumask_test_cpu(cpu, &task->cpus_allowed)))
>  		return -1; /* No other targets possible */
> 
>  	if (!cpupri_find(&task_rq(task)->rd->cpupri, task, lowest_mask))
> 
> ----- End forwarded message -----


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/