Message-ID: <20171115192529.GA14158@zipoli.concurrent-rt.com>
Date:   Wed, 15 Nov 2017 14:25:29 -0500
From:   joe.korty@...current-rt.com
To:     Thomas Gleixner <tglx@...utronix.de>,
        Peter Zijlstra <a.p.zijlstra@...llo.nl>,
        Steven Rostedt <rostedt@...dmis.org>
Cc:     Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: [PATCH] 4.4.86-rt99: fix sync breakage between nr_cpus_allowed and
 cpus_allowed

4.4.86-rt99's patch
  0037-Intrduce-migrate_disable-cpu_light.patch
introduces a place where a task's cpus_allowed mask is
updated without a corresponding update to nr_cpus_allowed.
This path is executed when task affinity is changed while
__migrate_disabled(p) is true.  As there is no code present
to set nr_cpus_allowed when the migrate_disable state is
dropped, the scheduler may from that point on make incorrect
scheduling decisions for this task.

My testing consisted of temporarily adding a
 if (tsk_nr_cpus_allowed(p) != cpumask_weight(tsk_cpus_allowed(p)))
 	printk_ratelimited(...)
statement to schedule() and running a simple affinity rotation
program I wrote, one that rotates the threads of stress(1).
While rotating, I got the expected kernel error messages.
With this patch applied, the messages disappeared.
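
For readers who want to reproduce a similar load, the following is a
minimal user-space sketch of an affinity rotation loop. It is not the
program used in the testing above; the helper names next_cpu(),
pin_to_cpu() and rotate_affinity() are hypothetical, and the sketch
assumes CPUs 0..ncpus-1 are all online and permitted for the target.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: next CPU index in a round-robin over ncpus CPUs. */
int next_cpu(int cur, int ncpus)
{
	assert(ncpus > 0);
	return (cur + 1) % ncpus;
}

/*
 * Hypothetical helper: restrict thread tid (0 means the caller) to one CPU.
 * Returns 0 on success, -1 with errno set on failure, for example EINVAL
 * when the CPU is offline or excluded by a cpuset.
 */
int pin_to_cpu(pid_t tid, int cpu)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	return sched_setaffinity(tid, sizeof(set), &set);
}

/*
 * Move tid from CPU to CPU, one single-CPU mask per step, sleeping
 * delay_us microseconds between steps. Each step sends the task
 * through the kernel's affinity-change path.
 */
int rotate_affinity(pid_t tid, int ncpus, int steps, unsigned int delay_us)
{
	int cpu = 0;
	int i;

	for (i = 0; i < steps; i++) {
		if (pin_to_cpu(tid, cpu) != 0)
			return -1;
		usleep(delay_us);
		cpu = next_cpu(cpu, ncpus);
	}
	return 0;
}
```

A caller might invoke rotate_affinity() once per thread ID found under
/proc/<pid>/task for the stress(1) process, with ncpus taken from
sysconf(_SC_NPROCESSORS_ONLN).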

Signed-off-by: Joe Korty <joe.korty@...current-rt.com>

Index: b/kernel/sched/core.c
===================================================================
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1220,6 +1220,7 @@ void do_set_cpus_allowed(struct task_str
 	lockdep_assert_held(&p->pi_lock);
 
 	if (__migrate_disabled(p)) {
+		p->nr_cpus_allowed = cpumask_weight(new_mask);
 		cpumask_copy(&p->cpus_allowed, new_mask);
 		return;
 	}