Message-ID: <30ab713901ef0e1f23c1ca387373788a4a73639f.camel@redhat.com>
Date: Fri, 13 Dec 2019 00:44:22 -0600
From: Scott Wood <swood@...hat.com>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
LKML <linux-kernel@...r.kernel.org>
Cc: linux-rt-users <linux-rt-users@...r.kernel.org>,
Steven Rostedt <rostedt@...dmis.org>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH RT] sched: migrate_enable: Busy loop until the migration
request is completed
On Thu, 2019-12-12 at 12:27 +0100, Sebastian Andrzej Siewior wrote:
> If a user task changes the CPU affinity mask of a running task, a
> migration request will be dispatched if the current CPU is no longer
> allowed. This might happen shortly before the task enters a
> migrate_disable() section.
> Upon leaving the migrate_disable() section, the task will notice that
> the current CPU is no longer allowed and will dispatch its own
> migration request to move itself off the current CPU.
> While invoking __schedule() the first migration request will be
> processed and the task returns on the "new" CPU with "arg.done = 0". Its
> own migration request will be processed shortly afterwards and will
> result in memory corruption if the stack memory allocated for the
> request has been reused in the meantime.
Ugh.
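In case it helps anyone reading along, here's a rough userspace analogue
of the same stack-lifetime bug (illustrative sketch only, with made-up
names -- not the actual kernel path):

/*
 * An async worker is handed a pointer to stack memory, and the
 * caller's frame can be reused before the worker's completion
 * store lands.  Build with: gcc -pthread race.c
 */
#include <pthread.h>
#include <unistd.h>

struct req {
	int done;			/* worker sets this when finished */
};

static void *worker(void *p)
{
	struct req *r = p;

	usleep(10000);			/* runs "shortly after", like the
					 * second migration request */
	r->done = 1;			/* late store into the caller's
					 * (long gone) stack slot */
	return NULL;
}

static void dispatch_and_return_early(pthread_t *t)
{
	struct req r = { 0 };		/* stack-allocated, like the
					 * on-stack migration_arg */

	pthread_create(t, NULL, worker, &r);
	/* returns without waiting for r.done: &r now dangles */
}

static void reuse_stack(void)
{
	volatile int buf[16];		/* likely occupies the stack
					 * region where 'r' used to be */
	int i;

	for (i = 0; i < 16; i++)
		buf[i] = 0;		/* the worker's r->done = 1 can
					 * clobber one of these slots */
}

int main(void)
{
	pthread_t t;

	dispatch_and_return_early(&t);
	reuse_stack();
	pthread_join(t, NULL);
	return 0;
}

The kernel case has the same shape: stop_one_cpu_nowait() is handed &arg
and &work on the stack, and the stopper's completion store into arg.done
can land after migrate_enable()'s frame has been reused.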
> Spin until the migration request has been processed if it was accepted.
>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
> ---
> kernel/sched/core.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 8bea013b2baf5..5c7be96ca68c4 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -8227,7 +8227,7 @@ void migrate_enable(void)
>
> WARN_ON(smp_processor_id() != cpu);
> if (!is_cpu_allowed(p, cpu)) {
> - struct migration_arg arg = { p };
> + struct migration_arg arg = { .task = p };
> struct cpu_stop_work work;
> struct rq_flags rf;
>
> @@ -8239,7 +8239,10 @@ void migrate_enable(void)
> stop_one_cpu_nowait(task_cpu(p), migration_cpu_stop,
> &arg, &work);
> __schedule(true);
> - WARN_ON_ONCE(!arg.done && !work.disabled);
> + if (!work.disabled) {
> + while (!arg.done)
> + cpu_relax();
> + }
We should enable preemption while spinning -- besides the general badness
of spinning with it disabled, there could be deadlock scenarios if
multiple CPUs are spinning in such a loop. Longer term, maybe have a way
to dequeue the no-longer-needed work instead of waiting.
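Something like the following, perhaps (untested sketch; it assumes
briefly re-enabling preemption at this point is otherwise safe, and
READ_ONCE() is only there to keep the compiler from hoisting the load
out of the loop):

	if (!work.disabled) {
		while (!READ_ONCE(arg.done)) {
			/* let the stopper (or another spinner) run */
			preempt_enable();
			cpu_relax();
			preempt_disable();
		}
	}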
-Scott