Message-ID: <20151009164914.GA11947@redhat.com>
Date: Fri, 9 Oct 2015 18:49:14 +0200
From: Oleg Nesterov <oleg@...hat.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: heiko.carstens@...ibm.com, Tejun Heo <tj@...nel.org>,
Ingo Molnar <mingo@...nel.org>, Rik van Riel <riel@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
linux-kernel@...r.kernel.org,
Vitaly Kuznetsov <vkuznets@...hat.com>
Subject: Re: [PATCH 3/3] sched: start stopper early
On 10/09, Oleg Nesterov wrote:
>
> From: Peter Zijlstra <peterz@...radead.org>
Peter, I tried to compromise you.
> case CPU_ONLINE:
> + stop_machine_unpark(cpu);
> /*
> * At this point a starting CPU has marked itself as online via
> * set_cpu_online(). But it might not yet have marked itself
> @@ -5337,7 +5340,7 @@ static int sched_cpu_active(struct notifier_block *nfb,
> * Thus, fall-through and help the starting CPU along.
> */
> case CPU_DOWN_FAILED:
> - set_cpu_active((long)hcpu, true);
> + set_cpu_active(cpu, true);
On second thought, we can't do this (and your initial change has
the same problem).

We cannot wake up the stopper thread before set_cpu_active(). This
can lead to the same problem fixed by dd9d3843755da95f6 "sched: Fix
cpu_active_mask/cpu_online_mask race": the stopper thread can hit
BUG_ON(td->cpu != smp_processor_id()) in smpboot_thread_fn().

This is easy to fix: CPU_ONLINE should call set_cpu_active() itself
and not fall through to CPU_DOWN_FAILED,

	case CPU_ONLINE:
		set_cpu_active(cpu, true);
		stop_machine_unpark(cpu);
		break;

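To make the ordering concrete, here is a small userspace mock, not
kernel code. The enum, the bool arrays, the unparked_while_inactive
flag and the bring_up() helper are all hypothetical stand-ins invented
for illustration; only the names set_cpu_active(), stop_machine_unpark()
and sched_cpu_active() are borrowed from the patch, and their bodies
here are stubs. It only checks that CPU_ONLINE marks the CPU active
before the stopper is unparked and no longer falls through.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical stand-ins for the hotplug notifier actions. */
enum hp_action { CPU_ONLINE, CPU_DOWN_FAILED, CPU_OTHER };

static bool cpu_active_mask[NR_CPUS];   /* mock of cpu_active_mask */
static bool stopper_unparked[NR_CPUS];  /* mock stopper thread state */
static bool unparked_while_inactive;    /* set if the ordering is violated */

/* Stub: the real function updates the kernel's cpu_active_mask. */
static void set_cpu_active(int cpu, bool active)
{
	cpu_active_mask[cpu] = active;
}

/* Stub: records whether the stopper was woken on an inactive CPU. */
static void stop_machine_unpark(int cpu)
{
	if (!cpu_active_mask[cpu])
		unparked_while_inactive = true;
	stopper_unparked[cpu] = true;
}

/* Mock notifier: CPU_ONLINE sets active first, then unparks, then breaks. */
static int sched_cpu_active(enum hp_action action, int cpu)
{
	switch (action) {
	case CPU_ONLINE:
		set_cpu_active(cpu, true);
		stop_machine_unpark(cpu);
		break;
	case CPU_DOWN_FAILED:
		set_cpu_active(cpu, true);
		break;
	default:
		break;
	}
	return 0;
}

/* Bring one mock CPU online and check the ordering invariant. */
static void bring_up(int cpu)
{
	sched_cpu_active(CPU_ONLINE, cpu);
	assert(cpu_active_mask[cpu] && stopper_unparked[cpu]);
	assert(!unparked_while_inactive);
}
```

Calling bring_up(1) from a test harness exercises the CPU_ONLINE path;
swapping the two calls in that case makes the second assert fire, which
is the ordering the original patch had.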
But this is one more sign that stop_two_cpus() must not rely on
cpu_active().

Right?

Oleg.