Message-ID: <20141209120442.29de5b22@thinkpad-w530>
Date: Tue, 9 Dec 2014 12:04:42 +0100
From: David Hildenbrand <dahi@...ux.vnet.ibm.com>
To: Heiko Carstens <heiko.carstens@...ibm.com>
Cc: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
linux-kernel@...r.kernel.org, borntraeger@...ibm.com,
rafael.j.wysocki@...el.com, peterz@...radead.org, oleg@...hat.com,
bp@...e.de, jkosina@...e.cz
Subject: Re: [PATCH v2] CPU hotplug: active_writer not woken up in some
cases - deadlock
> On Tue, Dec 09, 2014 at 11:11:01AM +0100, David Hildenbrand wrote:
> > > > Therefore we have to move the condition check inside the
> > > > __set_current_state(TASK_UNINTERRUPTIBLE) -> schedule();
> > > > section to not miss any wake ups when the condition is satisfied.
> > > >
> > > > So wake_up_process() will either see TASK_RUNNING and do nothing or see
> > > > TASK_UNINTERRUPTIBLE and set it to TASK_RUNNING, so schedule() will in
> > > > fact be woken up again.
> > >
> > > Or the third alternative would be that 'active_writer' which was running
> > > on CPU2 already terminated and wake_up_process() has a non-NULL pointer to
> > > task_struct which is already dead.
> > > Or is there anything that prevents this use-after-free race?
> >
> > Hmmm ... I think that is also a valid scenario.
> > That would mean we need something like this:
> >
> > void put_online_cpus(void)
> > {
> > + struct task_struct *awr;
> > +
> > if (cpu_hotplug.active_writer == current)
> > return;
> > if (!mutex_trylock(&cpu_hotplug.lock)) {
> > + awr = ACCESS_ONCE(cpu_hotplug.active_writer);
> > + if (unlikely(awr))
> > + get_task_struct(awr);
>
> How would this solve the problem?
If I am not completely wrong, an active_writer will remain in
its loop (cpu_hotplug_begin) until the refcount is down to 0. As we are
putting the cpus, the refcount is > 0 (because of the previous get_online_cpus()
which incremented the refcount).
cpu_hotplug_begin can only exit once refcount == 0, which in our special case
means only after cpu_hotplug.puts_pending has been incremented.
As long as we don't increment cpu_hotplug.puts_pending, the active_writer will
not vanish. Therefore awr still points to a valid task struct after we
incremented cpu_hotplug.puts_pending.
get_task_struct() will make sure that the struct will not vanish after we
incremented cpu_hotplug.puts_pending (and therefore decremented the refcount).
Or am I missing something?
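
To make the ordering argument concrete, here is a small user-space model of
it. This is only a sketch and not kernel code: struct task, struct hotplug and
all helper names below are hypothetical stand-ins, the interleaving is done by
hand in a single thread, and the writer step assumes that cpu_hotplug_begin()
folds puts_pending into refcount before it checks for zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for task_struct: only the usage count matters here. */
struct task {
	int usage;	/* number of references held on the task */
	bool freed;	/* set instead of freeing, so the model can inspect it */
};

/* Models get_task_struct(): take one more reference. */
static void get_task(struct task *t)
{
	t->usage++;
}

/* Models put_task_struct(): the last reference releases the task. */
static void put_task(struct task *t)
{
	if (--t->usage == 0)
		t->freed = true;
}

/* Hypothetical stand-in for the cpu_hotplug state discussed above. */
struct hotplug {
	struct task *active_writer;
	int puts_pending;
	int refcount;
};

/*
 * Reader slow path (mutex_trylock() failed): pin the writer first and
 * only then publish the put that allows the writer to leave its loop.
 */
static struct task *reader_put_slowpath(struct hotplug *hp)
{
	struct task *awr = hp->active_writer;

	if (awr)
		get_task(awr);
	hp->puts_pending++;
	return awr;
}

/* Writer side: fold pending puts into refcount, report whether it may exit. */
static bool writer_may_proceed(struct hotplug *hp)
{
	hp->refcount -= hp->puts_pending;
	hp->puts_pending = 0;
	return hp->refcount == 0;
}

/* Reader afterwards: the wake-up would happen here, then drop the pin. */
static void reader_wake_and_unpin(struct task *awr)
{
	if (awr)
		put_task(awr);
}
```

In this model the writer cannot proceed before puts_pending is incremented, so
the pointer read by the reader is still valid when get_task() runs, and the
task is only released after the reader drops its pin.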
Thanks!
David