Message-ID: <20141209123556.02cc99c0@thinkpad-w530>
Date:	Tue, 9 Dec 2014 12:35:56 +0100
From:	David Hildenbrand <dahi@...ux.vnet.ibm.com>
To:	Heiko Carstens <heiko.carstens@...ibm.com>
Cc:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	linux-kernel@...r.kernel.org, borntraeger@...ibm.com,
	rafael.j.wysocki@...el.com, peterz@...radead.org, oleg@...hat.com,
	bp@...e.de, jkosina@...e.cz
Subject: Re: [PATCH v2] CPU hotplug: active_writer not woken up in some
 cases - deadlock

> On Tue, Dec 09, 2014 at 11:11:01AM +0100, David Hildenbrand wrote:
> > > > Therefore we have to move the condition check inside the 
> > > >   __set_current_state(TASK_UNINTERRUPTIBLE) -> schedule();
> > > > section to not miss any wake ups when the condition is satisfied.
> > > > 
> > > > So wake_up_process() will either see TASK_RUNNING and do nothing or see
> > > > TASK_UNINTERRUPTIBLE and set it to TASK_RUNNING, so schedule() will in
> > > > fact be woken up again.
> > > 
> > > Or the third alternative would be that 'active_writer' which was running
> > > on CPU2 already terminated and wake_up_process() has a non-NULL pointer to
> > > task_struct which is already dead.
> > > Or is there anything that prevents this use-after-free race?
> > 
> > Hmmm ... I think that is also a valid scenario.
> > That would mean we need something like this:
> > 
> >  void put_online_cpus(void)
> >  {
> > + struct task_struct *awr;
> > +
> >         if (cpu_hotplug.active_writer == current)
> >                 return;
> >         if (!mutex_trylock(&cpu_hotplug.lock)) {
> > +         awr = ACCESS_ONCE(cpu_hotplug.active_writer);
> > +         if (unlikely(awr))
> > +                 get_task_struct(awr);
> 
> How would this solve the problem?

Although this might fix the problem you pointed out, it exposes another one:

CPU1                               CPU2
----------------------------------------------------------------------------
!mutex_trylock(&cpu_hotplug.lock) |
cpu_hotplug.active_writer == 0    |
awr = 0;                          |
                                  | cpu_hotplug.active_writer = current
                                  | __set_current_state(TASK_UNINTERRUPTIBLE);
                                  | cpu_hotplug.puts_pending == 0
cpu_hotplug.puts_pending++;       | ...
                                  | schedule();
/* no wakeup as awr == 0 */

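So we really need to do cpu_hotplug.puts_pending++ before checking
cpu_hotplug.active_writer. That in turn can lead to the active_writer
task_struct vanishing: the writer can see the increment, finish, and exit
between our read of active_writer and the wake_up_process() call.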
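
The ordering argument can be checked with a small user-space model. This is
only a sketch, not kernel code: struct hp_model, enum put_order,
writer_prepare() and writer_stays_blocked() are hypothetical names made up for
illustration. It assumes the writer's three steps (set active_writer, set
TASK_UNINTERRUPTIBLE, check puts_pending) run as one block in one of three
slots relative to the reader's two steps, and it ignores memory ordering and
the use-after-free question entirely.

```c
#include <stdbool.h>

/* Which of its two steps the reader (put_online_cpus) performs first. */
enum put_order { CHECK_WRITER_FIRST, INCREMENT_FIRST };

/* Hypothetical model of the relevant cpu_hotplug state; not kernel code. */
struct hp_model {
	int  active_writer;   /* 0 = none, 1 = writer registered */
	int  puts_pending;
	bool uninterruptible; /* writer state is TASK_UNINTERRUPTIBLE */
};

/* Writer side up to (not including) schedule(); true if it wants to sleep. */
static bool writer_prepare(struct hp_model *m)
{
	m->active_writer = 1;
	m->uninterruptible = true;
	return m->puts_pending == 0;
}

/*
 * Replay one interleaving. writer_slot says where the writer block runs:
 * 0 = before both reader steps, 1 = between them, 2 = after both.
 * Returns true if the writer ends up in schedule() with nobody waking it.
 */
bool writer_stays_blocked(enum put_order order, int writer_slot)
{
	struct hp_model m = {0};
	bool wants_sleep = false;
	int awr = 0;

	for (int step = 0; step < 2; step++) {
		if (writer_slot == step)
			wants_sleep = writer_prepare(&m);

		bool do_increment = (order == INCREMENT_FIRST) ? (step == 0)
							       : (step == 1);
		if (do_increment)
			m.puts_pending++;
		else
			awr = m.active_writer; /* reader samples active_writer */
	}
	if (writer_slot == 2)
		wants_sleep = writer_prepare(&m);

	if (awr) /* models wake_up_process(): back to TASK_RUNNING */
		m.uninterruptible = false;

	return wants_sleep && m.uninterruptible;
}
```

In this model writer_stays_blocked(CHECK_WRITER_FIRST, 1) is the interleaving
from the table above and is the only combination that leaves the writer
blocked. With INCREMENT_FIRST none of the three slots does.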

So IMHO we can't get around a lock protecting cpu_hotplug.active_writer.
Otherwise we would have to revert the original patch, but that one addressed
an RCU problem.

Opinions?

David
