lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 3 Mar 2008 17:32:52 +0100
From:	"Rafael J. Wysocki" <rjw@...k.pl>
To:	Alan Stern <stern@...land.harvard.edu>
Cc:	Linux-pm mailing list <linux-pm@...ts.linux-foundation.org>,
	Kernel development list <linux-kernel@...r.kernel.org>,
	Alexey Starikovskiy <astarikovskiy@...e.de>
Subject: Re: [linux-pm] Fundamental flaw in system suspend, exposed by freezer removal

On Monday, 3 of March 2008, Alan Stern wrote:
> On Sun, 2 Mar 2008, Rafael J. Wysocki wrote:
[--snip--]
> 
> > > The prevent_new_children methods should be called in a separate initial
> > > forward pass through the dpm_active list -- rather like the reverted
> > > lock_all_devices() routine.  Similarly for the allow_new_children 
> > > methods (a final backward pass like unlock_all_devices()).
> > 
> > Agreed.
> 
> After more thought, I'm not so sure about this.  It might be a good
> idea to call the begin_sleep method just before the suspend method (or
> any of its variants: freeze, hibernate, prethaw, etc.) and call the
> end_sleep method just after the resume method.  This minimizes the time
> drivers will spend in a peculiar non-hotplug-aware state, although it 
> means that begin_sleep would have to be idempotent.
> 
> It also allows sophisticated drivers to do all their processing in the
> begin_sleep (and end_sleep) method: both preventing new child
> registrations and powering down the device.  At the moment I'm not sure
> whether this would turn out to be a good strategy, but it might.

Well, I think there should be a window between ->suspend_begin()
and ->suspend() allowing the core to cancel the suspending of given
device and to select another one.  For example, if there's a child
registration concurrent wrt ->suspend_begin() which completes after
->suspend_begin() has been called, but before ->suspend_begin() has a chance
to block it, the core should not call ->suspend() for the device, but select
another one (the child).

Of course, it won't be necessary if the ->suspend_begin() methods are called
in an initial forward pass through dpm_active.

> Alternatively, some subsystems might want to stop all child
> registrations right at the beginning of the sleep transition.  This
> would amount to blocking the kernel thread responsible for those
> registrations when the sleep begins -- in other words, making that
> thread freezable!

Well, I think the possibility to freeze kernel threads on demand may be
considered as one of suspend synchronization mechanisms available to drivers.

[--snip--] 
> > > Actually you can simplify the whole thing by getting rid of 
> > > dev->power.lock entirely.  Protect the "sleeping" flag by dpm_list_mtx.
> > 
> > Then, if a child registration occurs while suspend_device(parent) is being
> > run, dpm_active will be in a wrong order, unless suspend_device(parent) returns
> > an error (which, BTW, also imposes a new rule on drivers).
> 
> Sorry, I don't follow you.
> 
> I meant doing something approximately like this:
> 
> 	mutex_lock(&dpm_list_mtx);
> 	while (!list_is_empty(&dpm_active)) {
> 		dev = dpm_active.prev;
> 		dev->power.sleeping = true;
> 		mutex_unlock(&dpm_list_mtx);
> 		error = suspend_device(dev);
> 		mutex_lock(&dpm_list_mtx);
> 		...
> 	}
> 
> and make device_pm_add() fail if dev->parent->power.sleeping is true.
> This will do what you want, right?

Yes, that should work.  I'll rework the patch along these lines (it starts to
look really simple now :-); I'll post the new version in a separate thread).

> With the begin_sleep method call added it gets a little more
> complicated, since you have to reacquire dpm_list_mtx after calling the
> begin_sleep method and before setting dev->power.sleeping to true, in
> order to make sure that dev is still the last entry on dpm_active.  
> But it's quite doable.

Yes, it is.

> (BTW, I wonder if it's a good idea for device_add() to call 
> device_pm_add() before calling bus_add_device().  If a suspend occurs 
> in between, we could end up in a strange situation with a driver being 
> asked to suspend a device before that device has been fully registered 
> -- in fact the registration might still fail.)

That's correct.  Perhaps we should change device_add().

> > I'm not sure the "GONE" state willl really be necessary.
> 
> I'm not sure either.  You can get the same effect by checking 
> list_empty(&dev->power.entry).
> 
> > Well, we can add new callbacks for notifying drivers of an impending suspend.
> > In that case, say we add a ->begin() callback for this purpose (in fact that
> > would be two callbacks, ->suspend_begin() and ->hibernate_begin(), but let's
> > simplify things a bit for now), so there are the following questions:
> 
> In theory you could even expand it to freeze_begin and prethaw_begin.

I meant ->hibernate_begin() as ->freeze_begin() and I don't see any reason why
->prethaw_begin() would be different from it (the difference between ->freeze()
and ->prethaw() -- now called ->quiesce(), btw -- is that the former saves
device settings and the latter doesn't).
 
> > * Is it going to return a result?
> > * If it is, should we fail the suspend if an error is returned?
> > * In that case, should the ->suspend() callback return a result?
> 
> To be safe, everything should return a result and we should abort the 
> sleep if anything returns an error.
> 
> It's easier to ignore a return code now than to change a method
> signature later.  :-)
> 
> > * Perhaps we can require ->suspend() to always succeed after ->begin() has
> >   succeeded?
> 
> No.  Some drivers might implement just one and some drivers just the 
> other.

I thought ->suspend() would be mandatory, even if it's to be empty.

> > OTOH, IMO requiring ->suspend() to return an error if there's been a concurrent
> > child registration and resuming the device if that happens is not a trivial
> > change and it may require as many driver modifications as failing the
> > registration of the child right away.
> 
> That's true.  If the driver's author wants to do things that way, he
> can.  For instance, there are several subsystems which will probe for
> new children as part of their resume processing.  So if a child was
> detected just before the system went to sleep and failed to get
> registered, then it will be detected again when the system wakes up and
> now the registration will succeed.
> 
> Here's something else to think about.  We might want to allow some 
> devices to be "power-irrelevant".  That is, the device exists in the 
> kernel only as a representation of some software state, not as a 
> physical device.  It doesn't consume power, it doesn't have any state 
> to lose during a sleep, and its driver doesn't implement suspend or 
> resume methods.  For these sorts of devices, we might allow 
> device_add() to skip calling device_pm_add() altogether.  USB 
> interfaces are a little like this, as are SCSI hosts and MMC hosts.

If such devices serve as logical parents of some "real" ones, we should
at least mark them as "sleeping" during suspend to protect the dpm_active
ordering.

Thanks,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ