Message-ID: <20100622230036.GA15420@gvim.org>
Date: Tue, 22 Jun 2010 16:00:36 -0700
From: mark gross <640e9920@...il.com>
To: "Rafael J. Wysocki" <rjw@...k.pl>
Cc: Alan Stern <stern@...land.harvard.edu>,
Florian Mickler <florian@...kler.org>,
Linux-pm mailing list <linux-pm@...ts.linux-foundation.org>,
Matthew Garrett <mjg59@...f.ucam.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Dmitry Torokhov <dmitry.torokhov@...il.com>,
Arve Hjønnevåg <arve@...roid.com>,
Neil Brown <neilb@...e.de>, mark gross <640e9920@...il.com>
Subject: Re: [RFC][PATCH] PM: Avoid losing wakeup events during suspend
On Tue, Jun 22, 2010 at 12:21:53PM +0200, Rafael J. Wysocki wrote:
> On Tuesday, June 22, 2010, Rafael J. Wysocki wrote:
> > On Tuesday, June 22, 2010, Alan Stern wrote:
> > > On Mon, 21 Jun 2010, Florian Mickler wrote:
> > >
> > > > > In the end you would want to have communication in both directions:
> > > > > suspend blockers _and_ callbacks. Polling is bad if done too often.
> > > > > But I think the idea is a good one.
> > > >
> > > > > Actually, I'm not so sure.
> > > >
> > > > 1. you have to roundtrip whereas in the suspend_blocker scheme you have
> > > > active annotations (i.e. no further action needed)
> > >
> > > That's why it's best to use both. The normal case is that programs
> > > activate and deactivate blockers by sending one-way messages to the PM
> > > process. The exceptional case is when the PM process is about to
> > > initiate a suspend; that's when it does the round-trip polling. Since
> > > the only purpose of the polling is to avoid a race, 90% of the time it
> > > will succeed.
> > >
> > > > 2. it may not be possible for a user to determine if a wake-event is
> > > > in-flight. you would have to somehow pass the wake-event-number with
> > > > it, so that the userspace process could ack it properly without
> > > > confusion. Or... I don't know of anything else...
> > > >
> > > > 1. userspace-manager (UM) reads a number (42).
> > > >
> > > > 2. it questions userspace program X: is it ok to suspend?
> > > >
> > > > [please fill in how userspace program X determines to block
> > > > suspend]
> > > >
> > > > 3a. UM's roundtrip ends and it proceeds to write "42" to the
> > > > kernel [suspending]
> > > > 3b. UM's roundtrip ends and it aborts suspend, because a
> > > > (userspace-)suspend-blocker got activated
> > > >
> > > > > I'm not sure how the userspace program could determine that there is a
> > > > wake-event in flight. Perhaps by storing the number of last wake-event.
> > > > But then you need per-wake-event-counters... :|
> > >
> > > Rafael seems to think timeouts will fix this. I'm not so sure.
> > >
> > > > Do you have some thoughts about the wake-event-in-flight detection?
> > >
> > > Not really, except for something like the original wakelock scheme in
> > > which the kernel tells the PM core when an event is over.
> >
> > But the kernel doesn't really know that, so it really can't tell the PM core
> > anything useful. What happens with suspend blockers is that a kernel subsystem
> > cooperates with a user space consumer of the event to get the story straight.
> >
> > However, that will only work if the user space is not buggy and doesn't crash,
> > for example, before releasing the suspend blocker it's holding.
>
> Having reconsidered that I think there's more to it.
>
> Take the PCI subsystem as an example, specifically pcie_pme_handle_request().
> This is the place where wakeup events are started, but it has no idea about
> how they are going to be handled. Thus in the suspend blocker scheme it would
> need to activate a blocker, but it wouldn't be able to release it. So, it
> seems, we would need to associate a suspend blocker with each PCIe device
> that can generate wakeup events and require all drivers of such devices to
> deal with a blocker activated by someone else (PCIe PME driver in this
> particular case). That sounds cumbersome to say the least.
>
> Moreover, even if we do that, it still doesn't solve the entire problem,
> because the event may need to be delivered to user space and processed by it.
> While a driver can check if user space has already read the event, it has
> no way to detect whether or not it has finished processing it. In fact,
> processing an event may involve an interaction with a (human) user and there's
> no general way by which software can figure out when the user considers the
> event as processed.
>
> It looks like user space suspend blockers only help in some special cases
> when the user space processing of a wakeup event is simple enough, but I don't
> think they help in general. For an extreme example, a user may want to wake up
> a system using wake-on-LAN to log into it, do some work and log out, so
> effectively the initial wakeup event has not been processed entirely until the
> user finally logs out of the system. Now, after the system wakeup (resulting
> from the wake-on-LAN signal) we need to give the user some time to log in, but
> if the user doesn't do that within a certain time, it may be reasonable to suspend
> and let the user wake up the system again.
>
> A similar situation takes place when the system is woken up by a lid switch.
> Evidently, the user has opened the laptop lid to do something, but we don't
> even know what the user is going to do, so there's no way we can say when
> the wakeup event is finally processed.
>
> So, even if we can say when the kernel has finished processing the event
> (although that would be complicated in the PCIe case above), I don't think
> it's generally possible to ensure that the entire processing of a wakeup event
> has been completed. This leads to the question whether or not it is worth
> trying to detect the ending of the processing of a wakeup event.
>
> Now, going back to the $subject patch, I didn't really think it would be
> suitable for opportunistic suspend, so let's focus on the "forced" suspend
> instead. It still has the problem that wakeup events occurring while
> /sys/power/state is written to (or even slightly before) should cause the
> system to cancel the suspend, but they generally won't. With the patch
> applied that can be avoided by (a) reading from /sys/power/wakeup_count,
> (b) waiting for a certain time (such that if a wakeup event is not entirely
> processed within that time, it's worth suspending and waking up the
> system again) and (c) writing to /sys/power/wakeup_count right before writing
> to /sys/power/state (where the latter is only done if the former succeeds).
>
This is what I thought was the problem your idea was trying to deal with.
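
For reference, the (a)/(b)/(c) sequence quoted above could look roughly like
this from user space. This is only a sketch under assumptions: the function
name try_suspend, its parameters, the 5 second settle time and the "mem"
target state are made up for illustration, and it assumes that a write to
/sys/power/wakeup_count fails when the count has changed since it was read,
as the RFC describes.

```python
import time
from pathlib import Path


def try_suspend(power_dir="/sys/power", settle_seconds=5.0, state="mem"):
    """Hypothetical helper: returns True if the suspend request was written."""
    power = Path(power_dir)
    count_file = power / "wakeup_count"

    # (a) Remember how many wakeup events the kernel has seen so far.
    count = count_file.read_text().strip()

    # (b) Give in-flight wakeup events some time to be processed.
    time.sleep(settle_seconds)

    # (c) Write the saved count back. Assumption: the kernel rejects the
    # write if more wakeup events arrived in the meantime.
    try:
        count_file.write_text(count)
    except OSError:
        return False  # a wakeup event raced with us, so do not suspend

    # Only reached if the count was accepted: request the suspend.
    (power / "state").write_text(state)
    return True
```

A caller would retry later when this returns False. Taking the directory as
a parameter is only there so the sequence can be exercised against ordinary
files instead of the real sysfs.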
--mgross