Message-ID: <Pine.LNX.4.44L0.0912071610520.15701-100000@iolanthe.rowland.org>
Date: Mon, 7 Dec 2009 16:32:10 -0500 (EST)
From: Alan Stern <stern@...land.harvard.edu>
To: Linus Torvalds <torvalds@...ux-foundation.org>
cc: Zhang Rui <rui.zhang@...el.com>, "Rafael J. Wysocki" <rjw@...k.pl>,
LKML <linux-kernel@...r.kernel.org>,
ACPI Devel Maling List <linux-acpi@...r.kernel.org>,
pm list <linux-pm@...ts.linux-foundation.org>
Subject: Re: [GIT PULL] PM updates for 2.6.33
On Mon, 7 Dec 2009, Linus Torvalds wrote:
> > The consequence is that there's no way to hand off an entire subtree to
> > an async thread. And as a result, your single-pass algorithm runs into
> > the kind of "stall" problem I described before.
>
> No, look again. There's no stall in the thing, because all it really
> depends on (for the suspend path) is that it sees all children before
> the parent (because the child will do a "down_read()" on the parent node
> and that should not stall), and for the resume path it depends on seeing
> the parent node before any children (because the parent node does that
> "down_write()" on its own node).
>
> Everything else is _entirely_ asynchronous, including all the other locks
> it takes. So there are no stalls (except, of course, if we then hit limits
> on numbers of outstanding async work and refuse to create too many
> outstanding async things, but that's a separate issue, and intentional, of
> course).
It only seems that way because you didn't take into account devices
that suspend synchronously but whose children suspend asynchronously.
A synchronous suspend routine for a device with async child suspends
would have to look just like your usb_node_suspend():
	suspend_one_node(dev)
	{
		/* Wait until the children are suspended */
		down_write(dev->lock);
		Suspend dev
		up_write(dev->lock);

		/* Allow the parent to suspend */
		up_read(dev->parent->lock);
	}
So now suppose we've got two USB host controllers, A and B. They are
PCI devices, so they suspend synchronously. Each has a root hub child
(P and Q respectively) which is a USB device and therefore suspends
asynchronously. dpm_list contains: A, P, B, Q, and the suspend pass
walks it in reverse. (In fact A doesn't enter into this discussion;
you can ignore it.)
In your one-pass algorithm, we start with usb_node_suspend(Q). It does
down_read(B->lock) and starts an async task for Q. Then we move on to
suspend_one_node(B). It does down_write(B->lock) and blocks until the
async task finishes; then it suspends B. Finally we move on to
usb_node_suspend(P), which does down_read(A->lock) and starts an async
task for P.
The upshot is that P is stuck waiting for Q to suspend, even though it
should have been able to suspend in parallel. This is simply because P
precedes B in the list, and B is synchronous and must wait for Q to
finish.
With my two-pass algorithm, we start with Q. The first loop does
down_read(B->lock) and starts an async task for Q. We move on to B and
do down_read(B->parent->lock), nothing more. Then we move on to P,
with down_read(A->lock) and start an async task for P. Finally we do
down_read(A->parent->lock). Notice that now there are two async tasks,
for P and Q, running in parallel.
The second pass waits for Q to finish before suspending B
synchronously, and waits for P to finish before suspending A
synchronously. This is unavoidable. The point is that it allows P and
Q to suspend at the same time, not one after the other as in the
one-pass scheme.
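
To make the difference concrete, here is a rough timing model of the
two schemes. It is only a sketch, not kernel code: struct sim_dev,
simulate_suspend() and example_total() are hypothetical names, the
costs are made-up time units, and it assumes that launching an async
task takes no time and that there is no limit on outstanding tasks.
The only thing that differs between the schemes is when an async task
gets launched.

```c
#include <assert.h>
#include <stddef.h>

#define SIM_MAX_DEVS 16

/* Hypothetical record for the timing model; not a kernel structure. */
struct sim_dev {
    int parent;   /* index of the parent in the array, or -1 */
    int is_async; /* nonzero if the suspend runs in an async task */
    int cost;     /* made-up time units the suspend itself takes */
};

static int max_int(int a, int b)
{
    return a > b ? a : b;
}

/*
 * devs[] is listed in suspend order (dpm_list reversed), so every child
 * comes before its parent.  finish[i] receives the time at which device
 * i has finished suspending; the return value is the time at which the
 * last device finishes.
 */
int simulate_suspend(const struct sim_dev *devs, size_t n, int two_pass,
                     int *finish)
{
    int children_done[SIM_MAX_DEVS] = { 0 };
    int main_clock = 0; /* time on the thread walking the list */
    int total = 0;
    size_t i;

    assert(n <= SIM_MAX_DEVS);
    for (i = 0; i < n; i++) {
        int parent = devs[i].parent;

        if (devs[i].is_async) {
            /* Two-pass: every async task is launched by the first loop,
             * before the walker blocks.  One-pass: the task is launched
             * only when the walker reaches the device. */
            int start = two_pass ? 0 : main_clock;

            finish[i] = max_int(start, children_done[i]) + devs[i].cost;
        } else {
            /* down_write() on its own lock: the walker blocks until
             * all children are done, then suspends the device. */
            main_clock = max_int(main_clock, children_done[i]) + devs[i].cost;
            finish[i] = main_clock;
        }
        if (parent >= 0) {
            assert((size_t)parent > i && (size_t)parent < n);
            children_done[parent] = max_int(children_done[parent], finish[i]);
        }
        total = max_int(total, finish[i]);
    }
    return total;
}

/* The Q, B, P, A example from this message, with made-up costs. */
int example_total(int two_pass, int *p_finish)
{
    const struct sim_dev devs[] = {
        { 1, 1, 3 },  /* 0: Q, async, child of B */
        { -1, 0, 1 }, /* 1: B, sync */
        { 3, 1, 3 },  /* 2: P, async, child of A */
        { -1, 0, 1 }, /* 3: A, sync */
    };
    int finish[4];
    int total = simulate_suspend(devs, 4, two_pass, finish);

    *p_finish = finish[2];
    return total;
}
```

With these particular costs, the one-pass model has P finishing at
time 7 and the whole walk at time 8, whereas the two-pass model has P
finishing at time 3 and the whole walk at time 5. The numbers mean
nothing in themselves; they just show P no longer queued behind Q.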
Alan Stern