[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b647ffbd0703010628u253a62cfi56cec22a1ee79d5f@mail.gmail.com>
Date: Thu, 1 Mar 2007 15:28:40 +0100
From: "Dmitry Adamushko" <dmitry.adamushko@...il.com>
To: eli@...lanox.co.il
Cc: "Linux Kernel" <linux-kernel@...r.kernel.org>
Subject: Re: wait_for_completion_timeout problem ???
Hi,
>
> I have a problem with using this function. I am referring to
> drivers/infiniband/hw/mthca/mthca_cmd.c line 394. For convenience I
> quote from this code:
>
> init_completion(&context->done);
>
> err = mthca_cmd_post(dev, in_param,
> out_param ? *out_param : 0,
> in_modifier, op_modifier,
> op, context->token, 1);
> if (err)
> goto out;
>
> if (!wait_for_completion_timeout(&context->done, timeout)) {
> err = -EBUSY;
> goto out;
> }
>
> timeout is 10 * HZ. Sometimes this function returns 0 which signifies
> timeout. However I can see that the interrupt handler called
> complete(&context->done)
> around 200 usec after calling wait_for_completion_timout(). When the
> function returns I can see that context->done.done equals 1 which
> confirms that complete was indeed called.
The sequence of events can be as follows:
a caller gets blocked in wait_for_completion_timeout() on
schedule_timeout() which literally means:
i ) will be unblocked (scheduled back) after "timeout" has expired;
ii) will be unblocked by someone calling wake_up_*(&x->wait);
(wait_for_completion_timeout() inserted our caller into "x->wait" wait queue)
in both cases schedule_timeout() will do
...
schedule(); <------------------ here we get CPU back
del_singleshot_timer_sync(&timer);
timeout = expire - jiffies;
out:
return timeout < 0 ? 0 : timeout;
"expire" is when (+latency) we were expected to be woken up by a
timeer -> timeout.
Now the point is that our waiter could have been "waken up" (become
"ready" from the point of view of the scheduler) earlier but it was
just "scheduled" (got CPU back) later than "expire" so that's why the
return value is 0 (timeout < 0 ==> return 0).
IOW, schedule_timeout() indicates whether a process has been scheduled
back /earlier than timeout/ (so return value >0) or /later/ (0).
It doesn't indicate why the process has been woked up ( i.e. (i) or
(ii) above ).
In you case it became /runnable/ because of complete() but it got
scheuled later than /timeout/.
And wait_for_completion_timeout() takes it as a /timeout condition/.
So either all the users of wait_for_completion_timeout() should
additionally check for x->done after they got scheduled
or
wait_for_completion_timeout() should return something different that
encodes the fact /event happened/ and not just /event happened _and_ a
caller has got scheduled back earlier than timeout.
>
> Thanks
> Eli
--
Best regards,
Dmitry Adamushko
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists