lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 01 Mar 2007 16:49:34 +0200
From:	Eli Cohen <eli@...lanox.co.il>
To:	Dmitry Adamushko <dmitry.adamushko@...il.com>
Cc:	Linux Kernel <linux-kernel@...r.kernel.org>
Subject: Re: wait_for_completion_timeout problem ???

On Thu, 2007-03-01 at 15:28 +0100, Dmitry Adamushko wrote:
> Hi,
> 
> >
> > I have a problem with using this function. I am referring to
> > drivers/infiniband/hw/mthca/mthca_cmd.c line 394. For convenience I
> > quote from this code:
> >
> >         init_completion(&context->done);
> >
> >         err = mthca_cmd_post(dev, in_param,
> >                              out_param ? *out_param : 0,
> >                              in_modifier, op_modifier,
> >                              op, context->token, 1);
> >         if (err)
> >                 goto out;
> >
> >         if (!wait_for_completion_timeout(&context->done, timeout)) {
> >                 err = -EBUSY;
> >                 goto out;
> >         }
> >
> > timeout is 10 * HZ. Sometimes this function returns 0 which signifies
> > timeout. However I can see that the interrupt handler called
> > complete(&context->done)
> > around 200 usec after calling wait_for_completion_timout(). When the
> > function returns I can see that context->done.done equals 1 which
> > confirms that complete was indeed called.
> 
> The sequence of events can be as follows:
> 
> a caller gets blocked in wait_for_completion_timeout() on
> schedule_timeout() which literally means:
> 
>    i ) will be unblocked (scheduled back) after "timeout" has expired;
> 
>    ii) will be unblocked by someone calling wake_up_*(&x->wait);
> 
> (wait_for_completion_timeout() inserted our caller into "x->wait" wait queue)
> 
> in both cases schedule_timeout() will do
> 
> ...
>         schedule(); <------------------ here we get CPU back
>         del_singleshot_timer_sync(&timer);
>         timeout = expire - jiffies;
> 
>  out:
>         return timeout < 0 ? 0 : timeout;
> 
> "expire" is when (+latency) we were expected to be woken up by a
> timeer -> timeout.
> 
> Now the point is that our waiter could have been "waken up" (become
> "ready" from the point of view of the scheduler) earlier but it was
> just "scheduled" (got CPU back) later than "expire" so that's why the
> return value is 0 (timeout < 0 ==> return 0).
> 
> IOW, schedule_timeout() indicates whether a process has been scheduled
> back /earlier than timeout/ (so return value >0) or /later/ (0).
> 
> It doesn't indicate why the process has been woked up ( i.e. (i) or
> (ii) above ).
> 
> In you case it became /runnable/ because of complete() but it got
> scheuled later than /timeout/.
> 
> And wait_for_completion_timeout() takes it as a /timeout condition/.
> 
> So either all the users of wait_for_completion_timeout() should
> additionally check for x->done after they got scheduled
There is only one process waiting on the completion object.
> 
> or
> 
> wait_for_completion_timeout() should return something different that
> encodes the fact /event happened/ and not just /event happened _and_ a
> caller has got scheduled back earlier than timeout.
That possibility crossed my mind but I ruled it out since the the
timeout is 10 sec while completion arrived less then 4 msec after
calling wait_for_completion_timeout(). So it appears to me unlikely that
it will the task will be in the run list but will not get CPU time for
nearly 10 seconds. I will give it a try though. I will increase timeout
to 20 sec.

> 
> 
> 
> >
> > Thanks
> > Eli
> 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ