lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20131119195851.26bee8e6@notabene.brown>
Date:	Tue, 19 Nov 2013 19:58:51 +1100
From:	NeilBrown <neilb@...e.de>
To:	Peter Zijlstra <peterz@...radead.org>,
	Jonathan Corbet <corbet@....net>, Dean Nelson <dcn@....com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Ingo Molnar <mingo@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	lkml <linux-kernel@...r.kernel.org>,
	"Dr. H. Nikolaus Schaller" <hns@...delico.com>,
	Marek Belisko <marek@...delico.com>,
	Mark Brown <broonie@...nel.org>
Subject: Re: [PATCH/RFC] wait_for_completion_timeout() considered harmful.

On Tue, 19 Nov 2013 09:25:48 +0100 Peter Zijlstra <peterz@...radead.org>
wrote:

> On Tue, Nov 19, 2013 at 10:44:38AM +1100, NeilBrown wrote:
> > We have loops that have
> >     timeout = schedule_timeout(timeout)
> > in the middle and if we change the semantics of schedule_timeout() to round
> > up, those loops could wait quite a bit longer than expected.
> 
> Depends on what you expect; most of these functions have documentation
> that says they will sleep at least timeout amount of time.
> 
> schedule_timeout()'s version looks like:
> 
>  * Make the current task sleep until @timeout jiffies have
>  * elapsed.
> 
> Clearly it doesn't do that currently, so adding 1 will actually make it
> do what it says on the tin.

Hmm.. maybe you are right, and I now see that my argument about loops isn't
nearly as convincing as it seemed at the time.

However there is still the risk that someone wrote code depending on the
current behaviour rather than the current documentation.  msleep() is one
such example (it has a "+1") but that is easily found and fixed.

I went hunting for code that uses a timeout of '1' which would be affected
the most (I found no use cases for '0').  Two interesting examples:

drivers/video/via/via-core.c: 

	/*
	 * Now we just wait until the interrupt handler says
	 * we're done.  Except that, actually, we need to wait a little
	 * longer: the interrupts seem to jump the gun a little and we
	 * get corrupted frames sometimes.
	 */
	wait_for_completion_timeout(&viafb_dma_completion, 1);
	msleep(1);

Maybe the 'msleep(1)' was only needed because of this 'bug' - Jon might know.
In that case the comment and  msleep could get removed.

drivers/misc/sgi-xp/xpc_channel.c:

/*
 * Wait for a message entry to become available for the specified channel,
 * but don't wait any longer than 1 jiffy.
 */

.....
	atomic_inc(&ch->n_on_msg_allocate_wq);
	ret = interruptible_sleep_on_timeout(&ch->msg_allocate_wq, 1);
	atomic_dec(&ch->n_on_msg_allocate_wq);

If interruptible_sleep_on_timeout were affected by the proposed change, then
the code would no longer match the comment.  That code is five years old ...
I wonder Dean remembers if it is important ... or is even still at SGI.

Also
     git grep '_timeout[_a-z]*(.*msecs_to_jiffies(1)'

finds 10 matches which would be significantly affected.  I can't easily say
if it would be for the better or not.

Another random piece of code that maybe could get simplified:
drivers/iio/light/tsl2563.c:
	/*
	 * TODO: Make sure that we wait at least required delay but why we
	 * have to extend it one tick more?
	 */
	schedule_timeout_interruptible(msecs_to_jiffies(delay) + 2);


NeilBrown

Download attachment "signature.asc" of type "application/pgp-signature" (829 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ