Message-ID: <817dd145-fba5-d5e8-26b8-746b9bab4dd9@redhat.com>
Date:   Mon, 10 Sep 2018 10:55:47 +0200
From:   Paolo Bonzini <pbonzini@...hat.com>
To:     Eric Wong <normalperson@...t.net>,
        Al Viro <viro@...IV.linux.org.uk>
Cc:     linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC] pipe: prevent compiler reordering in pipe_poll

On 25/08/2018 00:54, Eric Wong wrote:
> The pipe_poll function does not use locks, and adding an entry
> to the waitqueue is not guaranteed to happen before pipe->nrbufs
> (or other fields) are read, leading to missed wakeups.
> 
> Looking at Ruby CI build logs and backtraces, I've noticed
> occasional instances where processes are stuck in select(2) or
> ppoll(2) with a pipe.
> 
> I don't have access to the systems where this is happening to
> test/reproduce the problem, and haven't been able to reproduce
> it locally on less-powerful hardware, either.  However, it seems
> like a problem based on similar comments in
> fs/eventfd.c::eventfd_poll made by Paolo.

The documentation change can be useful, but if you add a compiler
barrier you should also mention why reordering at the processor level is
okay.  In this case, processor-level reordering is okay because (just
like in fs/eventfd.c) poll_wait acts as an acquire barrier.

*However* I would be surprised if the scenario (even the one in
fs/eventfd.c) can actually happen, and I don't think the compiler
barrier is useful; there's no reason why the compiler should think that
it can hoist the reads above poll_wait.

In fact, there is a big difference between READ_ONCE() and barrier() for
whoever reads the code, which makes the code after your patch worse than
before.  READ_ONCE() means "I know I am accessing this variable outside
a lock".  barrier() means one of two things: 1) "I know what I am doing
can trick the compiler, and I don't want that to happen"; 2) "I am
synchronizing against other things happening on this CPU" such as
interrupts.  In this case you are doing neither.
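
To make the READ_ONCE()/barrier() distinction concrete, here is a minimal
userspace sketch.  The two macros are simplified stand-ins that rely on
GCC/Clang extensions, not the kernel's actual definitions, and struct
demo_pipe, peek_nrbufs(), read_twice() and check_demo() are hypothetical
names used only for illustration.

```c
#include <assert.h>

/* Simplified stand-ins (assumption: GCC or Clang); the kernel's real
 * definitions are more elaborate. */
#define barrier()    __asm__ __volatile__("" ::: "memory")
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in for the one field of interest. */
struct demo_pipe {
	unsigned int nrbufs;
};

/* Annotated lockless read: a single volatile load of p->nrbufs. */
unsigned int peek_nrbufs(struct demo_pipe *p)
{
	return READ_ONCE(p->nrbufs);
}

/*
 * barrier() only constrains the compiler: it may not keep p->nrbufs
 * cached across the barrier, so the second access is a fresh load.
 * No CPU fence instruction is emitted.
 */
unsigned int read_twice(struct demo_pipe *p)
{
	unsigned int before = p->nrbufs;

	barrier();
	return before + p->nrbufs;
}

/* Single-threaded sanity check of the helpers above. */
void check_demo(void)
{
	struct demo_pipe p = { .nrbufs = 3 };

	assert(peek_nrbufs(&p) == 3);
	assert(read_twice(&p) == 6);
}
```

Neither helper says anything about ordering against another CPU; the sketch
only shows what each annotation tells the compiler and the reader.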

Paolo


> Signed-off-by: Eric Wong <normalperson@...t.net>
> Cc: Paolo Bonzini <pbonzini@...hat.com>
> ---
>  fs/pipe.c | 32 ++++++++++++++++++++++++++++++--
>  1 file changed, 30 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/pipe.c b/fs/pipe.c
> index 39d6f431da83..1a904d941cf1 100644
> --- a/fs/pipe.c
> +++ b/fs/pipe.c
> @@ -509,7 +509,7 @@ static long pipe_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
>  	}
>  }
>  
> -/* No kernel lock held - fine */
> +/* No kernel lock held - fine, but a compiler barrier is required */
>  static __poll_t
>  pipe_poll(struct file *filp, poll_table *wait)
>  {
> @@ -519,7 +519,35 @@ pipe_poll(struct file *filp, poll_table *wait)
>  
>  	poll_wait(filp, &pipe->wait, wait);
>  
> -	/* Reading only -- no need for acquiring the semaphore.  */
> +	/*
> +	 * Reading only -- no need for acquiring the semaphore, but
> +	 * we need a compiler barrier to ensure the compiler does
> +	 * not reorder reads to pipe->nrbufs, pipe->writers,
> +	 * pipe->readers, filp->f_version, pipe->w_counter, and
> +	 * pipe->buffers before poll_wait to avoid missing wakeups
> +	 * from compiler reordering.  In other words, we need to
> +	 * prevent the following situation:
> +	 *
> +	 * pipe_poll                          pipe_write
> +	 * -----------------                  ------------
> +	 * nrbufs = pipe->nrbufs (INVALID!)
> +	 *
> +	 *                                    __pipe_lock
> +	 *                                    pipe->nrbufs = ++bufs;
> +	 *                                    __pipe_unlock
> +	 *                                    wake_up_interruptible_sync_poll
> +	 *                                      pipe->wait is empty, no wakeup
> +	 *
> +	 * lock pipe->wait.lock (in poll_wait)
> +	 * __add_wait_queue
> +	 * unlock pipe->wait.lock
> +	 *
> +	 *  // pipe->nrbufs should be read here, NOT above
> +	 *
> +	 * pipe_poll returns 0 (WRONG)
> +	 */
> +	barrier();
> +
>  	nrbufs = pipe->nrbufs;
>  	mask = 0;
>  	if (filp->f_mode & FMODE_READ) {
> 
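
The interleaving drawn in the patch comment can be replayed deterministically
in userspace.  The sketch below is a single-threaded toy model, not kernel
code: struct toy_pipe, toy_write(), toy_poll_wait(), wakeup_missed() and
check_orderings() are hypothetical names, and the model only tracks whether a
waiter was queued when the write happened.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state pipe_poll and pipe_write share. */
struct toy_pipe {
	unsigned int nrbufs;	/* buffers available to read */
	bool waiter_queued;	/* models a non-empty pipe->wait */
	bool wakeup_sent;	/* writer found a waiter to wake */
};

/* Models pipe_write: publish one buffer, then wake any queued waiter. */
static void toy_write(struct toy_pipe *p)
{
	p->nrbufs++;
	if (p->waiter_queued)
		p->wakeup_sent = true;
}

/* Models poll_wait: put the caller on the wait queue. */
static void toy_poll_wait(struct toy_pipe *p)
{
	p->waiter_queued = true;
}

/*
 * Replays one interleaving, with the write landing between the poller's
 * two steps.  Returns true if the poller saw no data and was never woken.
 */
bool wakeup_missed(bool read_before_queueing)
{
	struct toy_pipe p = { 0 };
	unsigned int seen;

	if (read_before_queueing) {
		seen = p.nrbufs;	/* hoisted read: sees 0 */
		toy_write(&p);		/* queue is empty, no wakeup */
		toy_poll_wait(&p);
	} else {
		toy_poll_wait(&p);
		toy_write(&p);		/* waiter is queued: wakeup sent */
		seen = p.nrbufs;	/* sees 1 */
	}
	return seen == 0 && !p.wakeup_sent;
}

/* Only the hoisted-read ordering loses the wakeup in this model. */
void check_orderings(void)
{
	assert(wakeup_missed(true));
	assert(!wakeup_missed(false));
}
```

If the write instead lands after the read in the queue-first ordering, the
poller sees 0 but is already queued, so the wakeup is still delivered.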
