lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20181017235532.GC4568@redhat.com>
Date:   Wed, 17 Oct 2018 19:55:32 -0400
From:   Andrea Arcangeli <aarcange@...hat.com>
To:     Christoph Hellwig <hch@...radead.org>
Cc:     syzbot <syzbot+5b1df0420c523b45a953@...kaller.appspotmail.com>,
        bcrl@...ck.org, linux-aio@...ck.org, linux-fsdevel@...r.kernel.org,
        linux-kernel@...r.kernel.org, syzkaller-bugs@...glegroups.com,
        viro@...iv.linux.org.uk, akpm@...ux-foundation.org
Subject: Re: possible deadlock in aio_poll

On Mon, Sep 10, 2018 at 09:53:17AM -0700, Christoph Hellwig wrote:
> On Mon, Sep 10, 2018 at 12:41:05AM -0700, syzbot wrote:
> > =====================================================
> > WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
> > 4.19.0-rc2+ #229 Not tainted
> > -----------------------------------------------------
> > syz-executor2/9399 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
> > 00000000126506e0 (&ctx->fd_wqh){+.+.}, at: spin_lock
> > include/linux/spinlock.h:329 [inline]
> > 00000000126506e0 (&ctx->fd_wqh){+.+.}, at: aio_poll+0x760/0x1420
> > fs/aio.c:1747
> > 
> > and this task is already holding:
> > 000000002bed6bf6 (&(&ctx->ctx_lock)->rlock){..-.}, at: spin_lock_irq
> > include/linux/spinlock.h:354 [inline]
> > 000000002bed6bf6 (&(&ctx->ctx_lock)->rlock){..-.}, at: aio_poll+0x738/0x1420
> > fs/aio.c:1746
> > which would create a new lock dependency:
> >  (&(&ctx->ctx_lock)->rlock){..-.} -> (&ctx->fd_wqh){+.+.}
> 
> ctx->fd_wqh seems to only exist in userfaultfd, which indeed seems
> to do strange open coded waitqueue locking, and seems to fail to disable
> irqs.  Something like this should fix it:
> 
> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> index bfa0ec69f924..356d2b8568c1 100644
> --- a/fs/userfaultfd.c
> +++ b/fs/userfaultfd.c
> @@ -1026,7 +1026,7 @@ static ssize_t userfaultfd_ctx_read(struct userfaultfd_ctx *ctx, int no_wait,
>  	struct userfaultfd_ctx *fork_nctx = NULL;
>  
>  	/* always take the fd_wqh lock before the fault_pending_wqh lock */
> -	spin_lock(&ctx->fd_wqh.lock);
> +	spin_lock_irq(&ctx->fd_wqh.lock);
>  	__add_wait_queue(&ctx->fd_wqh, &wait);
>  	for (;;) {
>  		set_current_state(TASK_INTERRUPTIBLE);
> @@ -1112,13 +1112,13 @@ static ssize_t userfaultfd_ctx_read(struct userfaultfd_ctx *ctx, int no_wait,
>  			ret = -EAGAIN;
>  			break;
>  		}
> -		spin_unlock(&ctx->fd_wqh.lock);
> +		spin_unlock_irq(&ctx->fd_wqh.lock);
>  		schedule();
> -		spin_lock(&ctx->fd_wqh.lock);
> +		spin_lock_irq(&ctx->fd_wqh.lock);
>  	}
>  	__remove_wait_queue(&ctx->fd_wqh, &wait);
>  	__set_current_state(TASK_RUNNING);
> -	spin_unlock(&ctx->fd_wqh.lock);
> +	spin_unlock_irq(&ctx->fd_wqh.lock);
>  
>  	if (!ret && msg->event == UFFD_EVENT_FORK) {
>  		ret = resolve_userfault_fork(ctx, fork_nctx, msg);
> 

Reviewed-by: Andrea Arcangeli <aarcange@...hat.com>

This is lock inversion with userfaultfd_poll that takes the fq_wqh
after the irqsafe aio lock. And the aio lock can be taken from softirq
(so potentially for interrupts) leading to a potential lock inversion
deadlock.

I suggest to add a comment about the above in the code before the
first spin_lock_irq to explain why it needs to be _irq or it's not
obvious.

c430d1e848ff1240d126e79780f3c26208b8aed9 was just a false positive
instead.

I didn't comment on c430d1e848ff1240d126e79780f3c26208b8aed9 because I
was too busy with the speculative execution issues at the time and it
was just fine to drop the microoptimization, but while at it can we
look in how to add a spin_acquire or find a way to teach lockdep in
another way, so it's happy even if we restore the microoptimization?

If we do that, in addition we should also initialize the
ctx->fault_wqh spinlock to locked in the same patch (a spin_lock
during uffd ctx creation will do) to be sure nobody takes it as
further robustness feature against future modification (it gets more
self documenting too that it's not supposed to be taken and the
fault_pending_wq.lock has to be taken instead).

Thanks,
Andrea

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ