[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190304023618.GS2217@ZenIV.linux.org.uk>
Date: Mon, 4 Mar 2019 02:36:18 +0000
From: Al Viro <viro@...iv.linux.org.uk>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Eric Dumazet <eric.dumazet@...il.com>,
David Miller <davem@...emloft.net>,
Jason Baron <jbaron@...mai.com>, kgraul@...ux.ibm.com,
ktkhai@...tuozzo.com, kyeongdon.kim@....com,
Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
Netdev <netdev@...r.kernel.org>, pabeni@...hat.com,
syzkaller-bugs@...glegroups.com, xiyou.wangcong@...il.com,
Christoph Hellwig <hch@....de>
Subject: Re: [PATCH] aio: prevent the final fput() in the middle of
vfs_poll() (Re: KASAN: use-after-free Read in unix_dgram_poll)
On Sun, Mar 03, 2019 at 02:23:33PM -0800, Linus Torvalds wrote:
OK, having dug through the archives, the reasons were not strong.
So that part is OK...
> @@ -1060,6 +1071,8 @@ static inline void iocb_put(struct aio_kiocb *iocb)
> {
> if (refcount_read(&iocb->ki_refcnt) == 0 ||
> refcount_dec_and_test(&iocb->ki_refcnt)) {
> + if (iocb->ki_filp)
> + fput(iocb->ki_filp);
> percpu_ref_put(&iocb->ki_ctx->reqs);
> kmem_cache_free(kiocb_cachep, iocb);
> }
That reminds me - refcount_t here is rather ridiculous; what we
have is
* everything other than aio_poll: ki_refcnt is 0 all along
* aio_poll: originally 0, then set to 2, then two iocb_put()
are done (either both synchronous to aio_poll(), or one sync and one
async).
That's not a refcount at all. It's a flag, set only for aio_poll ones.
And that test in iocb_put() is "if flag is set, clear it and bugger off".
What's worse, AFAICS the callers in aio_poll() are buggered (see below)
> static inline void aio_poll_complete(struct aio_kiocb *iocb, __poll_t mask)
> {
> - struct file *file = iocb->poll.file;
> -
> aio_complete(iocb, mangle_poll(mask), 0);
> - fput(file);
> }
No reasons to keep that function at all now...
> - if (unlikely(apt.error)) {
> - fput(req->file);
> + if (unlikely(apt.error))
> return apt.error;
> - }
>
> if (mask)
> aio_poll_complete(aiocb, mask);
Looking at that thing... How does it manage to avoid leaks
when we try to use it on e.g. /dev/tty, which has
poll_wait(file, &tty->read_wait, wait);
poll_wait(file, &tty->write_wait, wait);
in n_tty_poll()?
AFAICS, we'll succeed adding it to the first queue, then have
aio_poll_queue_proc() fail and set apt.error to -EINVAL.
Suppose we are looking for EPOLLIN and there's nothing ready
to read. We'll go
mask = vfs_poll(req->file, &apt.pt) & req->events;
mask is 0.
if (unlikely(!req->head)) {
nope - it's &tty->read_wait, not NULL
/* we did not manage to set up a waitqueue, done */
goto out;
}
spin_lock_irq(&ctx->ctx_lock);
spin_lock(&req->head->lock);
if (req->woken) {
nope - no wakeups so far
/* wake_up context handles the rest */
mask = 0;
apt.error = 0;
} else if (mask || apt.error) {
apt.error is non-zero
/* if we get an error or a mask we are done */
WARN_ON_ONCE(list_empty(&req->wait.entry));
list_del_init(&req->wait.entry);
OK, removed from queue
} else {
/* actually waiting for an event */
list_add_tail(&aiocb->ki_list, &ctx->active_reqs);
aiocb->ki_cancel = aio_poll_cancel;
}
spin_unlock(&req->head->lock);
spin_unlock_irq(&ctx->ctx_lock);
out:
if (unlikely(apt.error)) {
fput(req->file);
return apt.error;
... and we return -EINVAL to __io_submit_one(), where we hit
/*
* If ret is 0, we'd either done aio_complete() ourselves or have
* arranged for that to be done asynchronously. Anything non-zero
* means that we need to destroy req ourselves.
*/
if (ret)
goto out_put_req;
return 0;
out_put_req:
if (req->ki_eventfd)
eventfd_ctx_put(req->ki_eventfd);
iocb_put(req);
out_put_reqs_available:
put_reqs_available(ctx, 1);
return ret;
and all knowledge of req is lost. But we'd done only *one* iocb_put() in
that case, and ->ki_refcnt had been set to 2 by aio_poll(). How could it
avoid a leak? The same goes for "->poll() returns something without
bothering to call poll_wait()" case, actually...
IOW, I would rather have aio_poll() (along with your fput-a-bit-later change)
do this -
out:
if (mask && !apt.error)
aio_complete(aiocb, mangle_poll(mask), 0);
iocb_put(aiocb);
return apt.error;
Comments?
Powered by blists - more mailing lists