[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20161029191753.GU19539@ZenIV.linux.org.uk>
Date: Sat, 29 Oct 2016 20:17:53 +0100
From: Al Viro <viro@...IV.linux.org.uk>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Christoph Hellwig <hch@....de>, Jan Kara <jack@...e.cz>,
Dmitry Monakhov <dmonakhov@...nvz.org>,
Jeff Moyer <jmoyer@...hat.com>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
linux-aio@...ck.org,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
stable <stable@...r.kernel.org>
Subject: Re: [PATCH] aio: fix a user triggered use after free (and fix freeze
protection of aio writes)
On Sat, Oct 29, 2016 at 12:07:07PM -0700, Linus Torvalds wrote:
> On Sat, Oct 29, 2016 at 11:52 AM, Al Viro <viro@...iv.linux.org.uk> wrote:
> >
> > And that call can happen as soon as we return from __blockdev_direct_IO()
> > (even earlier, actually). As soon as that happens, the reference to
> > struct file we'd acquired in io_submit_one() is dropped. If descriptor
> > table had been shared, another thread might have already closed that sucker,
> > and fput() from aio_complete() would free struct file.
>
> But that's the point. We don't *do* anything like that any more. We
> now always do the final access from aio_complete(). So it doesn't
> matter if that is called asynchronously (very early) or not.
>
> That's the whole point of the patch. Exactly to do everything either
> *before* we even submit it (at which point no completion can happen),
> or doing it in aio_complete() which is guaranteed to be after the
> submission. No races, no use-after-free.
>
> What am I missing?
if (ret > 0)
ret = __generic_file_write_iter(iocb, from);
inode_unlock(inode);
in generic_file_write_iter() (inode might be gone here)
written = mapping->a_ops->direct_IO(iocb, &data);
/*
* Finally, try again to invalidate clean pages which might have been
* cached by non-direct readahead, or faulted in by get_user_pages()
* if the source of the write was an mmap'ed region of the file
* we're writing. Either one is a pretty crazy thing to do,
* so we don't support it 100%. If this invalidation
* fails, tough, the write still worked...
*/
if (mapping->nrpages) {
invalidate_inode_pages2_range(mapping,
pos >> PAGE_SHIFT, end);
in generic_file_direct_write() (mapping points into inode, which might be
freed)
ret = __blockdev_direct_IO(iocb, inode, target->bt_bdev, &data,
xfs_get_blocks_direct, NULL, NULL, 0);
if (ret >= 0) {
iocb->ki_pos += ret;
iov_iter_advance(to, ret);
}
xfs_rw_iunlock(ip, XFS_IOLOCK_SHARED);
in xfs_file_dio_aio_read() (ip points to xfs-private part of inode).
And that - just from a quick look. We *do* access inode between the call
of __blockdev_direct_IO() and return from ->write_iter(). What's more,
as soon as aio_complete() has happened, what's to stop close from another
thread + umount + rmmod unmapping that ->write_iter() completely?
AFAICS, the possibility of dropping the last reference to struct file
before ->write_iter() has returned is fundamentally broken. I might be
missing something subtle here, but...
Powered by blists - more mailing lists