[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <202405031110.6F47982593@keescook>
Date: Fri, 3 May 2024 11:26:32 -0700
From: Kees Cook <keescook@...omium.org>
To: Bui Quang Minh <minhquangbui99@...il.com>,
Al Viro <viro@...iv.linux.org.uk>,
Christian Brauner <brauner@...nel.org>
Cc: syzbot <syzbot+045b454ab35fd82a35fb@...kaller.appspotmail.com>,
axboe@...nel.dk, brauner@...nel.org, io-uring@...r.kernel.org,
jack@...e.cz, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, syzkaller-bugs@...glegroups.com,
viro@...iv.linux.org.uk, Sumit Semwal <sumit.semwal@...aro.org>,
Christian König <christian.koenig@....com>,
linux-media@...r.kernel.org, dri-devel@...ts.freedesktop.org,
linaro-mm-sig@...ts.linaro.org, Laura Abbott <laura@...bott.name>
Subject: get_file() unsafe under epoll (was Re: [syzbot] [fs?] [io-uring?]
general protection fault in __ep_remove)
On Fri, May 03, 2024 at 06:54:22PM +0700, Bui Quang Minh wrote:
> [...]
> Root cause:
> AFAIK, eventpoll (epoll) does not increase the registered file's reference.
> To ensure the safety, when the registered file is deallocated in __fput,
> eventpoll_release is called to unregister the file from epoll. When calling
> poll on epoll, epoll will loop through registered files and call vfs_poll on
> these files. In the file's poll, file is guaranteed to be alive, however, as
> epoll does not increase the registered file's reference, the file may be
> dying
> and it's not safe the get the file for later use. And dma_buf_poll violates
> this. In the dma_buf_poll, it tries to get_file to use in the callback. This
> leads to a race where the dmabuf->file can be fput twice.
>
> Here is the race occurs in the above proof-of-concept
>
> close(dmabuf->file)
> __fput_sync (f_count == 1, last ref)
> f_count-- (f_count == 0 now)
> __fput
> epoll_wait
> vfs_poll(dmabuf->file)
> get_file(dmabuf->file)(f_count == 1)
> eventpoll_release
> dmabuf->file deallocation
> fput(dmabuf->file) (f_count == 1)
> f_count--
> dmabuf->file deallocation
>
> I am not familiar with the dma_buf so I don't know the proper fix for the
> issue. About the rule that don't get the file for later use in poll callback
> of
> file, I wonder if it is there when only select/poll exist or just after
> epoll
> appears.
>
> I hope the analysis helps us to fix the issue.
Thanks for doing this analysis! I suspect at least a start of a fix
would be this:
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 8fe5aa67b167..15e8f74ee0f2 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -267,9 +267,8 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
if (events & EPOLLOUT) {
/* Paired with fput in dma_buf_poll_cb */
- get_file(dmabuf->file);
-
- if (!dma_buf_poll_add_cb(resv, true, dcb))
+ if (!atomic_long_inc_not_zero(&dmabuf->file) &&
+ !dma_buf_poll_add_cb(resv, true, dcb))
/* No callback queued, wake up any other waiters */
dma_buf_poll_cb(NULL, &dcb->cb);
else
@@ -290,9 +289,8 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
if (events & EPOLLIN) {
/* Paired with fput in dma_buf_poll_cb */
- get_file(dmabuf->file);
-
- if (!dma_buf_poll_add_cb(resv, false, dcb))
+ if (!atomic_long_inc_not_zero(&dmabuf->file) &&
+ !dma_buf_poll_add_cb(resv, false, dcb))
/* No callback queued, wake up any other waiters */
dma_buf_poll_cb(NULL, &dcb->cb);
else
But this ends up leaving "active" non-zero, and at close time it runs
into:
BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active);
But the bottom line is that get_file() is unsafe to use in some places,
one of which appears to be in the poll handler. There are maybe some
other fragile places too, like in drivers/gpu/drm/vmwgfx/ttm_object.c:
static bool __must_check get_dma_buf_unless_doomed(struct dma_buf *dmabuf)
{
return atomic_long_inc_not_zero(&dmabuf->file->f_count) != 0L;
}
Which I also note involves a dmabuf...
Due to this issue I've proposed fixing get_file() to detect pathological states:
https://lore.kernel.org/lkml/20240502222252.work.690-kees@kernel.org/
But that has run into some push-back. I'm hoping that seeing this epoll
example will help illustrate what needs fixing a little better.
I think the best current proposal is to just WARN sooner instead of a
full refcount_t implementation:
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 8dfd53b52744..e09107d0a3d6 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1040,7 +1040,8 @@ struct file_handle {
static inline struct file *get_file(struct file *f)
{
- atomic_long_inc(&f->f_count);
+ long prior = atomic_long_fetch_inc_relaxed(&f->f_count);
+ WARN_ONCE(!prior, "struct file::f_count incremented from zero; use-after-free condition present!\n");
return f;
}
What's the right way to deal with the dmabuf situation? (And I suspect
it applies to get_dma_buf_unless_doomed() as well...)
-Kees
--
Kees Cook
Powered by blists - more mailing lists