[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240504-irrsinn-sinnlich-83cf0890c7dc@brauner>
Date: Sat, 4 May 2024 11:59:09 +0200
From: Christian Brauner <brauner@...nel.org>
To: Kees Cook <keescook@...omium.org>
Cc: Jens Axboe <axboe@...nel.dk>,
Bui Quang Minh <minhquangbui99@...il.com>, Al Viro <viro@...iv.linux.org.uk>,
syzbot <syzbot+045b454ab35fd82a35fb@...kaller.appspotmail.com>, io-uring@...r.kernel.org, jack@...e.cz,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org, syzkaller-bugs@...glegroups.com,
Sumit Semwal <sumit.semwal@...aro.org>, Christian König <christian.koenig@....com>,
linux-media@...r.kernel.org, dri-devel@...ts.freedesktop.org, linaro-mm-sig@...ts.linaro.org,
Laura Abbott <laura@...bott.name>
Subject: Re: get_file() unsafe under epoll (was Re: [syzbot] [fs?]
[io-uring?] general protection fault in __ep_remove)
On Fri, May 03, 2024 at 12:59:52PM -0700, Kees Cook wrote:
> On Fri, May 03, 2024 at 01:35:09PM -0600, Jens Axboe wrote:
> > On 5/3/24 1:22 PM, Kees Cook wrote:
> > > On Fri, May 03, 2024 at 12:49:11PM -0600, Jens Axboe wrote:
> > >> On 5/3/24 12:26 PM, Kees Cook wrote:
> > >>> Thanks for doing this analysis! I suspect at least a start of a fix
> > >>> would be this:
> > >>>
> > >>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> > >>> index 8fe5aa67b167..15e8f74ee0f2 100644
> > >>> --- a/drivers/dma-buf/dma-buf.c
> > >>> +++ b/drivers/dma-buf/dma-buf.c
> > >>> @@ -267,9 +267,8 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
> > >>>
> > >>> if (events & EPOLLOUT) {
> > >>> /* Paired with fput in dma_buf_poll_cb */
> > >>> - get_file(dmabuf->file);
> > >>> -
> > >>> - if (!dma_buf_poll_add_cb(resv, true, dcb))
> > >>> + if (!atomic_long_inc_not_zero(&dmabuf->file) &&
> > >>> + !dma_buf_poll_add_cb(resv, true, dcb))
> > >>> /* No callback queued, wake up any other waiters */
> > >>
> > >> Don't think this is sane at all. I'm assuming you meant:
> > >>
> > >> atomic_long_inc_not_zero(&dmabuf->file->f_count);
> > >
> > > Oops, yes, sorry. I was typed from memory instead of copy/paste.
> >
> > Figured :-)
> >
> > >> but won't fly as you're not under RCU in the first place. And what
> > >> protects it from being long gone before you attempt this anyway? This is
> > >> sane way to attempt to fix it, it's completely opposite of what sane ref
> > >> handling should look like.
> > >>
> > >> Not sure what the best fix is here, seems like dma-buf should hold an
> > >> actual reference to the file upfront rather than just stash a pointer
> > >> and then later _hope_ that it can just grab a reference. That seems
> > >> pretty horrible, and the real source of the issue.
> > >
> > > AFAICT, epoll just doesn't hold any references at all. It depends,
> > > I think, on eventpoll_release() (really eventpoll_release_file())
> > > synchronizing with epoll_wait() (but I don't see how this happens, and
> > > the race seems to be against ep_item_poll() ...?)
> > >
> > > I'm really confused about how eventpoll manages the lifetime of polled
> > > fds.
> >
> > epoll doesn't hold any references, and it's got some ugly callback to
> > deal with that. It's not ideal, nor pretty, but that's how it currently
> > works. See eventpoll_release() and how it's called. This means that
> > epoll itself is supposedly safe from the file going away, even though it
> > doesn't hold a reference to it.
>
> Right -- what remains unclear to me is how struct file lifetime is
> expected to work in the struct file_operations::poll callbacks. Because
> using get_file() there looks clearly unsafe...
If you're in ->poll() you're holding the epoll mutex and
eventpoll_release_file() needs to acquire ep->mtx as well. So if you're
in ->poll() then you know that eventpoll_release_file() can't progress
and therefore eventpoll_release() can't make progress. So
f_op->release() won't be able to be called as it happens after
eventpoll_release() in __fput(). But f_count being able to go to zero is
expected.
Powered by blists - more mailing lists