Message-ID: <Za6SD48Zf0CXriLm@casper.infradead.org>
Date: Mon, 22 Jan 2024 16:04:31 +0000
From: Matthew Wilcox <willy@...radead.org>
To: "zhangpeng (AS)" <zhangpeng362@...wei.com>
Cc: linux-mm@...ck.org, linux-fsdevel@...r.kernel.org,
netdev@...r.kernel.org, akpm@...ux-foundation.org,
edumazet@...gle.com, davem@...emloft.net, dsahern@...nel.org,
kuba@...nel.org, pabeni@...hat.com, arjunroy@...gle.com,
wangkefeng.wang@...wei.com
Subject: SECURITY PROBLEM: Any user can crash the kernel with TCP ZEROCOPY
I'm disappointed to have no reaction from netdev so far. Let's see if a
more exciting subject line evinces some interest.
On Sat, Jan 20, 2024 at 02:46:49PM +0800, zhangpeng (AS) wrote:
> On 2024/1/19 21:40, Matthew Wilcox wrote:
>
> > On Fri, Jan 19, 2024 at 05:20:24PM +0800, Peng Zhang wrote:
> > > Recently, we discovered a syzkaller issue that triggers
> > > VM_BUG_ON_FOLIO in filemap_unaccount_folio() with CONFIG_DEBUG_VM
> > > enabled, or bad page without CONFIG_DEBUG_VM.
> > >
> > > The specific scenarios are as follows:
> > > (1) mmap: Use socket fd to create a TCP VMA.
> > > (2) open(O_CREAT) + fallocate + sendfile: Read the ext4 file and create
> > > the page cache. The mapping of the page cache is ext4 inode->i_mapping.
> > > Send the ext4 page cache to the socket fd through sendfile.
> > > (3) getsockopt TCP_ZEROCOPY_RECEIVE: Receive the ext4 page cache and use
> > > vm_insert_pages() to insert the ext4 page cache to the TCP VMA. In this
> > > case, mapcount changes from -1 to 0. The page cache mapping is ext4
> > > inode->i_mapping, but the VMA of the page cache is the TCP VMA and
> > > folio->mapping->i_mmap is empty.
> > I think this is the bug. We shouldn't be incrementing the mapcount
> > in this scenario. Assuming we want to support doing this at all and
> > we don't want to include something like ...
> >
> >         if (folio->mapping) {
> >                 if (folio->mapping != vma->vm_file->f_mapping)
> >                         return -EINVAL;
> >                 if (page_to_pgoff(page) != linear_page_index(vma, address))
> >                         return -EINVAL;
> >         }
> >
> > But maybe there's a reason for networking needing to map pages in this
> > scenario?
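
For illustration only, the check quoted above can be modelled in user
space. This is a rough sketch, not kernel code: toy_mapping, toy_folio,
toy_vma and toy_check_insert() are hypothetical stand-ins for the real
structures, and 4 KiB pages are assumed for the index arithmetic.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for struct address_space. */
struct toy_mapping {
	int id;
};

/* Hypothetical stand-in for a folio/page. */
struct toy_folio {
	struct toy_mapping *mapping;	/* NULL if not in any page cache */
	unsigned long index;		/* page offset within the mapping */
};

/* Hypothetical stand-in for a VMA. */
struct toy_vma {
	struct toy_mapping *f_mapping;	/* models vma->vm_file->f_mapping */
	unsigned long vm_start;
	unsigned long vm_pgoff;
};

/* Models linear_page_index(): the file page offset backing 'address'. */
static unsigned long toy_linear_page_index(const struct toy_vma *vma,
					   unsigned long address)
{
	return ((address - vma->vm_start) >> TOY_PAGE_SHIFT) + vma->vm_pgoff;
}

/*
 * Refuse to insert a page-cache folio into a VMA unless the folio
 * belongs to the VMA's own mapping and sits at the matching offset.
 * Folios with no mapping are let through unchanged.
 */
static int toy_check_insert(const struct toy_vma *vma, unsigned long address,
			    const struct toy_folio *folio)
{
	assert(address >= vma->vm_start);

	if (folio->mapping) {
		if (folio->mapping != vma->f_mapping)
			return -EINVAL;
		if (folio->index != toy_linear_page_index(vma, address))
			return -EINVAL;
	}
	return 0;
}
```

In this model, an ext4 page-cache folio offered to a VMA backed by the
socket's mapping is rejected, while a folio with no mapping is accepted.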
>
> Agreed, and I'm also curious why.
>
> > > (4) open(O_TRUNC): Deletes the ext4 page cache. In this case, the pages
> > > are still in the xarray tree of mapping->i_pages and should also be
> > > deleted. However, folio->mapping->i_mmap is empty. Therefore,
> > > truncate_cleanup_folio()->unmap_mapping_folio() can't find and unmap
> > > them via the i_mmap tree. In filemap_unaccount_folio(), the mapcount
> > > of the folio is 0, causing BUG ON.
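
The accounting in steps (3) and (4) can be sketched as a toy user-space
model. Everything here is a simplification under stated assumptions:
model_folio, model_vma and the model_*() helpers are hypothetical names,
mapcount starts at -1 as described in the report, and "unmapping" only
visits VMAs that are linked into i_mmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio; -1 means "not mapped anywhere". */
struct model_folio {
	int mapcount;
};

/* Hypothetical stand-in for a VMA. */
struct model_vma {
	bool in_i_mmap;		/* linked into folio->mapping->i_mmap? */
	bool maps_folio;	/* currently maps the folio? */
};

/* Models step (3): inserting the folio into a VMA bumps the mapcount. */
static void model_insert_page(struct model_folio *folio, struct model_vma *vma)
{
	assert(folio->mapcount >= -1);
	folio->mapcount++;
	vma->maps_folio = true;
}

/* Models the unmap during truncate: only VMAs found via i_mmap are seen. */
static void model_unmap_mapping_folio(struct model_folio *folio,
				      struct model_vma *vmas, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (vmas[i].in_i_mmap && vmas[i].maps_folio) {
			vmas[i].maps_folio = false;
			folio->mapcount--;
		}
	}
}

/*
 * Models step (4): returns true if the folio is still mapped after the
 * unmap pass, i.e. the point where the assertion would fire.
 */
static bool model_truncate_would_bug(struct model_folio *folio,
				     struct model_vma *vmas, size_t n)
{
	model_unmap_mapping_folio(folio, vmas, n);
	return folio->mapcount >= 0;
}
```

With a VMA that is not in i_mmap (the TCP VMA case), the folio is still
mapped when truncate reaches the accounting step. With a VMA that is in
i_mmap, the mapcount drops back to -1 first.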
> > >
> > > Syz log that can be used to reproduce the issue:
> > > r3 = socket$inet_tcp(0x2, 0x1, 0x0)
> > > mmap(&(0x7f0000ff9000/0x4000)=nil, 0x4000, 0x0, 0x12, r3, 0x0)
> > > r4 = socket$inet_tcp(0x2, 0x1, 0x0)
> > > bind$inet(r4, &(0x7f0000000000)={0x2, 0x4e24, @multicast1}, 0x10)
> > > connect$inet(r4, &(0x7f00000006c0)={0x2, 0x4e24, @empty}, 0x10)
> > > r5 = openat$dir(0xffffffffffffff9c, &(0x7f00000000c0)='./file0\x00',
> > > 0x181e42, 0x0)
> > > fallocate(r5, 0x0, 0x0, 0x85b8)
> > > sendfile(r4, r5, 0x0, 0x8ba0)
> > > getsockopt$inet_tcp_TCP_ZEROCOPY_RECEIVE(r4, 0x6, 0x23,
> > > &(0x7f00000001c0)={&(0x7f0000ffb000/0x3000)=nil, 0x3000, 0x0, 0x0, 0x0,
> > > 0x0, 0x0, 0x0, 0x0}, &(0x7f0000000440)=0x40)
> > > r6 = openat$dir(0xffffffffffffff9c, &(0x7f00000000c0)='./file0\x00',
> > > 0x181e42, 0x0)
> > >
> > > In the current TCP zerocopy scenario, folio will be released normally.
> > > When the process exits, if the page cache is truncated before the
> > > process exits, BUG ON or Bad page occurs, which does not meet the
> > > expectation.
> > > To fix this issue, the mapping_mapped() check is added to
> > > filemap_unaccount_folio(). In addition, to reduce the impact on
> > > performance, no lock is added when mapping_mapped() is checked.
> > NAK this patch, you're just preventing the assertion from firing.
> > I think there's a deeper problem here.
>
> --
> Best Regards,
> Peng
>
>