Date: Fri, 19 Jan 2024 13:40:07 +0000
From: Matthew Wilcox <willy@...radead.org>
To: Peng Zhang <zhangpeng362@...wei.com>
Cc: linux-mm@...ck.org, linux-fsdevel@...r.kernel.org,
	netdev@...r.kernel.org, akpm@...ux-foundation.org,
	edumazet@...gle.com, davem@...emloft.net, dsahern@...nel.org,
	kuba@...nel.org, pabeni@...hat.com, arjunroy@...gle.com,
	wangkefeng.wang@...wei.com
Subject: Re: [RFC PATCH] filemap: add mapping_mapped check in
 filemap_unaccount_folio()

On Fri, Jan 19, 2024 at 05:20:24PM +0800, Peng Zhang wrote:
> Recently, we discovered a syzkaller issue that triggers
> VM_BUG_ON_FOLIO in filemap_unaccount_folio() with CONFIG_DEBUG_VM
> enabled, or bad page without CONFIG_DEBUG_VM.
> 
> The specific scenarios are as follows:
> (1) mmap: Use socket fd to create a TCP VMA.
> (2) open(O_CREAT) + fallocate + sendfile: Read the ext4 file and create
> the page cache. The mapping of the page cache is ext4 inode->i_mapping.
> Send the ext4 page cache to the socket fd through sendfile.
> (3) getsockopt TCP_ZEROCOPY_RECEIVE: Receive the ext4 page cache and use
> vm_insert_pages() to insert the ext4 page cache to the TCP VMA. In this
> case, mapcount changes from -1 to 0. The page cache mapping is ext4
> inode->i_mapping, but the VMA of the page cache is the TCP VMA and
> folio->mapping->i_mmap is empty.

I think this is the bug.  We shouldn't be incrementing the mapcount
in this scenario.  Assuming we want to support doing this at all and
we don't want to include something like ...

	if (folio->mapping) {
		if (folio->mapping != vma->vm_file->f_mapping)
			return -EINVAL;
		if (page_to_pgoff(page) != linear_page_index(vma, address))
			return -EINVAL;
	}

But maybe there's a reason for networking needing to map pages in this
scenario?
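
To make the intent of that check concrete, here is a minimal user-space
sketch. Everything in it is a hypothetical stand-in: the structs carry only
the fields the check reads, model_linear_page_index() approximates
linear_page_index() for an ordinary (non-hugetlb) VMA with 4KiB pages, and
the NULL vm_file guard is an extra added for the model, not part of the
snippet above. It is not kernel code and has not been tested against the
real insert path.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; only the fields the
 * check needs are modelled. */
struct address_space { int id; };
struct file { struct address_space *f_mapping; };
struct vm_area_struct {
	struct file *vm_file;
	unsigned long vm_start;	/* first user address covered by the VMA */
	unsigned long vm_pgoff;	/* file offset of vm_start, in pages */
};
struct folio {
	struct address_space *mapping;	/* NULL if not in any page cache */
	unsigned long index;		/* page offset within the mapping */
};

#define MODEL_PAGE_SHIFT 12	/* assumes 4KiB pages */

/* Approximates linear_page_index() for a non-hugetlb VMA. */
static unsigned long model_linear_page_index(const struct vm_area_struct *vma,
					     unsigned long address)
{
	return ((address - vma->vm_start) >> MODEL_PAGE_SHIFT) + vma->vm_pgoff;
}

/* Refuse to map a page cache folio into a VMA that does not belong to the
 * folio's own mapping, or at an address that does not match its index. */
static int model_check_insert(const struct folio *folio,
			      const struct vm_area_struct *vma,
			      unsigned long address)
{
	if (folio->mapping) {
		/* NULL guard is an addition for the model only. */
		if (!vma->vm_file || folio->mapping != vma->vm_file->f_mapping)
			return -EINVAL;
		if (folio->index != model_linear_page_index(vma, address))
			return -EINVAL;
	}
	return 0;
}
```

In this model, an ext4 page cache folio offered to a VMA backed by a socket
file is rejected with -EINVAL, while a folio with no mapping is let through.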

> (4) open(O_TRUNC): Deletes the ext4 page cache. In this case, the page
> cache is still in the xarray tree of mapping->i_pages and these folios
> should also be deleted. However, folio->mapping->i_mmap is empty.
> Therefore, truncate_cleanup_folio()->unmap_mapping_folio() can't find
> the TCP VMA in the i_mmap tree to unmap the folio. In
> filemap_unaccount_folio(), the mapcount of the folio is still 0,
> triggering the VM_BUG_ON_FOLIO.
> 
> Syz log that can be used to reproduce the issue:
> r3 = socket$inet_tcp(0x2, 0x1, 0x0)
> mmap(&(0x7f0000ff9000/0x4000)=nil, 0x4000, 0x0, 0x12, r3, 0x0)
> r4 = socket$inet_tcp(0x2, 0x1, 0x0)
> bind$inet(r4, &(0x7f0000000000)={0x2, 0x4e24, @multicast1}, 0x10)
> connect$inet(r4, &(0x7f00000006c0)={0x2, 0x4e24, @empty}, 0x10)
> r5 = openat$dir(0xffffffffffffff9c, &(0x7f00000000c0)='./file0\x00',
> 0x181e42, 0x0)
> fallocate(r5, 0x0, 0x0, 0x85b8)
> sendfile(r4, r5, 0x0, 0x8ba0)
> getsockopt$inet_tcp_TCP_ZEROCOPY_RECEIVE(r4, 0x6, 0x23,
> &(0x7f00000001c0)={&(0x7f0000ffb000/0x3000)=nil, 0x3000, 0x0, 0x0, 0x0,
> 0x0, 0x0, 0x0, 0x0}, &(0x7f0000000440)=0x40)
> r6 = openat$dir(0xffffffffffffff9c, &(0x7f00000000c0)='./file0\x00',
> 0x181e42, 0x0)
> 
> In the current TCP zerocopy scenario, the folio will be released normally.
> However, if the page cache is truncated before the process exits, a
> BUG_ON or bad page report occurs, which is not the expected behavior.
> To fix this issue, the mapping_mapped() check is added to
> filemap_unaccount_folio(). In addition, to reduce the impact on
> performance, no lock is added when mapping_mapped() is checked.

NAK this patch, you're just preventing the assertion from firing.
I think there's a deeper problem here.
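
A toy user-space model of why skipping the assertion leaves the underlying
state unchanged. All names here are hypothetical, and the shape of the
proposed check (skip the assertion when the mapping has no VMAs in i_mmap)
is an assumption based on the patch description, not a copy of the patch.
Mapcount follows the convention described above: -1 means unmapped, 0 means
mapped once.

```c
#include <stdbool.h>

/* Hypothetical model: a mapping only knows how many VMAs sit in i_mmap. */
struct model_mapping { int nr_vmas_in_i_mmap; };
struct model_folio {
	struct model_mapping *mapping;
	int mapcount;	/* -1 == not mapped anywhere */
};

/* Models inserting the folio into the TCP VMA: the mapcount goes up, but
 * the TCP VMA is never linked into the ext4 mapping's i_mmap. */
static void model_vm_insert_page(struct model_folio *f)
{
	f->mapcount++;
}

static bool model_folio_mapped(const struct model_folio *f)
{
	return f->mapcount >= 0;
}

/* Models the truncate-time unmap: it can only tear down mappings that are
 * reachable through i_mmap, so with an empty tree it does nothing. */
static void model_unmap_mapping_folio(struct model_folio *f)
{
	if (f->mapping->nr_vmas_in_i_mmap > 0)
		f->mapcount = -1;
}

/* Returns true if the "folio is still mapped" assertion would fire.
 * with_check models the assumed shape of the proposed mapping_mapped()
 * test: the assertion is skipped when i_mmap is empty. */
static bool model_unaccount_assert_fires(const struct model_folio *f,
					 bool with_check)
{
	if (with_check && f->mapping->nr_vmas_in_i_mmap == 0)
		return false;
	return model_folio_mapped(f);
}
```

In the model, after model_vm_insert_page() and model_unmap_mapping_folio()
on a mapping with an empty i_mmap, the assertion fires without the check and
stays silent with it, but model_folio_mapped() is true in both cases: the
folio leaves the page cache while still mapped in the TCP VMA.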
