
Message-ID: <20240701233924.GG612460@frogsfrogsfrogs>
Date: Mon, 1 Jul 2024 16:39:24 -0700
From: "Darrick J. Wong" <djwong@...nel.org>
To: "Pankaj Raghav (Samsung)" <kernel@...kajraghav.com>
Cc: david@...morbit.com, willy@...radead.org, chandan.babu@...cle.com,
	brauner@...nel.org, akpm@...ux-foundation.org,
	linux-kernel@...r.kernel.org, yang@...amperecomputing.com,
	linux-mm@...ck.org, john.g.garry@...cle.com,
	linux-fsdevel@...r.kernel.org, hare@...e.de, p.raghav@...sung.com,
	mcgrof@...nel.org, gost.dev@...sung.com, cl@...amperecomputing.com,
	linux-xfs@...r.kernel.org, hch@....de, Zi Yan <zi.yan@...t.com>
Subject: Re: [PATCH v8 05/10] filemap: cap PTE range to be created to allowed
 zero fill in folio_map_range()

On Tue, Jun 25, 2024 at 11:44:15AM +0000, Pankaj Raghav (Samsung) wrote:
> From: Pankaj Raghav <p.raghav@...sung.com>
> 
> Usually the page cache does not extend beyond the size of the inode;
> therefore, no PTEs are created for folios that extend beyond the size.
> 
> But with LBS support, we might extend the page cache beyond the size of
> the inode, as we need to guarantee folios of a minimum order. While
> handling a read fault, do_fault_around() can create PTEs for pages that
> lie beyond EOF, so an access to a page beyond the end of the mapped file
> does not return the expected error.
> 
> Cap the PTE range to be created for the page cache up to the end of
> file (EOF) in filemap_map_pages() so that the returned error codes are
> consistent with POSIX[1] for LBS configurations.
> 
> generic/749 (currently in the xfstests-dev patches-in-queue branch [0])
> has been created to trigger this edge case. This also fixes generic/749
> for tmpfs with huge=always on systems with 4k base page size.
> 
> [0] https://lore.kernel.org/all/20240615002935.1033031-3-mcgrof@kernel.org/
> [1] (from mmap(2)) SIGBUS
>     Attempted access to a page of the buffer that lies beyond the end
>     of the mapped file. For an explanation of the treatment of the
>     bytes in the page that corresponds to the end of a mapped file
>     that is not a multiple of the page size, see NOTES.
> 
> Signed-off-by: Luis Chamberlain <mcgrof@...nel.org>
> Signed-off-by: Pankaj Raghav <p.raghav@...sung.com>
> Reviewed-by: Hannes Reinecke <hare@...e.de>
> Reviewed-by: Matthew Wilcox (Oracle) <willy@...radead.org>

Heh, another fun mmap wart!
Reviewed-by: Darrick J. Wong <djwong@...nel.org>

--D

> ---
>  mm/filemap.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 8eafbd4a4d0c..56ff1d936aa8 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -3612,7 +3612,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
>  	struct vm_area_struct *vma = vmf->vma;
>  	struct file *file = vma->vm_file;
>  	struct address_space *mapping = file->f_mapping;
> -	pgoff_t last_pgoff = start_pgoff;
> +	pgoff_t file_end, last_pgoff = start_pgoff;
>  	unsigned long addr;
>  	XA_STATE(xas, &mapping->i_pages, start_pgoff);
>  	struct folio *folio;
> @@ -3638,6 +3638,10 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
>  		goto out;
>  	}
>  
> +	file_end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE) - 1;
> +	if (end_pgoff > file_end)
> +		end_pgoff = file_end;
> +
>  	folio_type = mm_counter_file(folio);
>  	do {
>  		unsigned long end;
> -- 
> 2.44.1
> 
>
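
The clamp added by the hunk above can be checked in isolation. What follows
is a minimal userspace sketch, not kernel code: cap_end_pgoff() and
SKETCH_PAGE_SIZE are hypothetical names introduced for illustration, the
DIV_ROUND_UP() macro and pgoff_t typedef are local stand-ins for the kernel's,
and a 4k base page size is assumed.

```c
#include <stdint.h>

/* Assumed 4k base page size; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096UL

/* Local stand-in for the kernel's DIV_ROUND_UP() macro. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Userspace stand-in for the kernel's pgoff_t (an unsigned long). */
typedef unsigned long pgoff_t;

/*
 * Hypothetical helper mirroring the three added lines: compute the index
 * of the last page that holds file data and clamp end_pgoff to it.
 * For i_size == 0 the subtraction wraps around, so nothing is clamped.
 */
static pgoff_t cap_end_pgoff(pgoff_t end_pgoff, uint64_t i_size)
{
	pgoff_t file_end = DIV_ROUND_UP(i_size, SKETCH_PAGE_SIZE) - 1;

	if (end_pgoff > file_end)
		end_pgoff = file_end;
	return end_pgoff;
}
```

For a 10000-byte file the last page index is 2, so a fault-around window
ending at index 15 is cut down to 2, while a window ending at index 1 is
left alone. A file of exactly 4096 bytes has last page index 0, and one
more byte moves it to 1.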