lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZlnMfSJcm5k6Dg_e@infradead.org>
Date: Fri, 31 May 2024 06:11:25 -0700
From: Christoph Hellwig <hch@...radead.org>
To: Zhang Yi <yi.zhang@...weicloud.com>
Cc: linux-xfs@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org, djwong@...nel.org, hch@...radead.org,
	brauner@...nel.org, david@...morbit.com, chandanbabu@...nel.org,
	jack@...e.cz, willy@...radead.org, yi.zhang@...wei.com,
	chengzhihao1@...wei.com, yukuai3@...wei.com
Subject: Re: [RFC PATCH v4 1/8] iomap: zeroing needs to be pagecache aware

On Wed, May 29, 2024 at 05:51:59PM +0800, Zhang Yi wrote:
> XXX: how do we detect a iomap containing a cow mapping over a hole
> in iomap_zero_iter()? The XFS code implies this case also needs to
> zero the page cache if there is data present, so trigger for page
> cache lookup only in iomap_zero_iter() needs to handle this case as
> well.

If there is no data in the page cache and either a whole or unwritten
extent it really should not matter what is in the COW fork, a there
obviously isn't any data we could zero.

If there is data in the page cache for something that is marked as
a hole in the srcmap, but we have data in the COW fork due to
COW extsize preallocation we'd need to zero it, but as the
xfs iomap ops don't return a separate srcmap for that case we
should be fine.  Or am I missing something?

> + * Note: when zeroing unwritten extents, we might have data in the page cache
> + * over an unwritten extent. In this case, we want to do a pure lookup on the
> + * page cache and not create a new folio as we don't need to perform zeroing on
> + * unwritten extents if there is no cached data over the given range.
>   */
>  struct folio *iomap_get_folio(struct iomap_iter *iter, loff_t pos, size_t len)
>  {
>  	fgf_t fgp = FGP_WRITEBEGIN | FGP_NOFS;
>  
> +	if (iter->flags & IOMAP_ZERO) {
> +		const struct iomap *srcmap = iomap_iter_srcmap(iter);
> +
> +		if (srcmap->type == IOMAP_UNWRITTEN)
> +			fgp &= ~FGP_CREAT;
> +	}

Nit:  The comment would probably stand out a little better if it was
right next to the IOMAP_ZERO conditional instead of above the
function.

> +		if (status) {
> +			if (status == -ENOENT) {
> +				/*
> +				 * Unwritten extents need to have page cache
> +				 * lookups done to determine if they have data
> +				 * over them that needs zeroing. If there is no
> +				 * data, we'll get -ENOENT returned here, so we
> +				 * can just skip over this index.
> +				 */
> +				WARN_ON_ONCE(srcmap->type != IOMAP_UNWRITTEN);

I'd return -EIO if the WARN_ON triggers.

> +loop_continue:

While I'm no strange to gotos for loop control something trips me
up about jumping to the end of the loop.  Here is what I could come
up with instead.  Not arguing it's objectively better, but I somehow
like it a little better:

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 700b22d6807783..81378f7cd8d7ff 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1412,49 +1412,56 @@ static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
 		bool ret;
 
 		status = iomap_write_begin(iter, pos, bytes, &folio);
-		if (status) {
-			if (status == -ENOENT) {
-				/*
-				 * Unwritten extents need to have page cache
-				 * lookups done to determine if they have data
-				 * over them that needs zeroing. If there is no
-				 * data, we'll get -ENOENT returned here, so we
-				 * can just skip over this index.
-				 */
-				WARN_ON_ONCE(srcmap->type != IOMAP_UNWRITTEN);
-				if (bytes > PAGE_SIZE - offset_in_page(pos))
-					bytes = PAGE_SIZE - offset_in_page(pos);
-				goto loop_continue;
-			}
+		if (status && status != -ENOENT)
 			return status;
-		}
-		if (iter->iomap.flags & IOMAP_F_STALE)
-			break;
 
-		offset = offset_in_folio(folio, pos);
-		if (bytes > folio_size(folio) - offset)
-			bytes = folio_size(folio) - offset;
+		if (status == -ENOENT) {
+			/*
+			 * If we end up here, we did not find a folio in the
+			 * page cache for an unwritten extent and thus can
+			 * skip over the range.
+			 */
+			if (WARN_ON_ONCE(srcmap->type != IOMAP_UNWRITTEN))
+				return -EIO;
 
-		/*
-		 * If the folio over an unwritten extent is clean (i.e. because
-		 * it has been read from), then it already contains zeros. Hence
-		 * we can just skip it.
-		 */
-		if (srcmap->type == IOMAP_UNWRITTEN &&
-		    !folio_test_dirty(folio)) {
-			folio_unlock(folio);
-			goto loop_continue;
+			/*
+			 * XXX: It would be nice if we could get the offset of
+			 * the next entry in the pagecache so that we don't have
+			 * to iterate one page at a time here.
+			 */
+			offset = offset_in_page(pos);
+			if (bytes > PAGE_SIZE - offset)
+				bytes = PAGE_SIZE - offset;
+		} else {
+			if (iter->iomap.flags & IOMAP_F_STALE)
+				break;
+
+			offset = offset_in_folio(folio, pos);
+			if (bytes > folio_size(folio) - offset)
+				bytes = folio_size(folio) - offset;
+		
+			/*
+			 * If the folio over an unwritten extent is clean (i.e.
+			 * because it has only been read from), then it already
+			 * contains zeros.  Hence we can just skip it.
+			 */
+			if (srcmap->type == IOMAP_UNWRITTEN &&
+			    !folio_test_dirty(folio)) {
+				folio_unlock(folio);
+				status = -ENOENT;
+			}
 		}
 
-		folio_zero_range(folio, offset, bytes);
-		folio_mark_accessed(folio);
+		if (status != -ENOENT) {
+			folio_zero_range(folio, offset, bytes);
+			folio_mark_accessed(folio);
 
-		ret = iomap_write_end(iter, pos, bytes, bytes, folio);
-		__iomap_put_folio(iter, pos, bytes, folio);
-		if (WARN_ON_ONCE(!ret))
-			return -EIO;
+			ret = iomap_write_end(iter, pos, bytes, bytes, folio);
+			__iomap_put_folio(iter, pos, bytes, folio);
+			if (WARN_ON_ONCE(!ret))
+				return -EIO;
+		}
 
-loop_continue:
 		pos += bytes;
 		length -= bytes;
 		written += bytes;

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ