linux-kernel - Re: [PATCH v5 3/3] squashfs: implement readahead

Message-ID: <fa555552-021e-cefe-4602-39dbc5ce3330@squashfs.org.uk>
Date:   Sat, 11 Jun 2022 06:23:42 +0100
From:   Phillip Lougher <phillip@...ashfs.org.uk>
To:     Hsin-Yi Wang <hsinyi@...omium.org>,
        Matthew Wilcox <willy@...radead.org>,
        Xiongwei Song <Xiongwei.Song@...driver.com>,
        Marek Szyprowski <m.szyprowski@...sung.com>,
        Andrew Morton <akpm@...ux-foundation.org>
Cc:     Zheng Liang <zhengliang6@...wei.com>,
        Zhang Yi <yi.zhang@...wei.com>, Hou Tao <houtao1@...wei.com>,
        Miao Xie <miaoxie@...wei.com>,
        "linux-mm @ kvack . org" <linux-mm@...ck.org>,
        "squashfs-devel @ lists . sourceforge . net" 
        <squashfs-devel@...ts.sourceforge.net>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5 3/3] squashfs: implement readahead

On 06/06/2022 16:03, Hsin-Yi Wang wrote:
> Implement readahead callback for squashfs. It will read datablocks
> which cover pages in readahead request. For a few cases it will
> not mark page as uptodate, including:
> - file end is 0.
> - zero filled blocks.
> - current batch of pages isn't in the same datablock.
> - decompressor error.
> Otherwise pages will be marked as uptodate. The unhandled pages will be
> updated by readpage later.
> 

Hi Hsin-Yi,

I have reviewed, tested and instrumented the following patch.

There are a number of problems with the patch including
performance, unhandled issues, and bugs.

In this email I'll concentrate on the performance aspects.

The major change between this V5 patch and the previous patches
(V4 etc), is that it now handles the case where

+ nr_pages = __readahead_batch(ractl, pages, max_pages);

returns an "nr_pages" less than "max_pages".

What this means is that the readahead code has returned a set
of page cache pages which does not fully map the datablock to
be decompressed.

If this is passed to squashfs_read_data() using the current
"page actor" code, the decompression will fail on the missing
pages.
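As a rough illustration of the condition described above (not code from the patch), the user-space sketch below checks whether a batch of pages fully maps a datablock. The names SKETCH_PAGE_SIZE, pages_per_block() and batch_covers_block() are hypothetical, the 4096-byte page size is an assumption, and the sketch ignores the tail block of a file, which legitimately spans fewer pages.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed page size, for illustration only */

/*
 * Number of page cache pages spanned by one datablock.
 * Assumes block_size is a multiple of the page size.
 */
static unsigned long pages_per_block(unsigned long block_size)
{
	assert(block_size >= SKETCH_PAGE_SIZE);
	assert(block_size % SKETCH_PAGE_SIZE == 0);
	return block_size / SKETCH_PAGE_SIZE;
}

/*
 * Returns 1 if a batch of nr_pages contiguous pages starting at page
 * index first_index covers every page of its datablock, 0 otherwise.
 * Assumes max_pages is a power of two, so the mask below rounds the
 * index down to the first page of the block.
 */
static int batch_covers_block(unsigned long first_index,
			      unsigned long nr_pages,
			      unsigned long max_pages)
{
	unsigned long block_start = first_index & ~(max_pages - 1);

	return first_index == block_start && nr_pages == max_pages;
}
```

With a 128 KiB block, pages_per_block() gives 32, and a batch of 20 pages starting at index 64 does not cover the block, which is the "nr_pages < max_pages" case.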

In recognition of that fact, your V5 patch falls back to using
the earlier intermediate buffer method, with
squashfs_get_datablock() returning a buffer, which is then memcopied
into the page cache pages.

This is currently what is also done in the existing
squashfs_readpage_block() function if the entire set of pages cannot
be obtained.

The problem with this fallback intermediate buffer is that it is slow,
both because of the additional memcopies and, more importantly, because
it introduces contention on a single shared buffer.

I have long intended to fix this performance issue in
squashfs_readpage_block(), but, because it is a rare issue there, the
additional work has seemed nice to have but not essential.

The problem is we don't want the readahead code to be using this
slow method, because the scenario will probably happen much more
often, and for a performance improvement patch, falling back to
an old slow method isn't very useful.

So I have finally done the work to make the "page actor" code handle
missing pages.

I have sent this out in the following patch-set, which updates the
squashfs_readpage_block() function to use it.

https://lore.kernel.org/lkml/20220611032133.5743-1-phillip@squashfs.org.uk/

You can use this updated "page actor" code to eliminate the
"nr_pages < max_pages" special case in your patch, with the benefit
that decompression is done directly into the page cache.
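To make the idea concrete, here is a minimal user-space sketch of an actor that tolerates missing pages. It is not the kernel "page actor" code: struct sketch_actor, sketch_actor_next() and sketch_emit() are hypothetical names, the scratch-buffer handling of missing pages is an assumption made for illustration, and memcpy() merely stands in for the decompressor writing its output.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096	/* assumed page size, for illustration only */

/*
 * Hypothetical stand-in for a page actor: one slot per page of the
 * datablock.  A NULL slot marks a page the readahead batch did not
 * supply.
 */
struct sketch_actor {
	unsigned char **slots;
	size_t nr_slots;
	size_t next;
	unsigned char scratch[SKETCH_PAGE_SIZE];
};

/*
 * Return the destination for the next page of output.  Missing pages
 * are redirected to the scratch buffer (an assumption of this sketch),
 * so the producer never has to special-case them.  Returns NULL when
 * all slots have been consumed.
 */
static unsigned char *sketch_actor_next(struct sketch_actor *a)
{
	unsigned char *dst;

	if (a->next >= a->nr_slots)
		return NULL;
	dst = a->slots[a->next++];
	return dst ? dst : a->scratch;
}

/*
 * Stand-in for a decompressor: writes len bytes of output one page at
 * a time into whatever destination the actor hands back.  Returns the
 * number of bytes written.
 */
static size_t sketch_emit(struct sketch_actor *a,
			  const unsigned char *src, size_t len)
{
	size_t done = 0;

	while (done < len) {
		unsigned char *dst = sketch_actor_next(a);
		size_t chunk = len - done;

		if (!dst)
			break;
		if (chunk > SKETCH_PAGE_SIZE)
			chunk = SKETCH_PAGE_SIZE;
		memcpy(dst, src + done, chunk);
		done += chunk;
	}
	return done;
}
```

In this sketch the pages that are present receive their data directly, and output destined for a missing page is simply discarded, so no shared intermediate buffer for the whole datablock is needed.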

I have updated your patch to use the new functionality.  The diff,
which includes a bug fix, is appended to this email.

Phillip

diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
index b86b2f9d9ae6..721d35ecfca9 100644
--- a/fs/squashfs/file.c
+++ b/fs/squashfs/file.c
@@ -519,10 +519,6 @@ static void squashfs_readahead(struct readahead_control *ractl)
  	if (!pages)
  		return;

-	actor = squashfs_page_actor_init_special(pages, max_pages, 0);
-	if (!actor)
-		goto out;
-
  	for (;;) {
  		pgoff_t index;
  		int res, bsize;
@@ -548,41 +544,21 @@ static void squashfs_readahead(struct readahead_control *ractl)
  		if (bsize == 0)
  			goto skip_pages;

-		if (nr_pages < max_pages) {
-			struct squashfs_cache_entry *buffer;
-			unsigned int block_mask = max_pages - 1;
-			int offset = pages[0]->index - (pages[0]->index & ~block_mask);
-
-			buffer = squashfs_get_datablock(inode->i_sb, block,
-							bsize);
-			if (buffer->error) {
-				squashfs_cache_put(buffer);
-				goto skip_pages;
-			}
-
-			expected -= offset * PAGE_SIZE;
-			for (i = 0; i < nr_pages && expected > 0; i++,
-						expected -= PAGE_SIZE, offset++) {
-				int avail = min_t(int, expected, PAGE_SIZE);
-
-				squashfs_fill_page(pages[i], buffer,
-						offset * PAGE_SIZE, avail);
-				unlock_page(pages[i]);
-			}
-
-			squashfs_cache_put(buffer);
-			continue;
-		}
+		actor = squashfs_page_actor_init_special(msblk, pages, nr_pages, expected);
+		if (!actor)
+			goto out;

  		res = squashfs_read_data(inode->i_sb, block, bsize, NULL,
  					 actor);

+		kfree(actor);
+
  		if (res == expected) {
  			int bytes;

-			/* Last page may have trailing bytes not filled */
+			/* Last page (if present) may have trailing bytes not filled */
  			bytes = res % PAGE_SIZE;
-			if (bytes) {
+			if (pages[nr_pages - 1]->index == file_end && bytes) {
  				void *pageaddr;

  				pageaddr = kmap_atomic(pages[nr_pages - 1]);
@@ -602,7 +578,6 @@ static void squashfs_readahead(struct readahead_control *ractl)
  		}
  	}

-	kfree(actor);
  	kfree(pages);
  	return;

@@ -612,7 +587,6 @@ static void squashfs_readahead(struct readahead_control *ractl)
  		put_page(pages[i]);
  	}

-	kfree(actor);
  out:
  	kfree(pages);
  }
-- 
2.34.1