[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJMQK-ig4QiTP-JqyuSLzWLrOCTKh4tb6D45ksqwMk2bvjkNsQ@mail.gmail.com>
Date: Fri, 20 May 2022 15:38:03 +0800
From: Hsin-Yi Wang <hsinyi@...omium.org>
To: Phillip Lougher <phillip@...ashfs.org.uk>
Cc: Matthew Wilcox <willy@...radead.org>,
Xiongwei Song <Xiongwei.Song@...driver.com>,
Zheng Liang <zhengliang6@...wei.com>,
Zhang Yi <yi.zhang@...wei.com>, Hou Tao <houtao1@...wei.com>,
Miao Xie <miaoxie@...wei.com>,
Andrew Morton <akpm@...ux-foundation.org>,
"linux-mm @ kvack . org" <linux-mm@...ck.org>,
"squashfs-devel @ lists . sourceforge . net"
<squashfs-devel@...ts.sourceforge.net>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 3/3] squashfs: implement readahead
On Fri, May 20, 2022 at 11:02 AM Phillip Lougher
<phillip@...ashfs.org.uk> wrote:
>
> On 19/05/2022 09:09, Hsin-Yi Wang wrote:
> > On Tue, May 17, 2022 at 4:28 PM Hsin-Yi Wang <hsinyi@...omium.org> wrote:
> >>
> >> Implement readahead callback for squashfs. It will read datablocks
> >> which cover pages in readahead request. For a few cases it will
> >> not mark page as uptodate, including:
> >> - file end is 0.
> >> - zero filled blocks.
> >> - current batch of pages isn't in the same datablock or not enough in a
> >> datablock.
> >> Otherwise pages will be marked as uptodate. The unhandled pages will be
> >> updated by readpage later.
> >>
> >> Suggested-by: Matthew Wilcox <willy@...radead.org>
> >> Signed-off-by: Hsin-Yi Wang <hsinyi@...omium.org>
> >> Reported-by: Matthew Wilcox <willy@...radead.org>
> >> Reported-by: Phillip Lougher <phillip@...ashfs.org.uk>
> >> Reported-by: Xiongwei Song <sxwjean@...il.com>
> >> ---
> >> v1->v2: remove unused check on readahead_expand().
> >> v1: https://lore.kernel.org/lkml/20220516105100.1412740-3-hsinyi@chromium.org/
> >> ---
> >
> > Hi Phillip and Matthew,
> >
> > Regarding the performance issue of this patch, I saw a possible
> > performance gain if we only read the first block instead of reading
> > until nr_pages == 0.
> >
> > To be more clear, apply the following diff (Please ignore the skipping
> > of nr_pages check first. This is a demonstration of "only read and
> > update the first block per readahead call"):
> >
> > diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
> > index aad6823f0615..c52f7c4a7cfe 100644
> > --- a/fs/squashfs/file.c
> > +++ b/fs/squashfs/file.c
> > @@ -524,10 +524,8 @@ static void squashfs_readahead(struct
> > readahead_control *ractl)
> > if (!actor)
> > goto out;
> >
> > - for (;;) {
> > + {
> > nr_pages = __readahead_batch(ractl, pages, max_pages);
> > - if (!nr_pages)
> > - break;
> >
> > if (readahead_pos(ractl) >= i_size_read(inode) ||
> > nr_pages < max_pages)
> >
> >
> > All the performance numbers:
> > 1. original: 39s
> > 2. revert "mm: put readahead pages in cache earlier": 2.8s
> > 3. v2 of this patch: 2.7s
> > 4. v2 of this patch and apply the diff: 1.8s
> >
> > In my testing data, normally it reads and updates 1~2 blocks per
> > readahead call. The change might not make sense since the performance
> > improvement may only happen in certain cases.
> > What do you think? Or is the performance of the current patch
> > considered reasonable?
>
> It entirely depends on where the speed improvement comes from.
>
> From experience, the speed improvement is probably worthwhile,
> and probably isn't gained at the expense of worse performance
> on other work-loads.
>
> But this is a guestimate, based on the fact timings 2 and 3
> (2.8s v 2.7s) are almost identical. Which implies the v2
> patch isn't now doing any more work than the previous
> baseline before the "mm: put readahead pages in cache earlier"
> patch (*).
>
> As such the speed improvement must be coming from increased
> parallelism. Such as moving from serially reading the
> readahead blocks to parallel reading.
>
Thanks for the idea. I checked this by offlining other cores until
only one core exists. Removing loops still results in less time.
But after counting the #traces lines in squashfs_read_data():
If we remove the for loop (timings 4), the logs are less: 2.3K lines,
while v2 (timings 3) has 3.7K (other timings are also around 3.7K), so
removing loop doesn't look right.
I think v2 should be fine considering the slightly to none regression
compared to before.
Hi Matthew, what do you think? Do you have other comments? If not,
should I send a v3 to change Xiongwei Song's email address or can you
help modify it?
Thanks
> But, without looking at any trace output, that is just a
> guestimate.
>
> Phillip
>
> (*) multiply decompressing the same blocks, which
> is the cause of the performance regression.
> >
> > Thanks.
> >
> > testing env:
> > - arm64 on kernel 5.10
> > - data: ~ 300K pack file contains some android files
> >
> >> fs/squashfs/file.c | 77 +++++++++++++++++++++++++++++++++++++++++++++-
> >> 1 file changed, 76 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
> >> index a8e495d8eb86..e10a55c5b1eb 100644
> >> --- a/fs/squashfs/file.c
> >> +++ b/fs/squashfs/file.c
> >> @@ -39,6 +39,7 @@
> >> #include "squashfs_fs_sb.h"
> >> #include "squashfs_fs_i.h"
> >> #include "squashfs.h"
> >> +#include "page_actor.h"
> >>
> >> /*
> >> * Locate cache slot in range [offset, index] for specified inode. If
> >> @@ -495,7 +496,81 @@ static int squashfs_read_folio(struct file *file, struct folio *folio)
> >> return 0;
> >> }
> >>
> >> +static void squashfs_readahead(struct readahead_control *ractl)
> >> +{
> >> + struct inode *inode = ractl->mapping->host;
> >> + struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
> >> + size_t mask = (1UL << msblk->block_log) - 1;
> >> + size_t shift = msblk->block_log - PAGE_SHIFT;
> >> + loff_t start = readahead_pos(ractl) &~ mask;
> >> + size_t len = readahead_length(ractl) + readahead_pos(ractl) - start;
> >> + struct squashfs_page_actor *actor;
> >> + unsigned int nr_pages = 0;
> >> + struct page **pages;
> >> + u64 block = 0;
> >> + int bsize, res, i, index;
> >> + int file_end = i_size_read(inode) >> msblk->block_log;
> >> + unsigned int max_pages = 1UL << shift;
> >> +
> >> + readahead_expand(ractl, start, (len | mask) + 1);
> >> +
> >> + if (file_end == 0)
> >> + return;
> >> +
> >> + pages = kmalloc_array(max_pages, sizeof(void *), GFP_KERNEL);
> >> + if (!pages)
> >> + return;
> >> +
> >> + actor = squashfs_page_actor_init_special(pages, max_pages, 0);
> >> + if (!actor)
> >> + goto out;
> >> +
> >> + for (;;) {
> >> + nr_pages = __readahead_batch(ractl, pages, max_pages);
> >> + if (!nr_pages)
> >> + break;
> >> +
> >> + if (readahead_pos(ractl) >= i_size_read(inode) ||
> >> + nr_pages < max_pages)
> >> + goto skip_pages;
> >> +
> >> + index = pages[0]->index >> shift;
> >> + if ((pages[nr_pages - 1]->index >> shift) != index)
> >> + goto skip_pages;
> >> +
> >> + bsize = read_blocklist(inode, index, &block);
> >> + if (bsize == 0)
> >> + goto skip_pages;
> >> +
> >> + res = squashfs_read_data(inode->i_sb, block, bsize, NULL,
> >> + actor);
> >> +
> >> + if (res >= 0)
> >> + for (i = 0; i < nr_pages; i++)
> >> + SetPageUptodate(pages[i]);
> >> +
> >> + for (i = 0; i < nr_pages; i++) {
> >> + unlock_page(pages[i]);
> >> + put_page(pages[i]);
> >> + }
> >> + }
> >> +
> >> + kfree(actor);
> >> + kfree(pages);
> >> + return;
> >> +
> >> +skip_pages:
> >> + for (i = 0; i < nr_pages; i++) {
> >> + unlock_page(pages[i]);
> >> + put_page(pages[i]);
> >> + }
> >> +
> >> + kfree(actor);
> >> +out:
> >> + kfree(pages);
> >> +}
> >>
> >> const struct address_space_operations squashfs_aops = {
> >> - .read_folio = squashfs_read_folio
> >> + .read_folio = squashfs_read_folio,
> >> + .readahead = squashfs_readahead
> >> };
> >> --
> >> 2.36.0.550.gb090851708-goog
> >>
>
Powered by blists - more mailing lists