Message-ID: <c2c601dba823$bb2fdf00$318f9d00$@samsung.com>
Date: Tue, 8 Apr 2025 10:15:35 +0900
From: "Sungjong Seo" <sj1557.seo@...sung.com>
To: "'Anthony Iliopoulos'" <ailiop@...e.com>, "'Namjae Jeon'"
<linkinjeon@...nel.org>, "'Yuezhang Mo'" <yuezhang.mo@...y.com>
Cc: <linux-fsdevel@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<sjdev.seo@...il.com>, <cpgs@...sung.com>, <sj1557.seo@...sung.com>
Subject: RE: [PATCH] exfat: enable request merging for dir readahead
Hi Anthony,
> Directory listings that need to access the inode metadata (e.g. via
> statx to obtain the file types) of large filesystems with lots of
> metadata that aren't yet in dcache, will take a long time due to the
> directory readahead submitting one io request at a time which although
> targeting sequential disk sectors (up to EXFAT_MAX_RA_SIZE) are not
> merged at the block layer.
>
> Add plugging around sb_breadahead so that the requests can be batched
> and submitted jointly to the block layer where they can be merged by the
> io schedulers, instead of having each request individually submitted to
> the hardware queues.
>
> This significantly improves the throughput of directory listings as it
> also minimizes the number of io completions and related handling from
> the device driver side.
Good approach. However, the same thing was tried in earlier Samsung code,
and it had a problem: the latency of directory-related operations
increased when ra_count was large (possibly at MAX_RA_SIZE).
The most recent code therefore calls blk_flush_plug once per page,
as follows.
```
blk_start_plug(&plug);
for (i = 0; i < ra_count; i++) {
	if (i && !(i & (sects_per_page - 1)))
		blk_flush_plug(&plug, false);
	sb_breadahead(sb, sec + i);
}
blk_finish_plug(&plug);
```
However, since blk_flush_plug is not exported, it can no longer be used in
a module build. It seems that either blk_flush_plug needs to be exported, or
the loop needs to be changed to repeat blk_start_plug and blk_finish_plug
once per page.
After changing the plugging to per-page units, could you also compare the
throughput?
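For reference, the second option (repeating blk_start_plug and
blk_finish_plug once per page) could look roughly like the sketch below.
It is a user-space model, not kernel code: struct mock_plug,
mock_start_plug, mock_finish_plug, mock_breadahead and
readahead_plug_per_page are hypothetical stand-ins that only count
requests, and sects_per_page is assumed to be a power of two.

```c
#include <assert.h>

/* Hypothetical stand-in for struct blk_plug; it only counts queued requests. */
struct mock_plug {
	unsigned int pending;
};

static unsigned int nr_requests;	/* total requests submitted */
static unsigned int nr_batches;		/* non-empty plugs that were finished */
static unsigned int max_batch;		/* largest number of requests in one plug */

static void mock_start_plug(struct mock_plug *plug)
{
	plug->pending = 0;
}

/* Models blk_finish_plug: everything queued so far goes out as one batch. */
static void mock_finish_plug(struct mock_plug *plug)
{
	if (plug->pending) {
		nr_batches++;
		if (plug->pending > max_batch)
			max_batch = plug->pending;
	}
	plug->pending = 0;
}

/*
 * Models sb_breadahead. The real kernel plug is per-task and implicit;
 * it is passed explicitly here only to keep the model self-contained.
 */
static void mock_breadahead(struct mock_plug *plug, unsigned long sec)
{
	(void)sec;
	plug->pending++;
	nr_requests++;
}

/* Plug per page: finish and restart the plug at every page boundary. */
static void readahead_plug_per_page(unsigned long sec, unsigned int ra_count,
				    unsigned int sects_per_page)
{
	struct mock_plug plug;
	unsigned int i;

	/* Assumption: sects_per_page is a non-zero power of two. */
	assert(sects_per_page && !(sects_per_page & (sects_per_page - 1)));
	nr_requests = nr_batches = max_batch = 0;

	mock_start_plug(&plug);
	for (i = 0; i < ra_count; i++) {
		if (i && !(i & (sects_per_page - 1))) {
			mock_finish_plug(&plug);
			mock_start_plug(&plug);
		}
		mock_breadahead(&plug, sec + i);
	}
	mock_finish_plug(&plug);
}
```

In this model, ra_count = 64 with sects_per_page = 8 gives 8 batches of 8
requests each, so no single plug holds more than one page worth of sectors.
Whether that keeps enough of the merging benefit is what the throughput
comparison would need to show.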
Thanks
>
> Signed-off-by: Anthony Iliopoulos <ailiop@...e.com>
> ---
> fs/exfat/dir.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/fs/exfat/dir.c b/fs/exfat/dir.c
> index 3103b932b674..a46ab2690b4d 100644
> --- a/fs/exfat/dir.c
> +++ b/fs/exfat/dir.c
> @@ -621,6 +621,7 @@ static int exfat_dir_readahead(struct super_block *sb, sector_t sec)
> {
> struct exfat_sb_info *sbi = EXFAT_SB(sb);
> struct buffer_head *bh;
> + struct blk_plug plug;
> unsigned int max_ra_count = EXFAT_MAX_RA_SIZE >> sb->s_blocksize_bits;
> unsigned int page_ra_count = PAGE_SIZE >> sb->s_blocksize_bits;
> unsigned int adj_ra_count = max(sbi->sect_per_clus, page_ra_count);
> @@ -644,8 +645,10 @@ static int exfat_dir_readahead(struct super_block *sb, sector_t sec)
> if (!bh || !buffer_uptodate(bh)) {
> unsigned int i;
>
> + blk_start_plug(&plug);
> for (i = 0; i < ra_count; i++)
> sb_breadahead(sb, (sector_t)(sec + i));
> + blk_finish_plug(&plug);
> }
> brelse(bh);
> return 0;
> --
> 2.49.0