lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <10bce97b-1a17-462f-aac3-af5b7847f670@linux.alibaba.com>
Date: Fri, 18 Jul 2025 10:45:53 +0800
From: Gao Xiang <hsiangkao@...ux.alibaba.com>
To: Chao Yu <chao@...nel.org>, xiang@...nel.org
Cc: linux-erofs@...ts.ozlabs.org, linux-kernel@...r.kernel.org,
 Yue Hu <zbestahu@...il.com>, Jeffle Xu <jefflexu@...ux.alibaba.com>,
 Sandeep Dhavale <dhavale@...gle.com>, Hongbo Li <lihongbo22@...wei.com>
Subject: Re: [PATCH v3] erofs: support to readahead dirent blocks in
 erofs_readdir()



On 2025/7/18 09:52, Chao Yu wrote:
> Hi Xiang,
> 
> On 2025/7/17 16:26, Gao Xiang wrote:
>> Hi Chao,
>>
>> On 2025/7/14 17:39, Chao Yu wrote:
>>> This patch supports to readahead more blocks in erofs_readdir(), it can
>>> enhance readdir performance in large direcotry.
>>>
>>> readdir test in a large directory which contains 12000 sub-files.
>>>
>>>         files_per_second
>>> Before:        926385.54
>>> After:        2380435.562
>>>
>>> Meanwhile, let's introduces a new sysfs entry to control readahead
>>> bytes to provide more flexible policy for readahead of readdir().
>>> - location: /sys/fs/erofs/<disk>/dir_ra_bytes
>>> - default value: 16384
>>> - disable readahead: set the value to 0
>>>
>>> Signed-off-by: Chao Yu <chao@...nel.org>
>>> ---
>>> v3:
>>> - add EROFS prefix for macro
>>> - update new sysfs interface to 1) use bytes instead of pages
>>> 2) remove upper boundary limitation
>>> - fix bug of pageidx calculation
>>>   Documentation/ABI/testing/sysfs-fs-erofs |  8 ++++++++
>>>   fs/erofs/dir.c                           | 13 +++++++++++++
>>>   fs/erofs/internal.h                      |  4 ++++
>>>   fs/erofs/super.c                         |  2 ++
>>>   fs/erofs/sysfs.c                         |  2 ++
>>>   5 files changed, 29 insertions(+)
>>>
>>> diff --git a/Documentation/ABI/testing/sysfs-fs-erofs b/Documentation/ ABI/testing/sysfs-fs-erofs
>>> index bf3b6299c15e..85fa56ca092c 100644
>>> --- a/Documentation/ABI/testing/sysfs-fs-erofs
>>> +++ b/Documentation/ABI/testing/sysfs-fs-erofs
>>> @@ -35,3 +35,11 @@ Description:    Used to set or show hardware accelerators in effect
>>>           and multiple accelerators are separated by '\n'.
>>>           Supported accelerator(s): qat_deflate.
>>>           Disable all accelerators with an empty string (echo > accel).
>>> +
>>> +What:        /sys/fs/erofs/<disk>/dir_ra_bytes
>>> +Date:        July 2025
>>> +Contact:    "Chao Yu" <chao@...nel.org>
>>> +Description:    Used to set or show readahead bytes during readdir(), by
>>> +        default the value is 16384.
>>> +
>>> +        - 0: disable readahead.
>>> diff --git a/fs/erofs/dir.c b/fs/erofs/dir.c
>>> index 3e4b38bec0aa..950d6b0046f4 100644
>>> --- a/fs/erofs/dir.c
>>> +++ b/fs/erofs/dir.c
>>> @@ -47,8 +47,10 @@ static int erofs_readdir(struct file *f, struct dir_context *ctx)
>>>       struct inode *dir = file_inode(f);
>>>       struct erofs_buf buf = __EROFS_BUF_INITIALIZER;
>>>       struct super_block *sb = dir->i_sb;
>>> +    struct file_ra_state *ra = &f->f_ra;
>>>       unsigned long bsz = sb->s_blocksize;
>>>       unsigned int ofs = erofs_blkoff(sb, ctx->pos);
>>> +    unsigned long nr_pages = DIV_ROUND_UP_POW2(dir->i_size, PAGE_SIZE);
>>
>>      pgoff_t ra_pages = PAGE_ALIGN(EROFS_SB(dir)->dir_ra_bytes);
> 
> Do you mean?
> 
> pgoff_t ra_pages = PAGE_ALIGN(EROFS_SB(dir)->dir_ra_bytes) >> PAGE_SHIFT?

That seems more complicated, sigh, I think the original
DIV_ROUND_UP_POW2() is better.

> 
>>
>>>       int err = 0;
>>>       bool initial = true;
>>> @@ -63,6 +65,17 @@ static int erofs_readdir(struct file *f, struct dir_context *ctx)
>>>               break;
>>>           }
>>> +        /* readahead blocks to enhance performance in large directory */
>>> +        if (EROFS_I_SB(dir)->dir_ra_bytes) {
>  > >          if (ra_pages) {
>>
>>> +            unsigned long idx = DIV_ROUND_UP(ctx->pos, PAGE_SIZE);
>>> +            pgoff_t ra_pages = DIV_ROUND_UP(
>>> +                EROFS_I_SB(dir)->dir_ra_bytes, PAGE_SIZE);
> 
> I put calculation here because if the value is zero, we don't need unnecessary calculation above, anyway, will update as you suggested, but let me know if you have any other concerns. :)

If only shift calculatiion is needed, I guess it
should not have much overhead, like:

		if (ra_pages) {
			pgoff_t idx = DIV_ROUND_UP_POW2(ctx->pos, PAGE_SIZE);
			pgoff_t pages = min(nr_pages - idx, ra_pages);

			...

Thanks,
Gao Xiang

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ