Message-ID: <2579f239.5415.1971fd2086a.Coremail.00107082@163.com>
Date: Fri, 30 May 2025 14:12:27 +0800 (CST)
From: "David Wang" <00107082@....com>
To: "Theodore Ts'o" <tytso@....edu>
Cc: adilger.kernel@...ger.ca, linux-ext4@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [RFC] ext4: use kmem_cache for short fname allocation in
readdir
At 2025-05-30 12:35:16, "Theodore Ts'o" <tytso@....edu> wrote:
>On Thu, May 29, 2025 at 10:42:56PM +0800, David Wang wrote:
>> When searching files, ext4_readdir would kzalloc() a fname
>> object for each entry. It would be faster if a dedicated
>> kmem_cache is used for fname.
>>
>> But fnames are of variable length.
>>
>> This patch suggests using kmem_cache for fname with short
>> length, and resorting to kzalloc when fname needs larger buffer.
>> Assuming long file names are not very common.
>>
>> Profiling when searching files in kernel code base, with following
>> command:
>> # perf record -g -e cpu-clock --freq=max bash -c \
>> "for i in {1..100}; do find ./linux -name notfoundatall > /dev/null; done"
>> And using sample counts as indicator of performance improvement.
>
>I would think a better indicator of performance improvement would be
>to measure the system time when running the find commands. (i.e.,
>either using getrusage with RUSAGE_CHILDREN or wait3 or wait4).
I did use `time` to compare system time when searching files with find,
and I did see a slight improvement.
The standard deviation is quite high for the whole `find` process, though.
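
For reference, here is a minimal userspace sketch of the per-child measurement
suggested above, using wait4() to read the child's user and system CPU time.
It is not part of the patch. The helper names run_and_time() and
tv_to_seconds() are hypothetical, and the sketch assumes Linux/glibc, where
wait4() is available.

```c
#define _DEFAULT_SOURCE         /* assumption: glibc, exposes wait4() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Convert a struct timeval to seconds as a double. */
static double tv_to_seconds(struct timeval tv)
{
	return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/*
 * Hypothetical helper: run the command in argv, wait for it, and store
 * the child's user and system CPU time (in seconds) in *user_s and *sys_s.
 * Returns the child's exit status, or -1 if fork/wait4 failed or the
 * child did not exit normally.
 */
static int run_and_time(char *const argv[], double *user_s, double *sys_s)
{
	struct rusage ru;
	int status;
	pid_t pid;

	assert(argv != NULL && argv[0] != NULL);

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		execvp(argv[0], argv);
		_exit(127);     /* exec failed */
	}

	/* wait4() reports resource usage for this specific child. */
	if (wait4(pid, &status, 0, &ru) < 0)
		return -1;

	*user_s = tv_to_seconds(ru.ru_utime);
	*sys_s = tv_to_seconds(ru.ru_stime);
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_and_time() once per `find` run and collecting sys_s over many
runs would give a mean and standard deviation of system time per run, rather
than a single `time` figure for the whole loop.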
>
>We're trading off some extra memory usage and code complexity with
>less CPU time because entries in the kmem_cache might be more TLB
>friendly. But this is only really going to be applicable if the
>directory is large enough such that the cycles spent in readdir is
>significant compared to the rest of the userspace program, *and* you
>are reading the directory multiple times (e.g., calling find on a
>directory hierarchy many, many times) such that the disk blocks are
>cached and you don't need to read them from the storage device.
>Otherwise the I/O costs will completely dominate and swamp the
>marginal TLB cache savings.
Yes, the test was run cache-hot.
But searching files repeatedly is not an uncommon practice; `find` would run cache-hot
on every round except the first.
>
>Given that it's really rare for readdir() to be the bottleneck of many
>workloads, the question is, is it worth it?
That's the question I have been thinking about.
Besides the marginal improvement for readdir(), I would also argue from the impact on other parts of the system
when searching files. Even when cache-hot, searching a large directory involves a high frequency of
malloc() calls over a short interval. This might have a transient negative impact on others that also call malloc(), but at
a low frequency. But I don't have a convincing example of this; it's all theoretical.
Thanks
David
>
> - Ted