[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87o89gw4yy.fsf@protonmail.com>
Date: Sun, 29 Aug 2021 17:11:56 +0000
From: "Caleb D.S. Brzezinski" <calebdsb@...tonmail.com>
To: Al Viro <viro@...iv.linux.org.uk>
Cc: hirofumi@...l.parknet.co.jp, linux-kernel@...r.kernel.org,
linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH 2/3] fat: add the msdos_format_name() filename cache
Hi Al,
"Al Viro" <viro@...iv.linux.org.uk> writes:
> On Sun, Aug 29, 2021 at 02:25:29PM +0000, Caleb D.S. Brzezinski wrote:
>> Implement the main msdos_format_name() filename cache. If used as a
>> module, all memory allocated for the cache is freed when the module is
>> de-registered.
>>
>> Signed-off-by: Caleb D.S. Brzezinski <calebdsb@...tonmail.com>
>> ---
>> fs/fat/namei_msdos.c | 35 +++++++++++++++++++++++++++++++++++
>> 1 file changed, 35 insertions(+)
>>
>> diff --git a/fs/fat/namei_msdos.c b/fs/fat/namei_msdos.c
>> index 7561674b1..f9d4f63c3 100644
>> --- a/fs/fat/namei_msdos.c
>> +++ b/fs/fat/namei_msdos.c
>> @@ -124,6 +124,16 @@ static int msdos_format_name(const unsigned char *name, int len,
>> unsigned char *walk;
>> unsigned char c;
>> int space;
>> + u64 hash;
>> + struct msdos_name_node *node;
>> +
>> + /* check if the name is already in the cache */
>> +
>> + hash = msdos_fname_hash(name);
>> + if (find_fname_in_cache(res, hash))
>> + return 0;
> Huh? How could that possibly work, seeing that
> * your hash function only looks at the first 8 characters
My understanding was that the maximum length of the name considered when
passed to msdos_format_name() was eight characters; see:
while (walk - res < 8)
and
for (walk = res; len && walk - res < 8; walk++) {
If that's an incorrect understanding, then yes, it definitely wouldn't
work. A larger, more computationally intensive hash function would be
required, which would most likely cancel out the improved lookup from
the cache.
> * your find_fname_in_cache() assumes that hash collisions
> are impossible, which is... unlikely, considering the nature of
> that hash function
If the names are 8 character limited, then logically any name with the
exact same set of characters would "collide" into the same formatted
name. Again, if I misunderstood the constraints on the filenames, then
yes, this is unnecessary.
> Out of curiosity, how have you tested that thing?
I've used it on my own FAT32 drives for profiling, run it through
kmemleak, ksan, some stress tests, etc. for a few weeks. Like I said, I
benchmarked it and it shaved about 0.2ms of time off my most common use
case.
Thanks.
Caleb B.
--
"Come now, and let us reason together," Says the LORD
-- Isaiah 1:18a, NASB
Powered by blists - more mailing lists