Message-ID: <d9c73675-1060-fd8b-958f-50793dca4db4@nathanm.com>
Date: Thu, 27 Oct 2022 03:29:07 +0000
From: Nathan Moinvaziri <nathan@...hanm.com>
To: Andy Shevchenko <andy.shevchenko@...il.com>
CC: Andy Shevchenko <andy@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] lib/string.c: Improve strcasecmp speed by not lowering if
chars match
On 10/25/2022 12:19 PM, Andy Shevchenko wrote:
> Looks promising, but may I suggest a few things:
> 1) have you considered the word-at-a-time use (like strscpy() does)?
Only briefly. I tried it at the beginning of the function to check
whether the two strings were identical, but the added check hurt
performance for strings that were not identical.
On 10/25/2022 12:19 PM, Andy Shevchenko wrote:
> 2) instead of using tolower() on both sides, have you considered
> (with the above in mind) to use XOR over words and if they are not 0,
> check if the result is one of possible combinations of 0x20 and then
> by excluding the non-letters from the range you may find the
> difference?
I'm not sure what you mean about the possible combinations of the space
character. I have not investigated this method.
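One possible reading of the quoted suggestion, sketched here a byte at a
time rather than a word at a time: ASCII upper- and lower-case letters
differ only in bit 0x20, so the XOR of two bytes is either 0 (identical),
0x20 (possibly the same letter in different case), or something else (a
real difference). The helper names below are hypothetical, and this is
only an illustration of the idea, not a tested kernel implementation.

```c
/* Hypothetical helper: nonzero if c is an ASCII letter (A-Z or a-z). */
static int is_ascii_alpha(unsigned char c)
{
	c |= 0x20;	/* fold upper case onto lower case */
	return c >= 'a' && c <= 'z';
}

/* Hypothetical helper: nonzero if a and b match ignoring ASCII case. */
static int bytes_equal_nocase(unsigned char a, unsigned char b)
{
	unsigned char diff = a ^ b;

	if (diff == 0)
		return 1;	/* identical bytes */
	if (diff != 0x20)
		return 0;	/* differ in more than the case bit */
	/* Only the case bit differs; that counts only for letters. */
	return is_ascii_alpha(a);
}
```

The range check is what excludes non-letter pairs such as '@' and '`',
which also differ only in bit 0x20.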
...
According to my previous findings, the check for c1 != c2 performs
better for strings that are at least 25% the same. I was able to get
even more performance out of it by changing tolower() to use a
different lookup table than the one used for the is*() functions. By
using a pre-generated lookup table for both islower() and isupper(), it
is possible to remove the branch wherever those functions are used,
including in strcasecmp(). I've seen this method employed in the Android
code base and also in cURL. Using it would add an additional 2x256 bytes
to the code size for the tables.
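A minimal sketch of the two ideas combined, assuming a 256-entry table
that maps each byte to its ASCII lower-case form. The names lower_table,
init_lower_table() and strcasecmp_sketch() are hypothetical, and the
table is filled at run time here only to keep the example short; a
pre-generated table would be a static const array.

```c
/* Hypothetical table: byte value -> ASCII lower-case byte value. */
static unsigned char lower_table[256];

/* Fill the table once; a pre-generated const array would avoid this. */
static void init_lower_table(void)
{
	int i;

	for (i = 0; i < 256; i++)
		lower_table[i] = (unsigned char)
			((i >= 'A' && i <= 'Z') ? i + ('a' - 'A') : i);
}

static int strcasecmp_sketch(const char *s1, const char *s2)
{
	unsigned char c1, c2;

	do {
		c1 = (unsigned char)*s1++;
		c2 = (unsigned char)*s2++;
		/* Only lower the characters when they differ. */
		if (c1 != c2) {
			c1 = lower_table[c1];	/* table lookup, no branch */
			c2 = lower_table[c2];
			if (c1 != c2)
				break;
		}
	} while (c1);

	return (int)c1 - (int)c2;
}
```

With this shape, bytes that already match skip the table lookups
entirely, and the lookup itself has no isupper()-style branch.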
I've put together a Quick Benchmark that shows the comparison between
the different methods:
https://quick-bench.com/q/l5DkYQO-CcMxQUu5MjZiqZ8M-Y0
Nathan