linux-kernel - Re: [PATCH] lib/string.c: Improve strcasecmp speed by not lowering if chars match

Message-ID: <d9c73675-1060-fd8b-958f-50793dca4db4@nathanm.com>
Date:   Thu, 27 Oct 2022 03:29:07 +0000
From:   Nathan Moinvaziri <nathan@...hanm.com>
To:     Andy Shevchenko <andy.shevchenko@...il.com>
CC:     Andy Shevchenko <andy@...nel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] lib/string.c: Improve strcasecmp speed by not lowering if
 chars match

On 10/25/2022 12:19 PM, Andy Shevchenko wrote:
> Looks promising, but may I suggest a few things:
> 1) have you considered the word-at-a-time use (like strscpy() does)?

Only briefly. I tried it at the beginning of the function to check
whether the two strings were identical, but the added check hurt
performance for strings that were not identical.

On 10/25/2022 12:19 PM, Andy Shevchenko wrote:

> 2) instead of using tolower() on both sides,  have you considered
> (with the above in mind) to use XOR over words and if they are not 0,
> check if the result is one of possible combinations of 0x20 and then
> by excluding the non-letters from the range you may find the
> difference?

I'm not sure what you mean about the possible combinations of the space 
character. I have not investigated this method.
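
For readers following the thread: 0x20 is also the single bit that
separates upper case from lower case ASCII letters ('A' is 0x41, 'a' is
0x61). A minimal per-byte sketch of that idea follows, assuming ASCII
input. The helper name differs_only_in_case() is hypothetical; this is
not code from the patch and it is not the word-at-a-time variant that
was suggested.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns true when a and b are the same ASCII letter in different case.
 */
static bool differs_only_in_case(unsigned char a, unsigned char b)
{
	unsigned char folded;

	/* Bytes that differ only in case XOR to exactly the case bit. */
	if ((a ^ b) != 0x20)
		return false;

	/* Exclude non-letters such as '@' vs '`' or '[' vs '{'. */
	folded = a | 0x20;
	return folded >= 'a' && folded <= 'z';
}
```

The range check matters because several punctuation pairs also differ
by 0x20 without being letters.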

...

According to my previous findings, the check for c1 != c2 performs
better for strings that are at least 25% the same. I was able to get
even more performance out of it by changing tolower() to use a
different lookup table than the one used for the is*() functions. By
using a pre-generated lookup table for both islower() and isupper() it
is possible to remove the branch wherever those functions are used,
including in strcasecmp. I've seen this method employed in the Android
code base and also in cURL. Using it would add 2x256 bytes to the code
size for the tables.
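
As a rough illustration of the two ideas combined (lowering only when
the characters differ, and lowering through a 256-entry table instead
of a branch), here is a minimal sketch. It assumes ASCII-only case
folding. The names lower_table, init_lower_table() and
strcasecmp_table() are hypothetical, and the table is filled at run
time only to keep the example short; a pre-generated version would be
a static const array. This is not the code from the patch or the
benchmark.

```c
#include <stddef.h>

/* Hypothetical 256-entry lowering table (ASCII letters only). */
static unsigned char lower_table[256];

/* Fill the table once; a pre-generated table would not need this. */
static void init_lower_table(void)
{
	int i;

	for (i = 0; i < 256; i++)
		lower_table[i] = (i >= 'A' && i <= 'Z') ?
			(unsigned char)(i + 0x20) : (unsigned char)i;
}

static int strcasecmp_table(const char *s1, const char *s2)
{
	unsigned char c1, c2;

	do {
		c1 = (unsigned char)*s1++;
		c2 = (unsigned char)*s2++;
		/* Only pay for the table lookups when the raw bytes differ. */
		if (c1 != c2) {
			c1 = lower_table[c1];
			c2 = lower_table[c2];
			if (c1 != c2)
				break;
		}
	} while (c1);

	return (int)c1 - (int)c2;
}
```

init_lower_table() must be called once before the first comparison.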

I've put together a Quick Benchmark that shows the comparison between 
the different methods:

https://quick-bench.com/q/l5DkYQO-CcMxQUu5MjZiqZ8M-Y0

Nathan