Date:   Thu, 27 Oct 2022 03:29:07 +0000
From:   Nathan Moinvaziri <nathan@...hanm.com>
To:     Andy Shevchenko <andy.shevchenko@...il.com>
CC:     Andy Shevchenko <andy@...nel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] lib/string.c: Improve strcasecmp speed by not lowering if
 chars match

On 10/25/2022 12:19 PM, Andy Shevchenko wrote:
> Looks promising, but may I suggest a few things:
> 1) have you considered the word-at-a-time use (like strscpy() does)?

Only briefly. I tried it at the beginning of the function to check
whether the two strings were identical, but the added check hurt
performance for strings that were not identical.

On 10/25/2022 12:19 PM, Andy Shevchenko wrote:

> 2) instead of using tolower() on both sides,  have you considered
> (with the above in mind) to use XOR over words and if they are not 0,
> check if the result is one of possible combinations of 0x20 and then
> by excluding the non-letters from the range you may find the
> difference?

I'm not sure what you mean about the possible combinations of the space 
character. I have not investigated this method.
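
For reference, one possible reading of the suggestion, sketched one byte
at a time rather than word-at-a-time: in ASCII, upper and lower case
letters differ only in bit 0x20, so the XOR of two bytes that match
ignoring case is either 0 or 0x20. The helper name ascii_eq_nocase() is
hypothetical and this is not code from the patch; it only illustrates
why non-letters have to be excluded after the XOR test.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not from the patch: returns true when two bytes
 * are equal ignoring ASCII case.
 */
static bool ascii_eq_nocase(unsigned char a, unsigned char b)
{
	unsigned char diff = a ^ b;

	if (diff == 0)
		return true;	/* identical bytes */
	if (diff != 0x20)
		return false;	/* differ in more than the case bit */

	/* Only bit 0x20 differs: equal only if the byte is an ASCII letter. */
	a |= 0x20;		/* fold to lower case */
	return a >= 'a' && a <= 'z';
}
```

Without the final range check, pairs such as '@' and '`' would be
treated as equal, because they also differ only in bit 0x20.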

...

According to my previous findings, the check for c1 != c2 does perform
better for strings that are at least 25% the same. I was able to get
even more performance out of it by changing tolower() to use a different
lookup table than the one used for the is*() functions. By using a
pre-generated lookup table for both islower() and isupper(), it is
possible to remove the branch wherever those functions are used,
including in strcasecmp. I've seen this method employed in the Android
code base and also in cURL. Using it would add an additional 2x256 bytes
to the code size for the tables.
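
A minimal sketch of the two ideas combined, assuming a 256-entry table
indexed by the byte value. The names lower_tbl, lower_tbl_init() and
strcasecmp_sketch() are hypothetical and this is not the code from the
patch. A real implementation would most likely ship the table as a
pre-generated static const array; it is filled at run time here only to
keep the example short.

```c
/* Hypothetical 256-byte table mapping each byte to its lower-case form. */
static unsigned char lower_tbl[256];

static void lower_tbl_init(void)
{
	for (int i = 0; i < 256; i++)
		lower_tbl[i] = (i >= 'A' && i <= 'Z') ? i + ('a' - 'A') : i;
}

static int strcasecmp_sketch(const char *s1, const char *s2)
{
	unsigned char c1, c2;

	do {
		c1 = (unsigned char)*s1++;
		c2 = (unsigned char)*s2++;
		if (c1 != c2) {
			/* Only pay for the lookups when the raw bytes differ. */
			c1 = lower_tbl[c1];
			c2 = lower_tbl[c2];
			if (c1 != c2)
				break;
		}
	} while (c1);

	return (int)c1 - (int)c2;
}
```

The table lookup replaces the isupper() test inside tolower(), so the
only branch left in the loop body is the c1 != c2 comparison itself.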

I've put together a Quick Benchmark that shows the comparison between 
the different methods:

https://quick-bench.com/q/l5DkYQO-CcMxQUu5MjZiqZ8M-Y0

Nathan


