Message-ID: <BYAPR06MB557390C5F741300BDE2BAA8BD8319@BYAPR06MB5573.namprd06.prod.outlook.com>
Date:   Tue, 25 Oct 2022 17:53:49 +0000
From:   Nathan Moinvaziri <nathan@...hanm.com>
To:     Andy Shevchenko <andy@...nel.org>
CC:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH] lib/string.c: Improve strcasecmp speed by not lowering if
 chars match

Hi Andy,

I appreciate your quick feedback!

I have done as you suggested and published my results, this time using Google Benchmark:
https://github.com/nmoinvaz/strcasecmp

After you review it, if you still think the patch is worthwhile, I can fix the other problems you mentioned in the original patch. If you think it is not worth it, I understand.

Thanks again,
Nathan

-----Original Message-----
From: Andy Shevchenko <andy@...nel.org> 
Sent: Tuesday, October 25, 2022 2:04 AM
To: Nathan Moinvaziri <nathan@...hanm.com>
Cc: linux-kernel@...r.kernel.org
Subject: Re: [PATCH] lib/string.c: Improve strcasecmp speed by not lowering if chars match

On Tue, Oct 25, 2022 at 11:00:36AM +0300, Andy Shevchenko wrote:
> On Tue, Oct 25, 2022 at 4:46 AM Nathan Moinvaziri <nathan@...hanm.com> wrote:

...

> > When running tests using Quick Benchmark with two matching
> > 256-character strings, these changes result in a ~6-9x speed improvement.
> >
> > * We use unsigned char instead of int similar to strncasecmp.
> > * We only subtract c1 - c2 when they are not equal.
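
A minimal userspace sketch of the idea in the two bullets above, for readers who want to try it outside the kernel. This is not the patch itself: the names baseline_strcasecmp and sketch_strcasecmp are hypothetical, and it assumes the standard tolower() from <ctype.h> rather than the kernel's own helper. The first function lowers both characters on every iteration; the second reads unsigned chars and only lowers and subtracts when the raw bytes differ.

```c
#include <ctype.h>

/* Baseline shape: lower both characters on every iteration.
 * (Hypothetical name; illustrative only.) */
static int baseline_strcasecmp(const char *s1, const char *s2)
{
	int c1, c2;

	do {
		c1 = tolower((unsigned char)*s1++);
		c2 = tolower((unsigned char)*s2++);
	} while (c1 == c2 && c1 != 0);
	return c1 - c2;
}

/* Sketch of the described change: compare raw bytes first and only
 * call tolower() and subtract when they differ.
 * (Hypothetical name; an assumption about the approach, not the patch.) */
static int sketch_strcasecmp(const char *s1, const char *s2)
{
	unsigned char c1, c2;

	do {
		c1 = (unsigned char)*s1++;
		c2 = (unsigned char)*s2++;
		if (c1 != c2) {
			/* Slow path: the bytes differ, so fold case and recheck. */
			c1 = (unsigned char)tolower(c1);
			c2 = (unsigned char)tolower(c2);
			if (c1 != c2)
				return (int)c1 - (int)c2;
		}
	} while (c1 != 0);
	return 0;
}
```

Both functions should agree on the sign of the result for the same inputs. Whether the second is actually faster depends on the data set and the machine, which is the question raised in the review below.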

...

> You tell us that this is more performant, but have not provided the 
> numbers. Can we see those, please?

So, I have read it carefully and see the reference to some Quick Benchmark that I have no idea about. What I meant here is to have numbers provided by an (open
source) tool (maybe even an in-kernel test case) that anybody can run on their machines. You also left out details about how you ran it, what data set was used, etc.

> Note that you basically trash CPU cache lines when characters are not 
> equal, and before doing that you have a branch. I'm unsure that 
> your way is more performant than the original one.

--
With Best Regards,
Andy Shevchenko

