Date: Wed, 10 Jan 2024 09:03:30 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'Stephen Rothwell' <sfr@...b.auug.org.au>, Linus Torvalds
	<torvalds@...ux-foundation.org>
CC: Jiri Slaby <jirislaby@...il.com>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, Andy Shevchenko
	<andriy.shevchenko@...ux.intel.com>, Andrew Morton
	<akpm@...ux-foundation.org>, "Matthew Wilcox (Oracle)" <willy@...radead.org>,
	Christoph Hellwig <hch@...radead.org>, "Jason A. Donenfeld" <Jason@...c4.com>
Subject: RE: [PATCH next v4 0/5] minmax: Relax type checks in min() and max().

From: Stephen Rothwell
> Sent: 10 January 2024 06:18
> 
> Hi Linus,
> 
> On Mon, 8 Jan 2024 13:11:12 -0800 Linus Torvalds <torvalds@...ux-foundation.org> wrote:
> >
> > Whee.
> 
> Yeah.
> 
> > On my machine, that patch makes an "allmodconfig" build go from
> >
> >     10:41 elapsed
> >
> > to
> >
> >      8:46 elapsed
> >
> > so that min/max type checking is almost 20% of the build time.
> >
> > Yeah, I think we need to get rid of it.
> >
> > Can somebody else confirm similar time differences? Or is it just me?
> 
> I was hopeful, but:
> 
> no patch:
> 
> $ /usr/bin/time make ARCH=x86_64 CROSS_COMPILE=x86_64-linux-gnu- -j140 -O -s
> 102460.07user 3710.56system 13:29.05elapsed 13122%CPU (0avgtext+0avgdata 4023168maxresident)k
> 304inputs+7917056outputs (1998673major+120730959minor)pagefaults 0swaps
> 
> with patch:
> 
> $ /usr/bin/time make ARCH=x86_64 CROSS_COMPILE=x86_64-linux-gnu- -j140 -O -s
> 99775.75user 3684.45system 13:12.89elapsed 13048%CPU (0avgtext+0avgdata 2217536maxresident)k
> 64inputs+7890304outputs (2104371major+119837267minor)pagefaults 0swaps

That looks like roughly 2700 out of 102000 (user), or about 2.6%.
I did some rebuilds changing only minmax.h and got just over 1%
from changing __types_ok() to always be 1.
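
As a quick check of the arithmetic, here is a small C sketch that recomputes the savings from the /usr/bin/time figures quoted above. The helper names are made up for illustration. It should give roughly 2.6% for user time and about 2% for elapsed time on Stephen's machine, against roughly 18% elapsed for Linus's 10:41 to 8:46.

```c
#include <assert.h>
#include <stdio.h>

/* Relative saving, in percent, when a cost drops from `before` to `after`. */
static double percent_saved(double before, double after)
{
    assert(before > 0.0);
    return 100.0 * (before - after) / before;
}

/* Convert the "MM:SS.ss" elapsed figure printed by /usr/bin/time to seconds. */
static double elapsed_seconds(int minutes, double seconds)
{
    return 60.0 * minutes + seconds;
}

/* Print the comparison for the numbers quoted in this thread. */
static void report_build_times(void)
{
    /* Stephen's -j140 build: user CPU seconds, then wall-clock time. */
    printf("user:            %.2f%%\n", percent_saved(102460.07, 99775.75));
    printf("elapsed:         %.2f%%\n",
           percent_saved(elapsed_seconds(13, 29.05),
                         elapsed_seconds(13, 12.89)));

    /* Linus's allmodconfig build: 10:41 without the patch, 8:46 with it. */
    printf("elapsed (Linus): %.2f%%\n",
           percent_saved(elapsed_seconds(10, 41.0),
                         elapsed_seconds(8, 46.0)));
}
```

Calling report_build_times() from a main() prints the three percentages.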

I did try a few other things and got some marginal improvements.
But I'm not compiling the code that has 4 nested calls.

One of the things that does blow up the expansion somewhat is the
'return constant for constant' path needed to avoid VLAs.
That generates two copies of the expansion.
A separate define for that would help a bit.
It doesn't matter much until you get nested min/max, where it will hurt.
The other slight annoyance is an extra __builtin_choose_expr()
needed for pointer types, because (void *)1 isn't constant.
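
To make the "two copies" point concrete, here is a much simplified sketch of that kind of macro. It is not the real minmax.h: all SKETCH_* names are made up, the type checks are left out, and the fixed _a/_b names stand in for the unique names a real implementation would need. Each argument ends up in the expansion five times (once in the constant test, twice in each branch), so nesting multiplies the text the compiler has to parse.

```c
#include <assert.h>
#include <string.h>

/* Stringify a fully macro-expanded expression so its length can be measured. */
#define STR(...) #__VA_ARGS__
#define XSTR(...) STR(__VA_ARGS__)

/* Integer-constant-expression test, a widely used GCC/Clang idiom. */
#define SKETCH_IS_CONSTEXPR(x) \
    (sizeof(int) == sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int *)8)))

/* Plain ternary: stays a constant expression for constant arguments. */
#define SKETCH_MIN_CONST(x, y) ((x) < (y) ? (x) : (y))

/* Statement expression: evaluates each argument exactly once. */
#define SKETCH_MIN_ONCE(x, y) ({ \
    __typeof__(x) _a = (x);      \
    __typeof__(y) _b = (y);      \
    _a < _b ? _a : _b; })

/* 'Constant for constant': both branches are expanded, one is selected. */
#define SKETCH_MIN(x, y)                                                    \
    __builtin_choose_expr(SKETCH_IS_CONSTEXPR(x) && SKETCH_IS_CONSTEXPR(y), \
                          SKETCH_MIN_CONST(x, y), SKETCH_MIN_ONCE(x, y))

/* Constant arguments select the ternary, so this is a fixed-size array. */
static size_t const_array_len(void)
{
    int slots[SKETCH_MIN(4, 8)];
    return sizeof(slots) / sizeof(slots[0]);
}

static int min_of(int a, int b)
{
    return SKETCH_MIN(a, b);
}

static int min_of_three(int a, int b, int c)
{
    return SKETCH_MIN(SKETCH_MIN(a, b), c);
}

/* Length of the expanded source text for one call and for a nested call. */
static size_t single_len(void)
{
    return strlen(XSTR(SKETCH_MIN(a, b)));
}

static size_t nested_len(void)
{
    return strlen(XSTR(SKETCH_MIN(SKETCH_MIN(a, b), c)));
}

/* One level of nesting already makes the text several times longer. */
static void check_expansion_growth(void)
{
    assert(nested_len() > 4 * single_len());
}
```

This needs GCC or Clang for the statement expressions and __builtin_choose_expr(). Moving the constant branch into a separate macro, as suggested above, would remove two of the five copies of each argument.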

min3() was mentioned, but that seems to be a nested expansion.
It would need to be more like clamp() to get any benefit.
(And maybe drop the const-for-const option.)
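
A rough sketch of the difference, assuming "more like clamp()" means one flat expansion that captures all three arguments once. The SKETCH_* names are hypothetical, and both the type checks and the const-for-const path are left out. The nested form repeats the whole inner expansion wherever the outer macro uses its first argument; the flat form mentions each argument only twice.

```c
#include <assert.h>
#include <string.h>

/* Stringify a fully macro-expanded expression so its length can be measured. */
#define STR(...) #__VA_ARGS__
#define XSTR(...) STR(__VA_ARGS__)

/* Hypothetical two-argument min that evaluates each argument once. */
#define SKETCH_MIN(x, y) ({ \
    __typeof__(x) _a = (x); \
    __typeof__(y) _b = (y); \
    _a < _b ? _a : _b; })

/* Nested form: the inner expansion is pasted in twice by the outer one. */
#define SKETCH_MIN3_NESTED(x, y, z) SKETCH_MIN(SKETCH_MIN(x, y), z)

/* Flat form: three temporaries, one statement expression. */
#define SKETCH_MIN3_FLAT(x, y, z) ({                    \
    __typeof__(x) _a = (x);                             \
    __typeof__(y) _b = (y);                             \
    __typeof__(z) _c = (z);                             \
    _a < _b ? (_a < _c ? _a : _c) : (_b < _c ? _b : _c); })

static int min3_nested(int a, int b, int c)
{
    return SKETCH_MIN3_NESTED(a, b, c);
}

static int min3_flat(int a, int b, int c)
{
    return SKETCH_MIN3_FLAT(a, b, c);
}

/* Length of the expanded source text for each form. */
static size_t nested_len(void)
{
    return strlen(XSTR(SKETCH_MIN3_NESTED(a, b, c)));
}

static size_t flat_len(void)
{
    return strlen(XSTR(SKETCH_MIN3_FLAT(a, b, c)));
}

/* Both forms agree on the result; only the amount of text differs. */
static void check_min3_sketch(void)
{
    assert(min3_nested(9, 2, 5) == min3_flat(9, 2, 5));
    assert(nested_len() > flat_len());
}
```

With single-letter arguments the gap is modest. It widens quickly once the arguments are themselves long expressions or further min()/max() calls.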

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

