Date:	Mon, 16 Jun 2014 16:54:32 -0700 (PDT)
From:	David Rientjes <rientjes@...gle.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
cc:	Michal Nazarewicz <mina86@...a86.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Hagen Paul Pfeifer <hagen@...u.net>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] include: kernel.h: rewrite min3, max3 and clamp using
 min and max

On Mon, 16 Jun 2014, Andrew Morton wrote:

> On Mon, 16 Jun 2014 16:25:15 -0700 (PDT) David Rientjes <rientjes@...gle.com> wrote:
> 
> > On Mon, 16 Jun 2014, Andrew Morton wrote:
> > 
> > > > It appears that gcc is better at optimising a double call to min
> > > > and max rather than open coded min3 and max3.  This can be observed
> > > > here:
> > > > 
> > > > ...
> > > >
> > > > Furthermore, after 'make allmodconfig && make bzImage modules' this is the
> > > > comparison of image and module sizes:
> > > > 
> > > >     # Without this patch applied
> > > >     $ ls -l arch/x86/boot/bzImage **/*.ko |awk '{size += $5} END {print size}'
> > > >     350715800
> > > > 
> > > >     # With this patch applied
> > > >     $ ls -l arch/x86/boot/bzImage **/*.ko |awk '{size += $5} END {print size}'
> > > >     349856528
> > > 
> > > We saved nearly a megabyte by optimising min3(), max3() and clamp()? 
> > > 
> > > I'm counting a grand total of 182 callsites for those macros.  So the
> > > saving is 4700 bytes per invocation?  I don't believe it...
> > > 
> > 
> > I was checking just the instances of min3() in mm/ and gcc ends up 
> > inlining transfer_objects() in mm/slab.c as a result of this change and 
> > increases its text size:
> > 
> >    text	   data	    bss	    dec	    hex	filename
> >   28369	  21559	      4	  49932	   c30c	slab.o.before
> >   28399	  21559	      4	  49962	   c32a	slab.o.after
> 
> Maybe that's a good thing in disguise: gcc said "hey this thing is now
> small enough to inline it".
> 

On linux-next, allyesconfig shows a savings of about 0.0001% (137 bytes 
of text) as a result of the patch, but I'd be worried about the extra 
temp variable it allocates on the stack, which is evident in the 
mm/slab.c disassembly, unless all callers can be audited to show that 
we're not potentially deep in the stack.

     text      data       bss        dec      hex  filename
108573045  23488016  51580928  183641989  af22785  vmlinux.before
108572908  23488016  51580928  183641852  af226fc  vmlinux.after
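
For anyone wanting to look at the codegen themselves, the two shapes of 
min3() being compared can be sketched roughly as below.  This is only an 
illustration, not the kernel.h definitions or the patch itself: the 
sketch_* names and the two wrapper functions are hypothetical, the 
kernel's pointer-comparison type check is left out, and it assumes a 
compiler with GNU C statement expressions and typeof (gcc or clang).

```c
#include <assert.h>

/*
 * Simplified stand-ins for the macros discussed in this thread.
 * Assumption: these are NOT the real include/linux/kernel.h macros;
 * the type check is omitted and all names are hypothetical.
 */
#define sketch_min(x, y) ({				\
	__typeof__(x) _a = (x);				\
	__typeof__(y) _b = (y);				\
	_a < _b ? _a : _b; })

/* Open-coded form: three temporaries and a nested conditional. */
#define sketch_min3_open(x, y, z) ({			\
	__typeof__(x) _m1 = (x);			\
	__typeof__(y) _m2 = (y);			\
	__typeof__(z) _m3 = (z);			\
	_m1 < _m2 ? (_m1 < _m3 ? _m1 : _m3)		\
		  : (_m2 < _m3 ? _m2 : _m3); })

/* Nested form: the minimum of min(x, y) and z. */
#define sketch_min3_nested(x, y, z) \
	sketch_min((__typeof__(x))sketch_min(x, y), z)

/* Non-inline wrappers so each form gets its own symbol to disassemble. */
int min3_open(int a, int b, int c)
{
	return sketch_min3_open(a, b, c);
}

int min3_nested(int a, int b, int c)
{
	return sketch_min3_nested(a, b, c);
}

/* Sanity check: both forms agree on every triple from a small set. */
void check_min3_forms(void)
{
	static const int v[] = { -5, 0, 3, 3, 9 };
	const int n = sizeof(v) / sizeof(v[0]);

	for (int i = 0; i < n; i++)
		for (int j = 0; j < n; j++)
			for (int k = 0; k < n; k++)
				assert(min3_open(v[i], v[j], v[k]) ==
				       min3_nested(v[i], v[j], v[k]));
}
```

Building this with something like "gcc -O2 -S" and comparing the two 
wrapper functions is one way to see whether the temporaries differ; the 
result will depend on compiler version and flags, and says nothing by 
itself about stack usage at real call sites.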