Message-ID: <20171018085711.GC1753@intel.com>
Date:   Wed, 18 Oct 2017 16:57:11 +0800
From:   Aaron Lu <aaron.lu@...el.com>
To:     Vlastimil Babka <vbabka@...e.cz>
Cc:     "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "tim.c.chen@...ux.intel.com" <tim.c.chen@...ux.intel.com>,
        "khandual@...ux.vnet.ibm.com" <khandual@...ux.vnet.ibm.com>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "ak@...ux.intel.com" <ak@...ux.intel.com>,
        "Wang, Kemi" <kemi.wang@...el.com>,
        "Hansen, Dave" <dave.hansen@...el.com>,
        "Huang, Ying" <ying.huang@...el.com>
Subject: Re: [PATCH] mm/page_alloc: make sure __rmqueue() etc. always inline

On Wed, Oct 18, 2017 at 08:28:56AM +0200, Vlastimil Babka wrote:
> On 10/18/2017 03:53 AM, Lu, Aaron wrote:
> > On Tue, 2017-10-17 at 13:32 +0200, Vlastimil Babka wrote:
> >> With gcc 7.2.1:
> >>> ./scripts/bloat-o-meter base.o mm/page_alloc.o
> >>
> >> add/remove: 1/2 grow/shrink: 2/0 up/down: 2493/-1649 (844)
> > 
> > Nice, it clearly shows 844 bytes of bloat.
> > 
> >> function                                     old     new   delta
> >> get_page_from_freelist                      2898    4937   +2039
> >> steal_suitable_fallback                        -     365    +365
> >> find_suitable_fallback                        31     120     +89
> >> find_suitable_fallback.part                  115       -    -115
> >> __rmqueue                                   1534       -   -1534
> 
> It also shows that steal_suitable_fallback() is no longer inlined, which
> is fine, because that should ideally be rarely executed.

Ah right, so this script is really good for analysing inline changes.
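
For anyone reading the table above for the first time, here is a minimal
Python sketch of how the summary line follows from the per-function rows.
It is not the bloat-o-meter implementation (the real script works on nm
output from the two object files); the ROWS list and summarize() helper
are made up for illustration, with the numbers copied from the quoted
table.

```python
# Rows copied from the bloat-o-meter table in this thread: (function, old, new).
# None stands for the "-" shown when a symbol is absent from that object file.
ROWS = [
    ("get_page_from_freelist", 2898, 4937),
    ("steal_suitable_fallback", None, 365),
    ("find_suitable_fallback", 31, 120),
    ("find_suitable_fallback.part", 115, None),
    ("__rmqueue", 1534, None),
]


def summarize(rows):
    """Return (added, removed, grown, shrunk, up, down) for the rows."""
    added = removed = grown = shrunk = up = down = 0
    for _name, old, new in rows:
        delta = (new or 0) - (old or 0)
        if old is None:
            added += 1        # symbol only exists in the new object
        elif new is None:
            removed += 1      # symbol only exists in the old object
        elif delta > 0:
            grown += 1
        elif delta < 0:
            shrunk += 1
        if delta > 0:
            up += delta       # total bytes gained
        else:
            down += delta     # total bytes lost (negative)
    return added, removed, grown, shrunk, up, down


added, removed, grown, shrunk, up, down = summarize(ROWS)
print(f"add/remove: {added}/{removed} grow/shrink: {grown}/{shrunk} "
      f"up/down: {up}/{down} ({up + down})")
```

A symbol that disappears from the "new" column while its caller grows
(__rmqueue and get_page_from_freelist here) is the usual sign that it got
inlined; one that newly appears (steal_suitable_fallback) is no longer
inlined.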

> 
> >>
> >>> [aaron@...onlu obj]$ size */*/vmlinux
> >>>     text     data      bss      dec     hex filename
> >>> 10342757  5903208 17723392 33969357 20654cd gcc-4.9.4/base/vmlinux
> >>> 10342757  5903208 17723392 33969357 20654cd gcc-4.9.4/head/vmlinux
> >>> 10332448  5836608 17715200 33884256 2050860 gcc-5.5.0/base/vmlinux
> >>> 10332448  5836608 17715200 33884256 2050860 gcc-5.5.0/head/vmlinux
> >>> 10094546  5836696 17715200 33646442 201676a gcc-6.4.0/base/vmlinux
> >>> 10094546  5836696 17715200 33646442 201676a gcc-6.4.0/head/vmlinux
> >>> 10018775  5828732 17715200 33562707 2002053 gcc-7.2.0/base/vmlinux
> >>> 10018775  5828732 17715200 33562707 2002053 gcc-7.2.0/head/vmlinux
> >>>
> >>> Text size for vmlinux is unchanged though, probably due to function
> >>> alignment.
> >>
> >> Yep that's useless to show. These differences do add up though, until
> >> they eventually cross the alignment boundary.
> > 
> > Agreed.
> > But since this is the hot path, the performance improvement might be
> > worth it.
> 
> I'd agree, so you can add
> 
> Acked-by: Vlastimil Babka <vbabka@...e.cz>

Thanks!
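
As a footnote to the alignment remark quoted above, a small Python sketch
of why an 844-byte growth can vanish from the reported text size and
still "add up" later. The 4096-byte alignment and the raw sizes are
purely hypothetical values picked for illustration, not taken from the
actual vmlinux layout; only the 844-byte delta comes from this thread.

```python
def align_up(size, alignment):
    """Round size up to the next multiple of alignment (a power of two)."""
    return (size + alignment - 1) & ~(alignment - 1)


ALIGN = 4096      # hypothetical alignment, for illustration only
RAW = 10000       # hypothetical unpadded size before the patch
GROWTH = 844      # net growth reported by bloat-o-meter in this thread

before = align_up(RAW, ALIGN)
after_one = align_up(RAW + GROWTH, ALIGN)
after_three = align_up(RAW + 3 * GROWTH, ALIGN)

# One such change is absorbed by the padding; three of them cross the boundary.
print(before, after_one, after_three)
```

With these made-up numbers the first two values are equal and the third
is one alignment step larger, which is the "until they eventually cross
the alignment boundary" effect.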
