Message-ID: <20150423163039.GB2449@suse.de>
Date:	Thu, 23 Apr 2015 17:30:39 +0100
From:	Mel Gorman <mgorman@...e.de>
To:	Daniel J Blueman <daniel@...ascale.com>
Cc:	Linux-MM <linux-mm@...ck.org>, Nathan Zimmer <nzimmer@....com>,
	Dave Hansen <dave.hansen@...el.com>,
	Waiman Long <waiman.long@...com>,
	Scott Norton <scott.norton@...com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	LKML <linux-kernel@...r.kernel.org>,
	'Steffen Persvold' <sp@...ascale.com>
Subject: Re: [PATCH 0/13] Parallel struct page initialisation v3

On Thu, Apr 23, 2015 at 11:53:57PM +0800, Daniel J Blueman wrote:
> On Thu, Apr 23, 2015 at 6:33 PM, Mel Gorman <mgorman@...e.de> wrote:
> >The big change here is an adjustment to the topology_init path that caused
> >soft lockups on Waiman and Daniel Blue had reported it was an expensive
> >function.
> >
> >Changelog since v2
> >o Reduce overhead of topology_init
> >o Remove boot-time kernel parameter to enable/disable
> >o Enable on UMA
> >
> >Changelog since v1
> >o Always initialise low zones
> >o Typo corrections
> >o Rename parallel mem init to parallel struct page init
> >o Rebase to 4.0
> []
> 
> Splendid work! On this 256c setup, topology_init now takes 185ms.
> 
> This brings the kernel boot time down to 324s [1].

Good stuff. Am I correct in thinking that the vanilla kernel takes 732s?

> It turns out that
> one memset is responsible for most of the time setting up the
> PUDs and PMDs; adapting memset to using non-temporal writes [3]
> avoids generating RMW cycles, bringing boot time down to 186s [2].
> 
> If this is a possibility, I can split this patch and map other
> arch's memset_nocache to memset, or change the callsite as
> preferred; comments welcome.
> 

In general, I see no problem with the patch, and it would be useful going in
either before or after this series. I would suggest you split this into
three patches. The first would be an asm-generic alias of memset_nocache
to memset, with documentation saying it's optional for an architecture to
implement. The second would be your implementation for x86, which needs to
go to the x86 maintainers. The third would then be the memblock.c change.
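
As a rough illustration of the first of those patches, the generic fallback
might look something like the sketch below. It is written as ordinary
user-space C so it can be compiled on its own. The macro form, the
"#ifndef" override convention, and the zero_table() caller are assumptions
made for the example; they are not taken from the patch under discussion.

```c
#include <stddef.h>
#include <string.h>

/*
 * Generic fallback (assumed arrangement): an architecture that has a
 * non-temporal implementation would define memset_nocache before this
 * point. Implementing it is optional; when nothing is defined, the name
 * simply aliases the ordinary memset.
 */
#ifndef memset_nocache
#define memset_nocache(dst, c, n) memset((dst), (c), (n))
#endif

/* Hypothetical caller: clear a large table that will not be read soon. */
static void *zero_table(void *table, size_t bytes)
{
	return memset_nocache(table, 0, bytes);
}
```

With a fallback of this shape, a callsite can use memset_nocache
unconditionally, and architectures without a specialised version keep
their existing memset behaviour.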

Thanks.

-- 
Mel Gorman
SUSE Labs