Date:	Mon, 12 Aug 2013 16:54:35 -0500
From:	Nathan Zimmer <nzimmer@....com>
To:	hpa@...or.com, mingo@...nel.org
Cc:	linux-kernel@...r.kernel.org, linux-mm@...ck.org, holt@....com,
	nzimmer@....com, rob@...dley.net, travis@....com,
	daniel@...ascale-asia.com, akpm@...ux-foundation.org,
	gregkh@...uxfoundation.org, yinghai@...nel.org, mgorman@...e.de
Subject: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

We are still restricting ourselves to 2MiB initialization chunks.
This was initially to keep the patch set a little smaller and clearer.
However, given how well it is currently performing, I don't see how much
better it could do with 2GiB chunks.

As for extra overhead, we incur an extra function call to
ensure_page_is_initialized, but that is only really expensive when we find
uninitialized pages; otherwise it is a flag check once every PTRS_PER_PMD
pages.  A rough sketch of that check pattern is below.
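As an illustration of that pattern (a userspace analogy only, not the
actual patch; everything here other than the two function names from this
series is invented for the example):

  /* Userspace sketch of "one flag check per PTRS_PER_PMD pages", NOT the
   * real kernel code: one "uninitialized" flag per PMD-sized block of
   * struct pages.  The common case is a single flag test; the expensive
   * path only runs the first time a block is touched. */
  #include <stdbool.h>
  #include <stdio.h>
  #include <string.h>

  #define PTRS_PER_PMD 512                 /* pages per 2MiB block on x86_64 */
  #define NR_PAGES     (PTRS_PER_PMD * 4)

  struct page { unsigned long flags; };    /* stand-in for the real struct page */
  static struct page page_array[NR_PAGES];
  static bool block_uninitialized[NR_PAGES / PTRS_PER_PMD];

  /* Expensive path: initialize every page in the PMD-sized block. */
  static void __expand_page_initialization(unsigned long block)
  {
          unsigned long i, start = block * PTRS_PER_PMD;

          for (i = 0; i < PTRS_PER_PMD; i++)
                  page_array[start + i].flags = 0;   /* "initialize" the page */
          block_uninitialized[block] = false;
  }

  /* Cheap common case: one flag check per PTRS_PER_PMD pages. */
  static void ensure_page_is_initialized(unsigned long pfn)
  {
          unsigned long block = pfn / PTRS_PER_PMD;

          if (block_uninitialized[block])
                  __expand_page_initialization(block);
  }

  int main(void)
  {
          memset(block_uninitialized, 1, sizeof(block_uninitialized));
          ensure_page_is_initialized(0);    /* first touch: expands the block */
          ensure_page_is_initialized(1);    /* later touches: just a flag test */
          printf("block 0 initialized: %d\n", !block_uninitialized[0]);
          return 0;
  }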

To get a better feel for the cost we ran two quick tests.

The first was simply timing some memhog runs.  This showed no measurable
difference, so we made a more granular test: spawn N threads, start a
timer, have each thread malloc totalmem/N and then write to its memory to
induce page faults, then stop the timer.  In this case each thread had
just under 4GB of RAM to fault in.  This showed a measurable difference in
the page faulting: the baseline took an average of 2.68 seconds and the
new version an average of 2.75 seconds, which is 0.07s (2.6%) slower.
A sketch of the test is below.  Are there some other tests I should run?
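For concreteness, the granular test was roughly of the following shape (a
sketch under assumptions, not the exact program we ran; NTHREADS is
illustrative and the barrier is just one way to start the timer after the
threads are spawned):

  #include <pthread.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <time.h>
  #include <unistd.h>

  #define NTHREADS   64                    /* illustrative thread count */
  #define PER_THREAD (4UL << 30)           /* ~4GB per thread, as in the test */

  static pthread_barrier_t barrier;

  static void *fault_in(void *arg)
  {
          long page = sysconf(_SC_PAGESIZE);
          unsigned long off;
          char *buf;

          pthread_barrier_wait(&barrier);  /* wait until the timer is running */
          buf = malloc(PER_THREAD);
          if (!buf)
                  return NULL;
          for (off = 0; off < PER_THREAD; off += page)
                  buf[off] = 1;            /* touch every page to force a fault */
          return buf;
  }

  int main(void)
  {
          pthread_t tid[NTHREADS];
          struct timespec t0, t1;
          int i;

          pthread_barrier_init(&barrier, NULL, NTHREADS + 1);
          for (i = 0; i < NTHREADS; i++)
                  pthread_create(&tid[i], NULL, fault_in, NULL);

          clock_gettime(CLOCK_MONOTONIC, &t0);   /* start the timer ...      */
          pthread_barrier_wait(&barrier);        /* ... then release threads */
          for (i = 0; i < NTHREADS; i++)
                  pthread_join(tid[i], NULL);
          clock_gettime(CLOCK_MONOTONIC, &t1);   /* stop the timer           */

          printf("fault-in: %.2f seconds\n",
                 (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9);
          return 0;
  }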

With this patch set we did boot a 16TiB machine.  The two main areas that
benefit from this patch set are free_all_bootmem and memmap_init_zone.
Without the patches they took 407 seconds and 1151 seconds respectively;
with the patches they took 13 and 39 seconds, a total savings of 1506
seconds (about 25 minutes).  These times were acquired using a modified
version of script(1) which records the time in microseconds at the
beginning of each line of output.
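For reference, a minimal stand-in for such a timing wrapper (not the
actual modified script(1)) could be as simple as prefixing each line it
reads with the elapsed microseconds:

  /* Prefix every line of stdin with microseconds since start. */
  #include <stdio.h>
  #include <sys/time.h>

  int main(void)
  {
          struct timeval start, now;
          char line[4096];

          gettimeofday(&start, NULL);
          while (fgets(line, sizeof(line), stdin)) {
                  gettimeofday(&now, NULL);
                  printf("[%10ld] %s",
                         (now.tv_sec - start.tv_sec) * 1000000L +
                         (now.tv_usec - start.tv_usec), line);
          }
          return 0;
  }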

Overall I am fairly happy with the patch set at the moment.  It improves boot
times without noticeable runtime overhead.
I am, as always, open to suggestions.

v2: included Yinghai's suggestion to not set the reserved bit until later.

v3: Corrected my first attempt at moving the reserved bit.
__expand_page_initialization should only be called by ensure_pages_are_initialized

Nathan Zimmer (1):
  Only set page reserved in the memblock region

Robin Holt (4):
  memblock: Introduce a for_each_reserved_mem_region iterator.
  Have __free_pages_memory() free in larger chunks.
  Move page initialization into a separate function.
  Sparse initialization of struct page array.

 include/linux/memblock.h   |  18 +++++
 include/linux/mm.h         |   2 +
 include/linux/page-flags.h |   5 +-
 mm/memblock.c              |  32 ++++++++
 mm/mm_init.c               |   2 +-
 mm/nobootmem.c             |  28 +++----
 mm/page_alloc.c            | 198 ++++++++++++++++++++++++++++++++++++---------
 7 files changed, 229 insertions(+), 56 deletions(-)

-- 
1.8.2.1

