Message-ID: <20080731103137.GD1704@csn.ul.ie>
Date: Thu, 31 Jul 2008 11:31:38 +0100
From: Mel Gorman <mel@....ul.ie>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: ebmunson@...ibm.com, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, linuxppc-dev@...abs.org,
libhugetlbfs-devel@...ts.sourceforge.net, abh@...y.com
Subject: Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks
On (30/07/08 13:07), Andrew Morton didst pronounce:
> On Wed, 30 Jul 2008 20:30:10 +0100
> Mel Gorman <mel@....ul.ie> wrote:
>
> > With Eric's patch and libhugetlbfs, we can automatically back text/data[1],
> > malloc[2] and stacks without source modification. Fairly soon, libhugetlbfs
> > will also be able to override shmget() to add SHM_HUGETLB. That should cover
> > a lot of the memory-intensive apps without source modification.
>
> The weak link in all of this still might be the need to reserve
> hugepages and the unreliability of dynamically allocating them.
>
> The dynamic allocation should be better nowadays, but I've lost track
> of how reliable it really is. What's our status there?
>
We are a lot more reliable than we were, although exact quantification is
difficult because it's workload dependent. For a long time, I've been able
to test bits and pieces with hugepages by allocating the pool at the time
I needed it, even after days of uptime. Previously this required a reboot.
I've also been able to use dynamic hugepage pool resizing effectively,
and we track how often it succeeds and fails in /proc/vmstat (see the
htlb fields) to watch for problems. Between that and /proc/pagetypeinfo, I
expect to be able to identify availability problems. As an administrator
can now set both a minimum pool size and a maximum pool size (nr_hugepages
and nr_overcommit_hugepages), the configuration difficulties should be eased.
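As a minimal sketch of that configuration and monitoring (the pool sizes
here are only examples, and the exact htlb counter names in /proc/vmstat
may differ between kernel versions):

    # guarantee a minimum pool of 64 huge pages
    echo 64 > /proc/sys/vm/nr_hugepages

    # allow the pool to grow by up to 256 extra pages on demand
    echo 256 > /proc/sys/vm/nr_overcommit_hugepages

    # watch how often dynamic pool allocations succeed and fail
    grep htlb /proc/vmstat

    # check per-migratetype fragmentation state
    cat /proc/pagetypeinfo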
If it turns out that anti-fragmentation can be broken down and pool
resizing starts failing after X amount of time on Y workloads, there is
still the option of booting with movablecore=BiggestPoolSizeIWillEverNeed
and writing 1 to /proc/sys/vm/hugepages_treat_as_movable so the hugepage
pool can grow and shrink reliably within that zone.
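To sketch that fallback (the 4G figure below is only an example; use
whatever the biggest pool ever needed is):

    # kernel command line: reserve a movable zone large enough
    # for the biggest hugepage pool ever needed (size is an example)
    movablecore=4G

    # after boot, let the hugepage pool allocate from ZONE_MOVABLE
    echo 1 > /proc/sys/vm/hugepages_treat_as_movable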
Overall, it's in pretty good shape.
To be fair, one snag is that swap is almost required for pool resizing
to work, as I never pushed to complete memory compaction
(http://lwn.net/Articles/238837/). Hence, we depend on the workload
having lots of filesystem-backed data for lumpy reclaim to do its job,
on pool resizing taking place between batch jobs, or on swap being
configured, even if it's just for the duration of a pool resize.
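If swap is only wanted for the duration of the resize, a temporary swap
file does the job; roughly (the path and sizes are only examples):

    # create and enable a temporary 1G swap file
    dd if=/dev/zero of=/tmp/resize-swap bs=1M count=1024
    mkswap /tmp/resize-swap
    swapon /tmp/resize-swap

    # grow the pool while reclaim has somewhere to put anonymous pages
    echo 128 > /proc/sys/vm/nr_hugepages

    # tear the swap file down again once the resize is complete
    swapoff /tmp/resize-swap
    rm /tmp/resize-swap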
--
Mel Gorman
Part-time PhD Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab