Message-ID: <alpine.DEB.2.10.1503032145110.12253@chino.kir.corp.google.com>
Date: Tue, 3 Mar 2015 21:49:55 -0800 (PST)
From: David Rientjes <rientjes@...gle.com>
To: Mike Kravetz <mike.kravetz@...cle.com>
cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
Davidlohr Bueso <dave@...olabs.net>,
Aneesh Kumar <aneesh.kumar@...ux.vnet.ibm.com>,
Joonsoo Kim <iamjoonsoo.kim@....com>
Subject: Re: [PATCH 0/4] hugetlbfs: optionally reserve all fs pages at mount time
On Tue, 3 Mar 2015, Mike Kravetz wrote:
> hugetlbfs allocates huge pages from the global pool as needed. Even if
> the global pool contains a sufficient number of pages for the filesystem
> size at mount time, those global pages could be grabbed for some other
> use. As a result, filesystem huge page allocations may fail due to lack
> of pages.
>
> Applications such as a database want to use huge pages for performance
> reasons. hugetlbfs filesystem semantics with ownership and modes work
> well to manage access to a pool of huge pages. However, the application
> would like some reasonable assurance that allocations will not fail due
> to a lack of huge pages. At application startup time, the application
> would like to configure itself to use a specific number of huge pages.
> Before starting, the application can check to make sure that enough
> huge pages exist in the system global pools. What the application wants
> is exclusive use of a subpool of huge pages.
>
> Add a new hugetlbfs mount option 'reserved' to specify that the number
> of pages associated with the size of the filesystem will be reserved. If
> there are insufficient pages, the mount will fail. The reservation is
> maintained for the duration of the filesystem so that as pages are
> allocated and freed a sufficient number of pages remains reserved.
>
This functionality is somewhat limited because it's not possible to
reserve only a subset of the size for a single mount point; it's either
all or nothing. It shouldn't be too difficult to add a reserved=<value>
option instead, where <value> is <= size. If it's done that way, you
should be able to omit size= entirely for unlimited hugepages but still
ensure that a low watermark of hugepages is reserved for the database.
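
Roughly, the mount-time check for a reserved=<value> option could look
like the sketch below. This is a standalone userspace model of the rule,
not code from the patch set: struct mount_opts, SIZE_UNLIMITED and
validate_mount() are made-up names, and it assumes size and reserved have
already been converted to a count of huge pages.

```c
#include <stdbool.h>

/* Hypothetical marker for "size= was omitted", i.e. no upper limit. */
#define SIZE_UNLIMITED (-1L)

/* Hypothetical parsed mount options, both expressed in huge pages. */
struct mount_opts {
	long size_pages;     /* size=, or SIZE_UNLIMITED if omitted */
	long reserved_pages; /* reserved=<value>, 0 if omitted */
};

/*
 * Return true if the mount may proceed, given the number of huge pages
 * currently free in the global pool.
 */
bool validate_mount(const struct mount_opts *o, long global_free)
{
	/* A negative reservation makes no sense. */
	if (o->reserved_pages < 0)
		return false;

	/* reserved must be <= size, unless size is unlimited. */
	if (o->size_pages != SIZE_UNLIMITED &&
	    o->reserved_pages > o->size_pages)
		return false;

	/* The mount fails if the pool cannot back the reservation. */
	if (o->reserved_pages > global_free)
		return false;

	return true;
}
```

With 512 free huge pages in the pool, size=1024 with reserved=256 would
pass, as would reserved=256 with no size= at all; reserved=256 with
size=128 would be rejected, and so would any reservation larger than 512.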
> Comments from RFC addressed/incorporated
>
> Mike Kravetz (4):
>   hugetlbfs: add reserved mount fields to subpool structure
>   hugetlbfs: coordinate global and subpool reserve accounting
>   hugetlbfs: accept subpool reserved option and setup accordingly
>   hugetlbfs: document reserved mount option
>
>  Documentation/vm/hugetlbpage.txt | 18 ++++++++------
>  fs/hugetlbfs/inode.c             | 15 ++++++++++--
>  include/linux/hugetlb.h          |  7 ++++++
>  mm/hugetlb.c                     | 53 +++++++++++++++++++++++++++++++++-------
>  4 files changed, 75 insertions(+), 18 deletions(-)