Message-ID: <8f01845c-51dd-20a6-1d75-64f9de0ccb0b@linux.alibaba.com>
Date: Tue, 17 Apr 2018 14:51:17 -0700
From: Yang Shi <yang.shi@...ux.alibaba.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: viro@...iv.linux.org.uk, nyc@...omorphy.com,
mike.kravetz@...cle.com, kirill.shutemov@...ux.intel.com,
hughd@...gle.com, linux-fsdevel@...r.kernel.org,
linux-mm@...ck.org, linux-kernel@...r.kernel.org,
linux-man@...r.kernel.org, mtk.manpages@...il.com,
linux-api@...r.kernel.org
Subject: Re: [RFC PATCH] fs: introduce ST_HUGE flag and set it to tmpfs and
hugetlbfs
On 4/17/18 2:31 PM, Andrew Morton wrote:
> On Wed, 18 Apr 2018 05:08:13 +0800 Yang Shi <yang.shi@...ux.alibaba.com> wrote:
>
>> Since tmpfs gained THP support in 4.8, hugetlbfs is no longer the only
>> filesystem with huge page support. tmpfs can use huge pages via THP
>> when mounted with the "huge=" mount option.
>>
>> When applications use huge pages on hugetlbfs, they just need to check
>> the filesystem magic number, but that is not enough for tmpfs. So,
>> introduce an ST_HUGE flag for statfs, set when the super block has
>> SB_HUGE set, which indicates that huge pages are supported on that
>> filesystem.
>>
>> Some applications could benefit from this change, for example QEMU.
>> When using an mmap'ed file as guest VM backend memory, QEMU typically
>> mmaps the file size plus one extra page. If the file is on hugetlbfs the
>> extra page is huge page sized (i.e. 2MB), but it is still 4KB on tmpfs
>> even though THP is enabled. tmpfs THP requires the VMA to be huge page
>> aligned, so if a 4KB page is used THP will not be used at all. The
>> /proc/meminfo fragment below shows the THP use of QEMU with 4K pages:
>>
>> ShmemHugePages: 679936 kB
>> ShmemPmdMapped: 0 kB
>>
>> With the ST_HUGE flag, QEMU can get huge pages, and /proc/meminfo then
>> looks like:
>>
>> ShmemHugePages: 77824 kB
>> ShmemPmdMapped: 6144 kB
>>
>> With this flag, applications can tell whether huge pages are supported
>> on the filesystem and optimize their behavior accordingly. Although
>> similar functionality can be implemented in applications by traversing
>> the mount options, it is more convenient if the kernel provides such a
>> flag.
>>
>> Even though ST_HUGE is set, f_bsize still returns 4KB for tmpfs since
>> THP could be split, and it may also fall back to 4KB pages silently if
>> there are not enough huge pages.
>>
>> Also set the flag for hugetlbfs to keep things consistent, so that
>> applications don't have to know which filesystem is in use in order to
>> use huge pages; they just need to check the ST_HUGE flag.
>>
> Patch is simple enough, although I'm having trouble forming an opinion
> about it ;)
>
> It will call for an update to the statfs(2) manpage. I'm not sure
> which of linux-man@...r.kernel.org, mtk.manpages@...il.com and
> linux-api@...r.kernel.org is best for that, so I'd cc all three...
Thanks, Andrew. Added cc to those 3 lists.
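
For illustration only (this is not part of the patch as posted), here is a
minimal sketch of how an application might check for huge page support via
statfs(2) if the flag were available. The ST_HUGE value below is a
placeholder, since the real definition would come from the patched kernel
headers, and fs_reports_huge() and path_reports_huge() are hypothetical
helper names.

```c
#include <sys/vfs.h>      /* statfs(2), struct statfs */
#include <linux/magic.h>  /* HUGETLBFS_MAGIC, TMPFS_MAGIC */

/*
 * ASSUMPTION: placeholder bit for illustration only. The real value of
 * ST_HUGE would be defined by the patched kernel headers.
 */
#ifndef ST_HUGE
#define ST_HUGE (1UL << 30)
#endif

/*
 * Return 1 if the statfs fields indicate huge page support, else 0.
 * hugetlbfs can already be recognized by its magic number alone; for
 * other filesystems (e.g. tmpfs mounted with "huge=") only the proposed
 * flag would tell.
 */
static int fs_reports_huge(unsigned long f_type, unsigned long f_flags)
{
    /* Compare as 32-bit values so the check also works where long is 32 bits. */
    if ((unsigned int)f_type == (unsigned int)HUGETLBFS_MAGIC)
        return 1;
    return (f_flags & ST_HUGE) != 0;
}

/* Return 1 or 0 as above, or -1 if statfs() fails on the given path. */
static int path_reports_huge(const char *path)
{
    struct statfs sfs;

    if (statfs(path, &sfs) != 0)
        return -1;
    return fs_reports_huge((unsigned long)sfs.f_type,
                           (unsigned long)sfs.f_flags);
}
```

A caller such as QEMU could then pick the huge page size instead of 4KB for
its alignment and extra page when path_reports_huge() returns 1. On a kernel
without the patch the flag is never set, so tmpfs would always report 0 here.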