[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210804140805.vpuerwaiqtcvc5or@box.shutemov.name>
Date: Wed, 4 Aug 2021 17:08:05 +0300
From: "Kirill A. Shutemov" <kirill@...temov.name>
To: Hugh Dickins <hughd@...gle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Shakeel Butt <shakeelb@...gle.com>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Yang Shi <shy828301@...il.com>,
Miaohe Lin <linmiaohe@...wei.com>,
Mike Kravetz <mike.kravetz@...cle.com>,
Michal Hocko <mhocko@...e.com>,
Rik van Riel <riel@...riel.com>,
Christoph Hellwig <hch@...radead.org>,
Matthew Wilcox <willy@...radead.org>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
Alexey Gladkov <legion@...nel.org>,
Chris Wilson <chris@...is-wilson.co.uk>,
Matthew Auld <matthew.auld@...el.com>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-api@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH 08/16] huge tmpfs: fcntl(fd, F_HUGEPAGE) and fcntl(fd,
F_NOHUGEPAGE)
On Fri, Jul 30, 2021 at 12:48:33AM -0700, Hugh Dickins wrote:
> Add support for fcntl(fd, F_HUGEPAGE) and fcntl(fd, F_NOHUGEPAGE), to
> select hugeness per file: useful to override the default hugeness of the
> shmem mount, when occasionally needing to store a hugepage file in a
> smallpage mount or vice versa.
Hm. But why is the new MFD_* needed if the fcntl() can do the same.
> These fcntls just specify whether or not to try for huge pages when
> allocating to the object later: F_HUGEPAGE does not touch small pages
> already allocated (though khugepaged may do so when the file is mapped
> afterwards), F_NOHUGEPAGE does not split huge pages already allocated.
>
> Why fcntl? Because it's already in use (for sealing) on memfds; and I'm
> anxious to keep this simple, just applying it to whole files: fallocate,
> madvise and posix_fadvise each involve a range, which would need a new
> kind of tree attached to the inode for proper support.
Most of fadvise() operations ignore the range. I like fadvise() because
it's less prescriptive: kernel is free to ignore it.
--
Kirill A. Shutemov
Powered by blists - more mailing lists