[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220310230822.GO661808@dread.disaster.area>
Date: Fri, 11 Mar 2022 10:08:22 +1100
From: Dave Chinner <david@...morbit.com>
To: Chao Peng <chao.p.peng@...ux.intel.com>
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, linux-fsdevel@...r.kernel.org,
linux-api@...r.kernel.org, qemu-devel@...gnu.org,
Paolo Bonzini <pbonzini@...hat.com>,
Jonathan Corbet <corbet@....net>,
Sean Christopherson <seanjc@...gle.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>,
Joerg Roedel <joro@...tes.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
x86@...nel.org, "H . Peter Anvin" <hpa@...or.com>,
Hugh Dickins <hughd@...gle.com>,
Jeff Layton <jlayton@...nel.org>,
"J . Bruce Fields" <bfields@...ldses.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Mike Rapoport <rppt@...nel.org>,
Steven Price <steven.price@....com>,
"Maciej S . Szmigiero" <mail@...iej.szmigiero.name>,
Vlastimil Babka <vbabka@...e.cz>,
Vishal Annapurve <vannapurve@...gle.com>,
Yu Zhang <yu.c.zhang@...ux.intel.com>,
"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
luto@...nel.org, jun.nakajima@...el.com, dave.hansen@...el.com,
ak@...ux.intel.com, david@...hat.com
Subject: Re: [PATCH v5 03/13] mm/shmem: Support memfile_notifier
On Thu, Mar 10, 2022 at 10:09:01PM +0800, Chao Peng wrote:
> From: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
>
> It maintains a memfile_notifier list in shmem_inode_info structure and
> implements memfile_pfn_ops callbacks defined by memfile_notifier. It
> then exposes them to memfile_notifier via
> shmem_get_memfile_notifier_info.
>
> We use SGP_NOALLOC in shmem_get_lock_pfn since the pages should be
> allocated by userspace for private memory. If there is no pages
> allocated at the offset then error should be returned so KVM knows that
> the memory is not private memory.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
> Signed-off-by: Chao Peng <chao.p.peng@...ux.intel.com>
> ---
> include/linux/shmem_fs.h | 4 +++
> mm/shmem.c | 76 ++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 80 insertions(+)
>
> diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
> index 2dde843f28ef..7bb16f2d2825 100644
> --- a/include/linux/shmem_fs.h
> +++ b/include/linux/shmem_fs.h
> @@ -9,6 +9,7 @@
> #include <linux/percpu_counter.h>
> #include <linux/xattr.h>
> #include <linux/fs_parser.h>
> +#include <linux/memfile_notifier.h>
>
> /* inode in-kernel data */
>
> @@ -28,6 +29,9 @@ struct shmem_inode_info {
> struct simple_xattrs xattrs; /* list of xattrs */
> atomic_t stop_eviction; /* hold when working on inode */
> unsigned int xflags; /* shmem extended flags */
> +#ifdef CONFIG_MEMFILE_NOTIFIER
> + struct memfile_notifier_list memfile_notifiers;
> +#endif
> struct inode vfs_inode;
> };
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 9b31a7056009..7b43e274c9a2 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -903,6 +903,28 @@ static struct folio *shmem_get_partial_folio(struct inode *inode, pgoff_t index)
> return page ? page_folio(page) : NULL;
> }
>
> +static void notify_fallocate(struct inode *inode, pgoff_t start, pgoff_t end)
> +{
> +#ifdef CONFIG_MEMFILE_NOTIFIER
> + struct shmem_inode_info *info = SHMEM_I(inode);
> +
> + memfile_notifier_fallocate(&info->memfile_notifiers, start, end);
> +#endif
> +}
*notify_populate(), not fallocate. This is a notification that a
range has been populated, not that the fallocate() syscall was run
to populate the backing store of a file.
i.e. fallocate is the name of a userspace filesystem API that can
be used to manipulate the backing store of a file in various ways.
It can both populate and punch away the backing store of a file, and
some operations that fallocate() can run will do both (e.g.
FALLOC_FL_ZERO_RANGE) and so could generate both
notify_invalidate() and a notify_populate() events.
Hence "fallocate" as an internal mm namespace or operation does not
belong anywhere in core MM infrastructure - it should never get used
anywhere other than the VFS/filesystem layers that implement the
fallocate() syscall or use it directly.
Cheers,
Dave.
--
Dave Chinner
david@...morbit.com
Powered by blists - more mailing lists