lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <ac3ae324-4c65-432a-8c6d-2af988b18ac8@lucifer.local>
Date: Tue, 10 Jun 2025 11:06:31 +0100
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Alexander Viro <viro@...iv.linux.org.uk>,
        Christian Brauner <brauner@...nel.org>, Jan Kara <jack@...e.cz>,
        "Liam R . Howlett" <Liam.Howlett@...cle.com>,
        Vlastimil Babka <vbabka@...e.cz>, Jann Horn <jannh@...gle.com>,
        Pedro Falcato <pfalcato@...e.de>, linux-fsdevel@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH v2] mm: add mmap_prepare() compatibility layer for nested
 file systems

Andrew - I realise this has been a comedy of errors :>) so let me know if you
want me to just respin.

But anyway, I realise I left in a couple decls by mistake. They're harmless, but
silly. I attach a fixpatch to fix them, please apply, thanks!

Cheers, Lorenzo

On Mon, Jun 09, 2025 at 07:37:33PM +0100, Lorenzo Stoakes wrote:
> Andrew - apologies, please attach the tags as below which I mistakenly didn't
> propagate...
>
> Thanks!
>
> (note to self, figure out b4 :P)
>
> On Mon, Jun 09, 2025 at 05:57:49PM +0100, Lorenzo Stoakes wrote:
> > Nested file systems, that is those which invoke call_mmap() within their
> > own f_op->mmap() handlers, may encounter underlying file systems which
> > provide the f_op->mmap_prepare() hook introduced by commit
> > c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file callback").
> >
> > We have a chicken-and-egg scenario here - until all file systems are
> > converted to using .mmap_prepare(), we cannot convert these nested
> > handlers, as we can't call f_op->mmap from an .mmap_prepare() hook.
> >
> > So we have to do it the other way round - invoke the .mmap_prepare() hook
> > from an .mmap() one.
> >
> > in order to do so, we need to convert VMA state into a struct vm_area_desc
> > descriptor, invoking the underlying file system's f_op->mmap_prepare()
> > callback passing a pointer to this, and then setting VMA state accordingly
> > and safely.
> >
> > This patch achieves this via the compat_vma_mmap_prepare() function, which
> > we invoke from call_mmap() if f_op->mmap_prepare() is specified in the
> > passed in file pointer.
> >
> > We place the fundamental logic into mm/vma.h where VMA manipulation
> > belongs. We also update the VMA userland tests to accommodate the changes.
> >
> > The compat_vma_mmap_prepare() function and its associated machinery is
> > temporary, and will be removed once the conversion of file systems is
> > complete.
> >
> > We carefully place this code so it can be used with CONFIG_MMU and also
> > with cutting edge nommu silicon.
> >
> > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
> > Reported-by: Jann Horn <jannh@...gle.com>
> > Closes: https://lore.kernel.org/linux-mm/CAG48ez04yOEVx1ekzOChARDDBZzAKwet8PEoPM4Ln3_rk91AzQ@mail.gmail.com/
> > Fixes: c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file callback").
>
> Reviewed-by: Pedro Falcato <pfalcato@...e.de>
> Reviewed-by: Vlastimil Babka <vbabka@...e.cz>
>
> > ---
> >
> > Apologies for the quick turn-around here, but I'm keen to address the silly
> > kernel-doc and nommu issues here.
> >
> > v2:
> > * Propagated tags (thanks everyone!)
> > * Corrected nommu issue by carefully positioning code in mm/util.c and mm/vma.h.
> > * Fixed ';' typo in kernel-doc comment.
> >
> > v1:
> > https://lore.kernel.org/all/20250609092413.45435-1-lorenzo.stoakes@oracle.com/
> >
> >  include/linux/fs.h               |  6 ++--
> >  mm/util.c                        | 39 ++++++++++++++++++++++++
> >  mm/vma.c                         |  1 -
> >  mm/vma.h                         | 51 ++++++++++++++++++++++++++++++++
> >  tools/testing/vma/vma_internal.h | 16 ++++++++++
> >  5 files changed, 110 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > index 05abdabe9db7..8fe41a2b7527 100644
> > --- a/include/linux/fs.h
> > +++ b/include/linux/fs.h
> > @@ -2274,10 +2274,12 @@ static inline bool file_has_valid_mmap_hooks(struct file *file)
> >  	return true;
> >  }
> >
> > +int compat_vma_mmap_prepare(struct file *file, struct vm_area_struct *vma);
> > +
> >  static inline int call_mmap(struct file *file, struct vm_area_struct *vma)
> >  {
> > -	if (WARN_ON_ONCE(file->f_op->mmap_prepare))
> > -		return -EINVAL;
> > +	if (file->f_op->mmap_prepare)
> > +		return compat_vma_mmap_prepare(file, vma);
> >
> >  	return file->f_op->mmap(file, vma);
> >  }
> > diff --git a/mm/util.c b/mm/util.c
> > index 448117da071f..23a9bc26ef68 100644
> > --- a/mm/util.c
> > +++ b/mm/util.c
> > @@ -1131,3 +1131,42 @@ void flush_dcache_folio(struct folio *folio)
> >  }
> >  EXPORT_SYMBOL(flush_dcache_folio);
> >  #endif
> > +
> > +/**
> > + * compat_vma_mmap_prepare() - Apply the file's .mmap_prepare() hook to an
> > + * existing VMA
> > + * @file: The file which possesss an f_op->mmap_prepare() hook
> > + * @vma: The VMA to apply the .mmap_prepare() hook to.
> > + *
> > + * Ordinarily, .mmap_prepare() is invoked directly upon mmap(). However, certain
> > + * 'wrapper' file systems invoke a nested mmap hook of an underlying file.
> > + *
> > + * Until all filesystems are converted to use .mmap_prepare(), we must be
> > + * conservative and continue to invoke these 'wrapper' filesystems using the
> > + * deprecated .mmap() hook.
> > + *
> > + * However we have a problem if the underlying file system possesses an
> > + * .mmap_prepare() hook, as we are in a different context when we invoke the
> > + * .mmap() hook, already having a VMA to deal with.
> > + *
> > + * compat_vma_mmap_prepare() is a compatibility function that takes VMA state,
> > + * establishes a struct vm_area_desc descriptor, passes to the underlying
> > + * .mmap_prepare() hook and applies any changes performed by it.
> > + *
> > + * Once the conversion of filesystems is complete this function will no longer
> > + * be required and will be removed.
> > + *
> > + * Returns: 0 on success or error.
> > + */
> > +int compat_vma_mmap_prepare(struct file *file, struct vm_area_struct *vma)
> > +{
> > +	struct vm_area_desc desc;
> > +	int err;
> > +
> > +	err = file->f_op->mmap_prepare(vma_to_desc(vma, &desc));
> > +	if (err)
> > +		return err;
> > +	set_vma_from_desc(vma, &desc);
> > +
> > +	return 0;
> > +}
> > diff --git a/mm/vma.c b/mm/vma.c
> > index 01b1d26d87b4..3cdd0aaa10aa 100644
> > --- a/mm/vma.c
> > +++ b/mm/vma.c
> > @@ -3153,7 +3153,6 @@ int __vm_munmap(unsigned long start, size_t len, bool unlock)
> >  	return ret;
> >  }
> >
> > -
> >  /* Insert vm structure into process list sorted by address
> >   * and into the inode's i_mmap tree.  If vm_file is non-NULL
> >   * then i_mmap_rwsem is taken here.
> > diff --git a/mm/vma.h b/mm/vma.h
> > index 0db066e7a45d..d92e6c906c96 100644
> > --- a/mm/vma.h
> > +++ b/mm/vma.h
> > @@ -222,6 +222,53 @@ static inline int vma_iter_store_gfp(struct vma_iterator *vmi,
> >  	return 0;
> >  }
> >
> > +
> > +/*
> > + * Temporary helper functions for file systems which wrap an invocation of
> > + * f_op->mmap() but which might have an underlying file system which implements
> > + * f_op->mmap_prepare().
> > + */
> > +
> > +static inline struct vm_area_desc *vma_to_desc(struct vm_area_struct *vma,
> > +		struct vm_area_desc *desc)
> > +{
> > +	desc->mm = vma->vm_mm;
> > +	desc->start = vma->vm_start;
> > +	desc->end = vma->vm_end;
> > +
> > +	desc->pgoff = vma->vm_pgoff;
> > +	desc->file = vma->vm_file;
> > +	desc->vm_flags = vma->vm_flags;
> > +	desc->page_prot = vma->vm_page_prot;
> > +
> > +	desc->vm_ops = NULL;
> > +	desc->private_data = NULL;
> > +
> > +	return desc;
> > +}
> > +
> > +static inline void set_vma_from_desc(struct vm_area_struct *vma,
> > +		struct vm_area_desc *desc)
> > +{
> > +	/*
> > +	 * Since we're invoking .mmap_prepare() despite having a partially
> > +	 * established VMA, we must take care to handle setting fields
> > +	 * correctly.
> > +	 */
> > +
> > +	/* Mutable fields. Populated with initial state. */
> > +	vma->vm_pgoff = desc->pgoff;
> > +	if (vma->vm_file != desc->file)
> > +		vma_set_file(vma, desc->file);
> > +	if (vma->vm_flags != desc->vm_flags)
> > +		vm_flags_set(vma, desc->vm_flags);
> > +	vma->vm_page_prot = desc->page_prot;
> > +
> > +	/* User-defined fields. */
> > +	vma->vm_ops = desc->vm_ops;
> > +	vma->vm_private_data = desc->private_data;
> > +}
> > +
> >  int
> >  do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> >  		    struct mm_struct *mm, unsigned long start,
> > @@ -570,4 +617,8 @@ int create_init_stack_vma(struct mm_struct *mm, struct vm_area_struct **vmap,
> >  int relocate_vma_down(struct vm_area_struct *vma, unsigned long shift);
> >  #endif
> >
> > +struct vm_area_desc *vma_to_desc(struct vm_area_struct *vma,
> > +		struct vm_area_desc *desc);
> > +void set_vma_from_desc(struct vm_area_struct *vma, struct vm_area_desc *desc);
> > +
> >  #endif	/* __MM_VMA_H */
> > diff --git a/tools/testing/vma/vma_internal.h b/tools/testing/vma/vma_internal.h
> > index 77b2949d874a..675a55216607 100644
> > --- a/tools/testing/vma/vma_internal.h
> > +++ b/tools/testing/vma/vma_internal.h
> > @@ -159,6 +159,14 @@ typedef __bitwise unsigned int vm_fault_t;
> >
> >  #define ASSERT_EXCLUSIVE_WRITER(x)
> >
> > +/**
> > + * swap - swap values of @a and @b
> > + * @a: first value
> > + * @b: second value
> > + */
> > +#define swap(a, b) \
> > +	do { typeof(a) __tmp = (a); (a) = (b); (b) = __tmp; } while (0)
> > +
> >  struct kref {
> >  	refcount_t refcount;
> >  };
> > @@ -1479,4 +1487,12 @@ static inline vm_flags_t ksm_vma_flags(const struct mm_struct *, const struct fi
> >  	return vm_flags;
> >  }
> >
> > +static inline void vma_set_file(struct vm_area_struct *vma, struct file *file)
> > +{
> > +	/* Changing an anonymous vma with this is illegal */
> > +	get_file(file);
> > +	swap(vma->vm_file, file);
> > +	fput(file);
> > +}
> > +
> >  #endif	/* __MM_VMA_INTERNAL_H */
> > --
> > 2.49.0
> >

----8<----
>From 0a2c0811649417e91804ea81f610e896e2ea7aba Mon Sep 17 00:00:00 2001
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Date: Tue, 10 Jun 2025 11:03:11 +0100
Subject: [PATCH] fix

Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
---
 mm/vma.h | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/mm/vma.h b/mm/vma.h
index d92e6c906c96..f47112a352db 100644
--- a/mm/vma.h
+++ b/mm/vma.h
@@ -617,8 +617,4 @@ int create_init_stack_vma(struct mm_struct *mm, struct vm_area_struct **vmap,
 int relocate_vma_down(struct vm_area_struct *vma, unsigned long shift);
 #endif

-struct vm_area_desc *vma_to_desc(struct vm_area_struct *vma,
-		struct vm_area_desc *desc);
-void set_vma_from_desc(struct vm_area_struct *vma, struct vm_area_desc *desc);
-
 #endif	/* __MM_VMA_H */
--
2.49.0

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ