lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <9f88366f-84ac-4210-bbf0-b27cec284572@lucifer.local>
Date: Wed, 17 Sep 2025 16:34:13 +0100
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Pedro Falcato <pfalcato@...e.de>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
        Jonathan Corbet <corbet@....net>, Matthew Wilcox <willy@...radead.org>,
        Guo Ren <guoren@...nel.org>,
        Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
        Heiko Carstens <hca@...ux.ibm.com>, Vasily Gorbik <gor@...ux.ibm.com>,
        Alexander Gordeev <agordeev@...ux.ibm.com>,
        Christian Borntraeger <borntraeger@...ux.ibm.com>,
        Sven Schnelle <svens@...ux.ibm.com>,
        "David S . Miller" <davem@...emloft.net>,
        Andreas Larsson <andreas@...sler.com>, Arnd Bergmann <arnd@...db.de>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Dan Williams <dan.j.williams@...el.com>,
        Vishal Verma <vishal.l.verma@...el.com>,
        Dave Jiang <dave.jiang@...el.com>, Nicolas Pitre <nico@...xnic.net>,
        Muchun Song <muchun.song@...ux.dev>,
        Oscar Salvador <osalvador@...e.de>,
        David Hildenbrand <david@...hat.com>,
        Konstantin Komarov <almaz.alexandrovich@...agon-software.com>,
        Baoquan He <bhe@...hat.com>, Vivek Goyal <vgoyal@...hat.com>,
        Dave Young <dyoung@...hat.com>, Tony Luck <tony.luck@...el.com>,
        Reinette Chatre <reinette.chatre@...el.com>,
        Dave Martin <Dave.Martin@....com>, James Morse <james.morse@....com>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        Christian Brauner <brauner@...nel.org>, Jan Kara <jack@...e.cz>,
        "Liam R . Howlett" <Liam.Howlett@...cle.com>,
        Vlastimil Babka <vbabka@...e.cz>, Mike Rapoport <rppt@...nel.org>,
        Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...e.com>,
        Hugh Dickins <hughd@...gle.com>,
        Baolin Wang <baolin.wang@...ux.alibaba.com>,
        Uladzislau Rezki <urezki@...il.com>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        Andrey Konovalov <andreyknvl@...il.com>, Jann Horn <jannh@...gle.com>,
        linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-fsdevel@...r.kernel.org, linux-csky@...r.kernel.org,
        linux-mips@...r.kernel.org, linux-s390@...r.kernel.org,
        sparclinux@...r.kernel.org, nvdimm@...ts.linux.dev,
        linux-cxl@...r.kernel.org, linux-mm@...ck.org, ntfs3@...ts.linux.dev,
        kexec@...ts.infradead.org, kasan-dev@...glegroups.com,
        Jason Gunthorpe <jgg@...dia.com>, iommu@...ts.linux.dev,
        Kevin Tian <kevin.tian@...el.com>, Will Deacon <will@...nel.org>,
        Robin Murphy <robin.murphy@....com>
Subject: Re: [PATCH v3 08/13] mm: add ability to take further action in
 vm_area_desc

On Wed, Sep 17, 2025 at 12:32:10PM +0100, Pedro Falcato wrote:
> On Tue, Sep 16, 2025 at 03:11:54PM +0100, Lorenzo Stoakes wrote:
> > Some drivers/filesystems need to perform additional tasks after the VMA is
> > set up.  This is typically in the form of pre-population.
> >
> > The forms of pre-population most likely to be performed are a PFN remap
> > or the insertion of normal folios and PFNs into a mixed map.
> >
> > We start by implementing the PFN remap functionality, ensuring that we
> > perform the appropriate actions at the appropriate time - that is setting
> > flags at the point of .mmap_prepare, and performing the actual remap at the
> > point at which the VMA is fully established.
> >
> > This prevents the driver from doing anything too crazy with a VMA at any
> > stage, and we retain complete control over how the mm functionality is
> > applied.
> >
> > Unfortunately callers still do often require some kind of custom action,
> > so we add an optional success/error _hook to allow the caller to do
> > something after the action has succeeded or failed.
>
> Do we have any idea for rules regarding ->mmap_prepare() and ->*_hook()?
> It feels spooky to e.g grab locks in mmap_prepare, and hold them across core
> mmap(). And I guess it might be needed?

I already did a bunch of logic around this, but several respins later and we
don't curently support it as Jason pointed out probably we actually don't need
to, at least so far.

I don't think it's really worth saying 'do this don't do that'. As wayward
drivers will do whatever.

Sadly we do need those hooks because of error filtering and e.g. debug output on
success.

However on success though, I discourage anything too stupid by making the vma
parameter const so you'd have to do a const cast there.

On error you only get the error code so good luck with that.

Obviously there could be a static mutex, but I think that's unavoidable.

>
> >
> > This is done at the point when the VMA has already been established, so
> > the harm that can be done is limited.
> >
> > The error hook can be used to filter errors if necessary.
> >
> > If any error arises on these final actions, we simply unmap the VMA
> > altogether.
> >
> > Also update the stacked filesystem compatibility layer to utilise the
> > action behaviour, and update the VMA tests accordingly.
> >
> > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
> <snip>
> > diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> > index 31b27086586d..aa1e2003f366 100644
> > --- a/include/linux/mm_types.h
> > +++ b/include/linux/mm_types.h
> > @@ -775,6 +775,49 @@ struct pfnmap_track_ctx {
> >  };
> >  #endif
> >
> > +/* What action should be taken after an .mmap_prepare call is complete? */
> > +enum mmap_action_type {
> > +	MMAP_NOTHING,		/* Mapping is complete, no further action. */
> > +	MMAP_REMAP_PFN,		/* Remap PFN range. */
> > +};
> > +
> > +/*
> > + * Describes an action an mmap_prepare hook can instruct to be taken to complete
> > + * the mapping of a VMA. Specified in vm_area_desc.
> > + */
> > +struct mmap_action {
> > +	union {
> > +		/* Remap range. */
> > +		struct {
> > +			unsigned long start;
> > +			unsigned long start_pfn;
> > +			unsigned long size;
> > +			pgprot_t pgprot;
> > +			bool is_io_remap;
> > +		} remap;
> > +	};
> > +	enum mmap_action_type type;
> > +
> > +	/*
> > +	 * If specified, this hook is invoked after the selected action has been
> > +	 * successfully completed. Note that the VMA write lock still held.
> > +	 *
> > +	 * The absolute minimum ought to be done here.
> > +	 *
> > +	 * Returns 0 on success, or an error code.
> > +	 */
> > +	int (*success_hook)(const struct vm_area_struct *vma);
> > +
> > +	/*
> > +	 * If specified, this hook is invoked when an error occurred when
> > +	 * attempting the selection action.
> > +	 *
> > +	 * The hook can return an error code in order to filter the error, but
> > +	 * it is not valid to clear the error here.
> > +	 */
> > +	int (*error_hook)(int err);
>
> Do we need two hooks? It might be more ergonomic to simply have a:
>
> 	int (*finish)(int err);
>
>
> 	int random_driver_finish(int err)
> 	{
> 		if (err)
> 			pr_err("ahhhhhhhhh\n");
> 		mutex_unlock(&big_lock);
> 		return err;
> 	}

No I think that's less clear. Better to spell it out.

>
> It's also unclear to me if/why we need the capability to switch error codes,
> but I might've missed some discussion on this.

There's drivers that do it. That's why.

>
>
> --
> Pedro

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ