linux-kernel - Re: [PATCH 1/3] vfio: Introduce vma ops registration and notifier


Message-ID: <20210212212057.GW4247@nvidia.com>
Date:   Fri, 12 Feb 2021 17:20:57 -0400
From:   Jason Gunthorpe <jgg@...dia.com>
To:     Alex Williamson <alex.williamson@...hat.com>
CC:     <cohuck@...hat.com>, <kvm@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>, <peterx@...hat.com>
Subject: Re: [PATCH 1/3] vfio: Introduce vma ops registration and notifier

On Fri, Feb 12, 2021 at 12:27:39PM -0700, Alex Williamson wrote:
> Create an interface through vfio-core where a vfio bus driver (ex.
> vfio-pci) can register the vm_operations_struct it uses to map device
> memory, along with a set of registration callbacks.  This allows
> vfio-core to expose interfaces for IOMMU backends to match a
> vm_area_struct to a bus driver and register a notifier for relevant
> changes to the device mapping.  For now we define only a notifier
> action for closing the device.
> 
> Signed-off-by: Alex Williamson <alex.williamson@...hat.com>
>  drivers/vfio/vfio.c  |  120 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/vfio.h |   20 ++++++++
>  2 files changed, 140 insertions(+)
> 
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index 38779e6fd80c..568f5e37a95f 100644
> +++ b/drivers/vfio/vfio.c
> @@ -47,6 +47,8 @@ static struct vfio {
>  	struct cdev			group_cdev;
>  	dev_t				group_devt;
>  	wait_queue_head_t		release_q;
> +	struct list_head		vm_ops_list;
> +	struct mutex			vm_ops_lock;
>  } vfio;
>  
>  struct vfio_iommu_driver {
> @@ -2354,6 +2356,121 @@ struct iommu_domain *vfio_group_iommu_domain(struct vfio_group *group)
>  }
>  EXPORT_SYMBOL_GPL(vfio_group_iommu_domain);
>  
> +struct vfio_vma_ops {
> +	const struct vm_operations_struct	*vm_ops;
> +	vfio_register_vma_nb_t			*reg_fn;
> +	vfio_unregister_vma_nb_t		*unreg_fn;
> +	struct list_head			next;
> +};
> +
> +int vfio_register_vma_ops(const struct vm_operations_struct *vm_ops,
> +			  vfio_register_vma_nb_t *reg_fn,
> +			  vfio_unregister_vma_nb_t *unreg_fn)

This just feels a little bit too complicated.

I've recently learned from Daniel that we can use the address_space
machinery to drive the zap_vma_ptes() via unmap_mapping_range(). This
technique replaces all the open, close and vma_list logic in vfio_pci.

If we don't need a driver-specific open anymore, we could do something
like this:

 static const struct vm_operations_struct vfio_pci_mmap_ops = {
        .open = vfio_pfn_open, // implemented in vfio.c
        .close = vfio_pfn_close,
        .fault = vfio_pci_mmap_fault,
 };

Then we could write the helper that is needed:

struct vfio_pfn_range_handle {
	struct kref kref;
	struct vfio_device *vfio;
	struct notifier_block invalidation_cb;
	unsigned long flags;	/* bitops such as test_bit() need unsigned long */
};

struct vfio_pfn_range_handle *get_pfn_range(struct vm_area_struct *vma)
{
	struct vfio_pfn_range_handle *handle;

	/* Only VMAs that use the common vfio open carry a handle */
	if (!vma->vm_ops || vma->vm_ops->open != vfio_pfn_open)
		return NULL;

	handle = vma->vm_private_data;
	if (test_bit(DMA_STOPPED, &handle->flags))
		return NULL;
	kref_get(&handle->kref);
	return handle;
}

Here the common open/close only take and drop a reference on the kref,
and all 'vfio' VMAs always point to the same vfio_pfn_range_handle in
their vm_private_data.
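
To make the reference flow concrete, here is a minimal userspace sketch of
the idea. It is not kernel code: mock_vma, pfn_range_handle and the mock_*
helpers are hypothetical stand-ins, C11 atomics stand in for kref and the
flag bit, and an is_vfio boolean stands in for the vm_ops->open comparison.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the proposed handle. */
struct pfn_range_handle {
	atomic_int refs;	/* models struct kref */
	atomic_bool stopped;	/* models the DMA_STOPPED flag bit */
};

/* Hypothetical stand-in for a VMA. */
struct mock_vma {
	bool is_vfio;		/* models vm_ops->open == vfio_pfn_open */
	struct pfn_range_handle *private_data;
};

static void handle_init(struct pfn_range_handle *h)
{
	atomic_init(&h->refs, 1);	/* initial reference held by the owner */
	atomic_init(&h->stopped, false);
}

/* Common open: only takes a reference. */
static void mock_pfn_open(struct mock_vma *vma)
{
	atomic_fetch_add(&vma->private_data->refs, 1);
}

/* Common close: only drops a reference; true if it was the last one. */
static bool mock_pfn_close(struct mock_vma *vma)
{
	int old = atomic_fetch_sub(&vma->private_data->refs, 1);

	assert(old > 0);
	return old == 1;
}

/* Lookup: NULL for foreign VMAs and for ranges whose access was stopped. */
static struct pfn_range_handle *mock_get_pfn_range(struct mock_vma *vma)
{
	struct pfn_range_handle *h;

	if (!vma->is_vfio)
		return NULL;

	h = vma->private_data;
	if (atomic_load(&h->stopped))
		return NULL;
	atomic_fetch_add(&h->refs, 1);
	return h;
}
```

A caller that gets a non-NULL handle back owns one extra reference and has
to drop it again when done. The check-then-get in the lookup is not atomic
against a concurrent stop, which is the same locking question the real
thing would have to sort out.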

The vm_pgoff is already pointing at the physical pfn, so every part of
the system can get the information it needs fairly trivially.
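
As a small illustration of that arithmetic, assuming a linear pfn mapping
where vm_pgoff holds the pfn of the first page: mock_vma_range,
MOCK_PAGE_SHIFT and mock_vaddr_to_pfn() are hypothetical userspace names,
and 4 KiB pages are assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for the three VMA fields the lookup needs. */
struct mock_vma_range {
	uint64_t vm_start;	/* first virtual address of the mapping */
	uint64_t vm_end;	/* one past the last virtual address */
	uint64_t vm_pgoff;	/* pfn backing vm_start */
};

/* Translate a virtual address inside the VMA to the pfn backing it. */
static int mock_vaddr_to_pfn(const struct mock_vma_range *vma,
			     uint64_t vaddr, uint64_t *pfn)
{
	assert(vma != NULL && pfn != NULL);

	if (vaddr < vma->vm_start || vaddr >= vma->vm_end)
		return -1;	/* address is outside this mapping */

	*pfn = vma->vm_pgoff + ((vaddr - vma->vm_start) >> MOCK_PAGE_SHIFT);
	return 0;
}
```

For example, with vm_start = 0x7f0000000000 and vm_pgoff = 0xfe000, the
address 0x7f0000002abc lies in the third page of the mapping and so
resolves to pfn 0xfe002.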

A stop-access function then looks pretty simple:

void stop_access(struct vfio_pfn_range_handle *handle)
{
	set_bit(DMA_STOPPED, &handle->flags);
	/* unmap_mapping_range() takes the inode's address_space */
	unmap_mapping_range(handle->vfio->[..]->inode->i_mapping, 0, max, false);
	/* assumes invalidation_cb is (or becomes) an srcu_notifier_head */
	srcu_notifier_call_chain(&handle->invalidation_cb, VFIO_VMA_NOTIFY_CLOSE, NULL);
}

(well, have to sort out the locking some more, but that is the
 general idea)
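
The ordering (mark stopped, zap the mappings, then notify) can be modelled
in userspace roughly like this. Everything here is a hypothetical stand-in:
mock_range replaces the handle, a counter replaces unmap_mapping_range(),
and a fixed callback array replaces the notifier chain. The atomic exchange
that makes a second stop a no-op is my own assumption, not something the
proposal above specifies.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MOCK_MAX_LISTENERS 4
#define MOCK_NOTIFY_CLOSE  1UL	/* stand-in for VFIO_VMA_NOTIFY_CLOSE */

typedef void (*mock_notify_fn)(unsigned long action, void *ctx);

/* Hypothetical stand-in for the handle plus its notifier chain. */
struct mock_range {
	atomic_bool stopped;
	unsigned int zap_count;	/* counts simulated unmap_mapping_range() calls */
	size_t nr_listeners;
	mock_notify_fn fn[MOCK_MAX_LISTENERS];
	void *ctx[MOCK_MAX_LISTENERS];
};

static void mock_range_init(struct mock_range *r)
{
	atomic_init(&r->stopped, false);
	r->zap_count = 0;
	r->nr_listeners = 0;
}

/* Register a listener; returns -1 when the fixed-size table is full. */
static int mock_register(struct mock_range *r, mock_notify_fn fn, void *ctx)
{
	assert(fn != NULL);

	if (r->nr_listeners == MOCK_MAX_LISTENERS)
		return -1;
	r->fn[r->nr_listeners] = fn;
	r->ctx[r->nr_listeners] = ctx;
	r->nr_listeners++;
	return 0;
}

/* Stop access: set the flag first, then zap, then tell the listeners. */
static void mock_stop_access(struct mock_range *r)
{
	size_t i;

	/* Assumption: only the first caller proceeds past this point. */
	if (atomic_exchange(&r->stopped, true))
		return;

	r->zap_count++;		/* simulated zap of all user mappings */
	for (i = 0; i < r->nr_listeners; i++)
		r->fn[i](MOCK_NOTIFY_CLOSE, r->ctx[i]);
}

/* Example listener: counts the close notifications it receives. */
static void mock_count_close(unsigned long action, void *ctx)
{
	if (action == MOCK_NOTIFY_CLOSE)
		++*(unsigned int *)ctx;
}
```

Setting the flag before zapping matters in the model for the same reason
it would in the kernel: any lookup that runs after the zap must already
see the range as stopped, otherwise it could hand out a fresh reference
to a range that is going away.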

I think that would remove a lot of the code added here and act a lot
closer to how a dmabuf could someday act.

Also, the nvlink vm_ops will need to be updated for this.

Jason