[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20251118142849.GG17968@ziepe.ca>
Date: Tue, 18 Nov 2025 10:28:49 -0400
From: Jason Gunthorpe <jgg@...pe.ca>
To: "Tian, Kevin" <kevin.tian@...el.com>
Cc: Leon Romanovsky <leon@...nel.org>, Bjorn Helgaas <bhelgaas@...gle.com>,
Logan Gunthorpe <logang@...tatee.com>, Jens Axboe <axboe@...nel.dk>,
Robin Murphy <robin.murphy@....com>, Joerg Roedel <joro@...tes.org>,
Will Deacon <will@...nel.org>,
Marek Szyprowski <m.szyprowski@...sung.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Jonathan Corbet <corbet@....net>,
Sumit Semwal <sumit.semwal@...aro.org>,
Christian König <christian.koenig@....com>,
Kees Cook <kees@...nel.org>,
"Gustavo A. R. Silva" <gustavoars@...nel.org>,
Ankit Agrawal <ankita@...dia.com>,
Yishai Hadas <yishaih@...dia.com>,
Shameer Kolothum <skolothumtho@...dia.com>,
Alex Williamson <alex@...zbot.org>,
Krishnakant Jaju <kjaju@...dia.com>, Matt Ochs <mochs@...dia.com>,
"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-block@...r.kernel.org" <linux-block@...r.kernel.org>,
"iommu@...ts.linux.dev" <iommu@...ts.linux.dev>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
"linux-media@...r.kernel.org" <linux-media@...r.kernel.org>,
"dri-devel@...ts.freedesktop.org" <dri-devel@...ts.freedesktop.org>,
"linaro-mm-sig@...ts.linaro.org" <linaro-mm-sig@...ts.linaro.org>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"linux-hardening@...r.kernel.org" <linux-hardening@...r.kernel.org>,
"Kasireddy, Vivek" <vivek.kasireddy@...el.com>
Subject: Re: [PATCH v8 10/11] vfio/pci: Add dma-buf export support for MMIO
regions
On Tue, Nov 18, 2025 at 07:33:23AM +0000, Tian, Kevin wrote:
> > From: Leon Romanovsky <leon@...nel.org>
> > Sent: Tuesday, November 11, 2025 5:58 PM
> >
> > - if (!new_mem)
> > + if (!new_mem) {
> > vfio_pci_zap_and_down_write_memory_lock(vdev);
> > - else
> > + vfio_pci_dma_buf_move(vdev, true);
> > + } else {
> > down_write(&vdev->memory_lock);
> > + }
>
> shouldn't we notify move before zapping the bars? otherwise there is
> still a small window in between where the exporter already has the
> mapping cleared while the importer still keeps it...
zapping the VMA and moving/revoking the DMABUF are independent
operations that can happen in any order. They effect different kinds
of users. The VMA zap prevents CPU access from userspace, the DMABUF
move prevents DMA access from devices.
The order has to be like the above because vfio_pci_dma_buf_move()
must be called under the memory lock and
vfio_pci_zap_and_down_write_memory_lock() gets the memory lock..
> > +static void vfio_pci_dma_buf_release(struct dma_buf *dmabuf)
> > +{
> > + struct vfio_pci_dma_buf *priv = dmabuf->priv;
> > +
> > + /*
> > + * Either this or vfio_pci_dma_buf_cleanup() will remove from the list.
> > + * The refcount prevents both.
>
> which refcount? I thought it's vdev->memory_lock preventing the race...
Refcount on the dmabuf
> > +int vfio_pci_core_fill_phys_vec(struct dma_buf_phys_vec *phys_vec,
> > + struct vfio_region_dma_range *dma_ranges,
> > + size_t nr_ranges, phys_addr_t start,
> > + phys_addr_t len)
> > +{
> > + phys_addr_t max_addr;
> > + unsigned int i;
> > +
> > + max_addr = start + len;
> > + for (i = 0; i < nr_ranges; i++) {
> > + phys_addr_t end;
> > +
> > + if (!dma_ranges[i].length)
> > + return -EINVAL;
>
> Looks redundant as there is already a check in validate_dmabuf_input().
Agree
> > +int vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32
> > flags,
> > + struct vfio_device_feature_dma_buf __user
> > *arg,
> > + size_t argsz)
> > +{
> > + struct vfio_device_feature_dma_buf get_dma_buf = {};
> > + struct vfio_region_dma_range *dma_ranges;
> > + DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
> > + struct vfio_pci_dma_buf *priv;
> > + size_t length;
> > + int ret;
> > +
> > + if (!vdev->pci_ops || !vdev->pci_ops->get_dmabuf_phys)
> > + return -EOPNOTSUPP;
> > +
> > + ret = vfio_check_feature(flags, argsz, VFIO_DEVICE_FEATURE_GET,
> > + sizeof(get_dma_buf));
> > + if (ret != 1)
> > + return ret;
> > +
> > + if (copy_from_user(&get_dma_buf, arg, sizeof(get_dma_buf)))
> > + return -EFAULT;
> > +
> > + if (!get_dma_buf.nr_ranges || get_dma_buf.flags)
> > + return -EINVAL;
>
> unknown flag bits get -EOPNOTSUPP.
Agree
> > +
> > +void vfio_pci_dma_buf_cleanup(struct vfio_pci_core_device *vdev)
> > +{
> > + struct vfio_pci_dma_buf *priv;
> > + struct vfio_pci_dma_buf *tmp;
> > +
> > + down_write(&vdev->memory_lock);
> > + list_for_each_entry_safe(priv, tmp, &vdev->dmabufs, dmabufs_elm)
> > {
> > + if (!get_file_active(&priv->dmabuf->file))
> > + continue;
> > +
> > + dma_resv_lock(priv->dmabuf->resv, NULL);
> > + list_del_init(&priv->dmabufs_elm);
> > + priv->vdev = NULL;
> > + priv->revoked = true;
> > + dma_buf_move_notify(priv->dmabuf);
> > + dma_resv_unlock(priv->dmabuf->resv);
> > + vfio_device_put_registration(&vdev->vdev);
> > + fput(priv->dmabuf->file);
>
> dma_buf_put(priv->dmabuf), consistent with other places.
Someone else said this, I don't agree, the above got the get via
get_file_active() instead of a dma_buf version..
So we should pair with get_file_active() vs fput().
Christian rejected the idea of adding a dmabuf wrapper for
get_file_active(), oh well.
> > +struct vfio_device_feature_dma_buf {
> > + __u32 region_index;
> > + __u32 open_flags;
> > + __u32 flags;
>
> Usually the 'flags' field is put in the start (following argsz if existing).
Yeah, but doesn't really matter.
Thanks,
Jason
Powered by blists - more mailing lists