[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20251110130534.4d4b17ad.alex@shazbot.org>
Date: Mon, 10 Nov 2025 13:05:34 -0700
From: Alex Williamson <alex@...zbot.org>
To: Leon Romanovsky <leon@...nel.org>
Cc: Bjorn Helgaas <bhelgaas@...gle.com>,
Logan Gunthorpe <logang@...tatee.com>, Jens Axboe <axboe@...nel.dk>,
Robin Murphy <robin.murphy@....com>, Joerg Roedel <joro@...tes.org>,
Will Deacon <will@...nel.org>,
Marek Szyprowski <m.szyprowski@...sung.com>,
Jason Gunthorpe <jgg@...pe.ca>,
Andrew Morton <akpm@...ux-foundation.org>,
Jonathan Corbet <corbet@....net>, Sumit Semwal <sumit.semwal@...aro.org>,
Christian König <christian.koenig@....com>,
Kees Cook <kees@...nel.org>,
"Gustavo A. R. Silva" <gustavoars@...nel.org>,
Ankit Agrawal <ankita@...dia.com>, Yishai Hadas <yishaih@...dia.com>,
Shameer Kolothum <skolothumtho@...dia.com>,
Kevin Tian <kevin.tian@...el.com>, Krishnakant Jaju <kjaju@...dia.com>,
Matt Ochs <mochs@...dia.com>, linux-pci@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-block@...r.kernel.org,
iommu@...ts.linux.dev, linux-mm@...ck.org, linux-doc@...r.kernel.org,
linux-media@...r.kernel.org, dri-devel@...ts.freedesktop.org,
linaro-mm-sig@...ts.linaro.org, kvm@...r.kernel.org,
linux-hardening@...r.kernel.org, Alex Mastro <amastro@...com>,
Nicolin Chen <nicolinc@...dia.com>
Subject: Re: [PATCH v7 11/11] vfio/nvgrace: Support get_dmabuf_phys
On Thu, 6 Nov 2025 16:16:56 +0200
Leon Romanovsky <leon@...nel.org> wrote:
> From: Jason Gunthorpe <jgg@...dia.com>
>
> Call vfio_pci_core_fill_phys_vec() with the proper physical ranges for the
> synthetic BAR 2 and BAR 4 regions. Otherwise use the normal flow based on
> the PCI bar.
>
> This demonstrates a DMABUF that follows the region info report to only
> allow mapping parts of the region that are mmapable. Since the BAR is
> power of two sized and the "CXL" region is just page aligned the there can
> be a padding region at the end that is not mmaped or passed into the
> DMABUF.
>
> The "CXL" ranges that are remapped into BAR 2 and BAR 4 areas are not PCI
> MMIO, they actually run over the CXL-like coherent interconnect and for
> the purposes of DMA behave identically to DRAM. We don't try to model this
> distinction between true PCI BAR memory that takes a real PCI path and the
> "CXL" memory that takes a different path in the p2p framework for now.
>
> Signed-off-by: Jason Gunthorpe <jgg@...dia.com>
> Tested-by: Alex Mastro <amastro@...com>
> Tested-by: Nicolin Chen <nicolinc@...dia.com>
> Signed-off-by: Leon Romanovsky <leonro@...dia.com>
> ---
> drivers/vfio/pci/nvgrace-gpu/main.c | 56 +++++++++++++++++++++++++++++++++++++
> 1 file changed, 56 insertions(+)
>
> diff --git a/drivers/vfio/pci/nvgrace-gpu/main.c b/drivers/vfio/pci/nvgrace-gpu/main.c
> index e346392b72f6..7d7ab2c84018 100644
> --- a/drivers/vfio/pci/nvgrace-gpu/main.c
> +++ b/drivers/vfio/pci/nvgrace-gpu/main.c
> @@ -7,6 +7,7 @@
> #include <linux/vfio_pci_core.h>
> #include <linux/delay.h>
> #include <linux/jiffies.h>
> +#include <linux/pci-p2pdma.h>
>
> /*
> * The device memory usable to the workloads running in the VM is cached
> @@ -683,6 +684,54 @@ nvgrace_gpu_write(struct vfio_device *core_vdev,
> return vfio_pci_core_write(core_vdev, buf, count, ppos);
> }
>
> +static int nvgrace_get_dmabuf_phys(struct vfio_pci_core_device *core_vdev,
> + struct p2pdma_provider **provider,
> + unsigned int region_index,
> + struct dma_buf_phys_vec *phys_vec,
> + struct vfio_region_dma_range *dma_ranges,
> + size_t nr_ranges)
> +{
> + struct nvgrace_gpu_pci_core_device *nvdev = container_of(
> + core_vdev, struct nvgrace_gpu_pci_core_device, core_device);
> + struct pci_dev *pdev = core_vdev->pdev;
> +
> + if (nvdev->resmem.memlength && region_index == RESMEM_REGION_INDEX) {
> + /*
> + * The P2P properties of the non-BAR memory is the same as the
> + * BAR memory, so just use the provider for index 0. Someday
> + * when CXL gets P2P support we could create CXLish providers
> + * for the non-BAR memory.
> + */
> + *provider = pcim_p2pdma_provider(pdev, 0);
> + if (!*provider)
> + return -EINVAL;
> + return vfio_pci_core_fill_phys_vec(phys_vec, dma_ranges,
> + nr_ranges,
> + nvdev->resmem.memphys,
> + nvdev->resmem.memlength);
> + } else if (region_index == USEMEM_REGION_INDEX) {
> + /*
> + * This is actually cachable memory and isn't treated as P2P in
> + * the chip. For now we have no way to push cachable memory
> + * through everything and the Grace HW doesn't care what caching
> + * attribute is programmed into the SMMU. So use BAR 0.
> + */
> + *provider = pcim_p2pdma_provider(pdev, 0);
> + if (!*provider)
> + return -EINVAL;
> + return vfio_pci_core_fill_phys_vec(phys_vec, dma_ranges,
> + nr_ranges,
> + nvdev->usemem.memphys,
> + nvdev->usemem.memlength);
> + }
> + return vfio_pci_core_get_dmabuf_phys(core_vdev, provider, region_index,
> + phys_vec, dma_ranges, nr_ranges);
> +}
Unless my eyes deceive, we could reduce the redundancy a bit:
struct mem_region *mem_region = NULL;
if (nvdev->resmem.memlength && region_index == RESMEM_REGION_INDEX) {
/*
* The P2P properties of the non-BAR memory is the same as the
* BAR memory, so just use the provider for index 0. Someday
* when CXL gets P2P support we could create CXLish providers
* for the non-BAR memory.
*/
mem_region = &nvdev->resmem;
} else if (region_index == USEMEM_REGION_INDEX) {
/*
* This is actually cachable memory and isn't treated as P2P in
* the chip. For now we have no way to push cachable memory
* through everything and the Grace HW doesn't care what caching
* attribute is programmed into the SMMU. So use BAR 0.
*/
mem_region = &nvdev->usemem;
}
if (mem_region) {
*provider = pcim_p2pdma_provider(pdev, 0);
if (!*provider)
return -EINVAL;
return vfio_pci_core_fill_phys_vec(phys_vec, dma_ranges,
nr_ranges,
mem_region->memphys,
mem_region->memlength);
}
return vfio_pci_core_get_dmabuf_phys(core_vdev, provider, region_index,
phys_vec, dma_ranges, nr_ranges);
Thanks,
Alex
Powered by blists - more mailing lists