lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 25 Oct 2023 08:20:19 -0600
From:   Alex Williamson <>
To:     Ankit Agrawal <>
Cc:     "Tian, Kevin" <>,
        Jason Gunthorpe <>,
        Yishai Hadas <>,
        Aniket Agashe <>, Neo Jia <>,
        Kirti Wankhede <>,
        "Tarun Gupta (SW-GPU)" <>,
        Vikram Sethi <>,
        Andy Currid <>,
        Alistair Popple <>,
        John Hubbard <>,
        Dan Williams <>,
        "Anuj Aggarwal (SW-GPU)" <>,
        "" <>,
        "" <>
Subject: Re: [PATCH v12 1/1] vfio/nvgpu: Add vfio pci variant module for
 grace hopper

On Wed, 25 Oct 2023 12:43:24 +0000
Ankit Agrawal <> wrote:

> While the physical BAR is present on the device, it is not being used on the host
> system. The access to the device memory region on the host occur through the
> C2C interconnect link (not the PCIe) and is present for access as a separate memory
> region in the physical address space on the host. The variant driver queries this range
> from the host ACPI DSD tables.

BTW, it's still never been answered why the latest QEMU series dropped
the _DSD support.

> Now, this device memory region on the host is exposed as a device BAR in the VM.
> So the device BAR in the VM is actually mapped to the device memory region in the
> physical address space (and not to the physical BAR) on the host. The config space
> accesses to the device however, are still going to the physical BAR on the host.
> > Does this BAR2 size match the size we're reporting for the region?  Now
> > I'm confused why we need to intercept the BAR2 region info if there's
> > physically a real BAR behind it.  Thanks,  
> Yes, it does match the size being reported through region info. But the region
> info ioctl is still intercepted to provide additional cap to establish the sparse
> mapping. Why we do sparse mapping? The actual device memory size is not
> power-of-2 aligned (a requirement for a BAR). So we roundup to the next
> power-of-2 value and report the size as such. Then we utilize sparse mapping
> to show only the actual size of the device memory as mappable.

Yes, it's clear to me why we need the sparse mapping and why we
intercept the accesses, but the fact that there's an underlying
physical BAR of the same size in config space has been completely
absent in any previous discussions.

In light of that, I don't think we should be independently calculating
the BAR2 region size using roundup_pow_of_two(nvdev->memlength).
Instead we should be using pci_resource_len() of the physical BAR2 to
make it evident that this relationship exists.

The comments throughout should also be updated to reflect this as
currently they're written as if there is no physical BAR2 and we're
making a completely independent decision relative to BAR2 sizing.  A
comment should also be added to nvgrace_gpu_vfio_pci_read/write()
explaining that the physical BAR2 provides the correct behavior
relative to config space accesses.

The probe function should also fail if pci_resource_len() for BAR2 is
not sufficient for the coherent memory region.  Thanks,


Powered by blists - more mailing lists