linux-kernel - Re: [PATCH v3 0/6] Expose GPU memory as coherently CPU accessible

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ZDapsz2QOdjhcBHJ@nvidia.com>
Date:   Wed, 12 Apr 2023 09:53:07 -0300
From:   Jason Gunthorpe <jgg@...dia.com>
To:     Marc Zyngier <maz@...nel.org>
Cc:     ankita@...dia.com, alex.williamson@...hat.com,
        naoya.horiguchi@....com, oliver.upton@...ux.dev,
        aniketa@...dia.com, cjia@...dia.com, kwankhede@...dia.com,
        targupta@...dia.com, vsethi@...dia.com, acurrid@...dia.com,
        apopple@...dia.com, jhubbard@...dia.com, danw@...dia.com,
        kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-arm-kernel@...ts.infradead.org, linux-mm@...ck.org
Subject: Re: [PATCH v3 0/6] Expose GPU memory as coherently CPU accessible

On Wed, Apr 12, 2023 at 01:28:08PM +0100, Marc Zyngier wrote:
> On Wed, 05 Apr 2023 19:01:28 +0100,
> <ankita@...dia.com> wrote:
> > 
> > From: Ankit Agrawal <ankita@...dia.com>
> > 
> > NVIDIA's upcoming Grace Hopper Superchip provides a PCI-like device
> > for the on-chip GPU that is the logical OS representation of the
> > internal propritary cache coherent interconnect.
> > 
> > This representation has a number of limitations compared to a real PCI
> > device, in particular, it does not model the coherent GPU memory
> > aperture as a PCI config space BAR, and PCI doesn't know anything
> > about cacheable memory types.
> > 
> > Provide a VFIO PCI variant driver that adapts the unique PCI
> > representation into a more standard PCI representation facing
> > userspace. The GPU memory aperture is obtained from ACPI, according to
> > the FW specification, and exported to userspace as the VFIO_REGION
> > that covers the first PCI BAR. qemu will naturally generate a PCI
> > device in the VM where the cacheable aperture is reported in BAR1.
> > 
> > Since this memory region is actually cache coherent with the CPU, the
> > VFIO variant driver will mmap it into VMA using a cacheable mapping.
> > 
> > As this is the first time an ARM environment has placed cacheable
> > non-struct page backed memory (eg from remap_pfn_range) into a KVM
> > page table, fix a bug in ARM KVM where it does not copy the cacheable
> > memory attributes from non-struct page backed PTEs to ensure the guest
> > also gets a cacheable mapping.
> 
> This is not a bug, but a conscious design decision. As you pointed out
> above, nothing needed this until now, and a device mapping is the only
> safe thing to do as we know exactly *nothing* about the memory that
> gets mapped.

IMHO, from the mm perspective, the bug is using pfn_is_map_memory() to
determine the cachability or device memory status of a PFN in a
VMA. That is not what that API is for.

The cachability should be determined by the pgprot bits in the VMA.

VM_IO is the flag that says the VMA maps memory with side-effects.

I understand in ARM KVM it is not allowed for the VM and host to have
different cachability, so mis-detecting host cachable memory and
making it forced non-cachable in the VM is not a safe thing to do?

Jason