Message-ID:
<SA1PR12MB7199C4F31576787D0400902EB045A@SA1PR12MB7199.namprd12.prod.outlook.com>
Date: Fri, 27 Jun 2025 05:03:13 +0000
From: Ankit Agrawal <ankita@...dia.com>
To: Jason Gunthorpe <jgg@...dia.com>, "maz@...nel.org" <maz@...nel.org>,
"oliver.upton@...ux.dev" <oliver.upton@...ux.dev>, "joey.gouly@....com"
<joey.gouly@....com>, "suzuki.poulose@....com" <suzuki.poulose@....com>,
"yuzenghui@...wei.com" <yuzenghui@...wei.com>, "catalin.marinas@....com"
<catalin.marinas@....com>, "will@...nel.org" <will@...nel.org>,
"ryan.roberts@....com" <ryan.roberts@....com>, "shahuang@...hat.com"
<shahuang@...hat.com>, "lpieralisi@...nel.org" <lpieralisi@...nel.org>,
"david@...hat.com" <david@...hat.com>, "ddutile@...hat.com"
<ddutile@...hat.com>, "seanjc@...gle.com" <seanjc@...gle.com>
CC: Aniket Agashe <aniketa@...dia.com>, Neo Jia <cjia@...dia.com>, Kirti
Wankhede <kwankhede@...dia.com>, Krishnakant Jaju <kjaju@...dia.com>, "Tarun
Gupta (SW-GPU)" <targupta@...dia.com>, Vikram Sethi <vsethi@...dia.com>, Andy
Currid <acurrid@...dia.com>, Alistair Popple <apopple@...dia.com>, John
Hubbard <jhubbard@...dia.com>, Dan Williams <danw@...dia.com>, Zhi Wang
<zhiw@...dia.com>, Matt Ochs <mochs@...dia.com>, Uday Dhoke
<udhoke@...dia.com>, Dheeraj Nigam <dnigam@...dia.com>,
"alex.williamson@...hat.com" <alex.williamson@...hat.com>,
"sebastianene@...gle.com" <sebastianene@...gle.com>, "coltonlewis@...gle.com"
<coltonlewis@...gle.com>, "kevin.tian@...el.com" <kevin.tian@...el.com>,
"yi.l.liu@...el.com" <yi.l.liu@...el.com>, "ardb@...nel.org"
<ardb@...nel.org>, "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"gshan@...hat.com" <gshan@...hat.com>, "linux-mm@...ck.org"
<linux-mm@...ck.org>, "tabba@...gle.com" <tabba@...gle.com>,
"qperret@...gle.com" <qperret@...gle.com>, "kvmarm@...ts.linux.dev"
<kvmarm@...ts.linux.dev>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>, "maobibo@...ngson.cn"
<maobibo@...ngson.cn>
Subject: Re: [PATCH v9 0/6] KVM: arm64: Map GPU device memory as cacheable
> Grace based platforms such as Grace Hopper/Blackwell Superchips have
> CPU accessible cache coherent GPU memory. The GPU device memory is
> essentially a DDR memory and retains properties such as cacheability,
> unaligned accesses, atomics and handling of executable faults. This
> requires the device memory to be mapped as NORMAL in stage-2.
>
> Today KVM forces the memory to either NORMAL or DEVICE_nGnRE depending
> on whether the memory region is added to the kernel. The KVM code is
> thus restrictive and prevents device memory that is not added to the
> kernel from being marked as cacheable. This series aims to solve this.
>
> A cacheability check is made by consulting the VMA pgprot value. If
> the pgprot mapping type is cacheable, it is considered safe to map the
> memory as cacheable, as the KVM S2 will have the same Normal memory type
> as the VMA has in the S1 and KVM has no additional responsibility
> for safety.
>
> Note that when FWB (Force Write Back) is not enabled, the kernel expects
> to do cache management trivially, flushing the memory by linearly
> converting a kvm_pte to a phys_addr and then to a KVA. Cache management
> thus relies on the memory being kernel mapped. Since the GPU device memory
> is not kernel mapped, bail out when FWB is not supported. Similarly,
> ARM64_HAS_CACHE_DIC
> allows KVM to avoid flushing the icache and turns icache_inval_pou() into
> a NOP. So the cacheable PFNMAP is made contingent on these two hardware
> features.
>
> The ability to safely do the cacheable mapping of PFNMAP is exposed
> through a KVM capability for userspace consumption.
>
> The changes are heavily influenced by the discussions among
> maintainers Marc Zyngier and Oliver Upton, as well as Jason Gunthorpe,
> Catalin Marinas, David Hildenbrand and Sean Christopherson [1]. Many
> thanks for their valuable suggestions.
>
> Applied over next-20250610 and tested on the Grace Blackwell
> platform by booting up a VM, loading the NVIDIA module [2] and running
> nvidia-smi in the VM.
>
> To run CUDA workloads, there is a dependency on the IOMMUFD and the
> Nested Page Table patches being worked on separately by Nicolin Chen
> (nicolinc@...dia.com). NVIDIA has provided git repositories which
> include all the requisite kernel [3] and Qemu [4] patches in case
> one wants to try.
>
> v8 -> v9
> 1. Included MIXEDMAP to also be considered for cacheable mapping.
> (Jason Gunthorpe).
> 2. Minor text nits (Jason Gunthorpe).
A humble reminder for the review.
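
For reviewers skimming the thread, the decision described in the quoted cover letter can be modelled roughly as below. This is only a userspace sketch of the policy, not the code from the series: the struct, the enum and choose_s2_mapping() are hypothetical names made up for illustration, and the real checks live in the KVM stage-2 fault path.

```c
#include <stdbool.h>

/* Hypothetical inputs, named after the conditions in the cover letter. */
struct s2_map_inputs {
	bool page_is_kernel_mapped; /* memory region is added to the kernel */
	bool vma_is_cacheable;      /* VMA pgprot selects a cacheable (Normal) type */
	bool has_fwb;               /* stage-2 Force Write Back is available */
	bool has_cache_dic;         /* ARM64_HAS_CACHE_DIC is set */
};

/* Hypothetical outcomes of the stage-2 memory type decision. */
enum s2_map_result {
	S2_MAP_NORMAL, /* map as Normal (cacheable) in stage-2 */
	S2_MAP_DEVICE, /* map as DEVICE_nGnRE in stage-2 */
	S2_MAP_REJECT, /* refuse the mapping */
};

static enum s2_map_result choose_s2_mapping(const struct s2_map_inputs *in)
{
	/* Kernel-mapped memory keeps today's behaviour: Normal. */
	if (in->page_is_kernel_mapped)
		return S2_MAP_NORMAL;

	/* Not kernel mapped and not cacheable in S1: stay DEVICE_nGnRE. */
	if (!in->vma_is_cacheable)
		return S2_MAP_DEVICE;

	/*
	 * Cacheable in S1 but not kernel mapped: KVM cannot do cache
	 * maintenance through a KVA, so both hardware features are required.
	 */
	if (!in->has_fwb || !in->has_cache_dic)
		return S2_MAP_REJECT;

	return S2_MAP_NORMAL;
}
```

In this model, a cacheable PFNMAP region that is not kernel mapped ends up Normal only when both FWB and CACHE_DIC are present, which is the same condition the series makes the cacheable PFNMAP contingent on.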