Message-ID: <20220407145654.GB3397825@nvidia.com>
Date: Thu, 7 Apr 2022 11:56:54 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: "Tian, Kevin" <kevin.tian@...el.com>
Cc: Alex Williamson <alex.williamson@...hat.com>,
Lu Baolu <baolu.lu@...ux.intel.com>,
Christian Benvenuti <benve@...co.com>,
Cornelia Huck <cohuck@...hat.com>,
David Woodhouse <dwmw2@...radead.org>,
Gerald Schaefer <gerald.schaefer@...ux.ibm.com>,
"iommu@...ts.linux-foundation.org" <iommu@...ts.linux-foundation.org>,
Jason Wang <jasowang@...hat.com>,
Joerg Roedel <joro@...tes.org>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
"linux-arm-msm@...r.kernel.org" <linux-arm-msm@...r.kernel.org>,
"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
"linux-s390@...r.kernel.org" <linux-s390@...r.kernel.org>,
Matthew Rosato <mjrosato@...ux.ibm.com>,
"Michael S. Tsirkin" <mst@...hat.com>,
Nelson Escobar <neescoba@...co.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Rob Clark <robdclark@...il.com>,
Robin Murphy <robin.murphy@....com>,
Suravee Suthikulpanit <suravee.suthikulpanit@....com>,
"virtualization@...ts.linux-foundation.org"
<virtualization@...ts.linux-foundation.org>,
Will Deacon <will@...nel.org>, Christoph Hellwig <hch@....de>
Subject: Re: [PATCH 0/5] Make the iommu driver no-snoop block feature
consistent
On Wed, Apr 06, 2022 at 06:52:04AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@...dia.com>
> > Sent: Wednesday, April 6, 2022 12:16 AM
> >
> > PCIe defines a 'no-snoop' bit in each TLP which is usually implemented
> > by a platform as bypassing elements in the DMA coherent CPU cache
> > hierarchy. A driver can command a device to set this bit on some of its
> > transactions as a micro-optimization.
> >
> > However, the driver is now responsible to synchronize the CPU cache with
> > the DMA that bypassed it. On x86 this is done through the wbinvd
> > instruction, and the i915 GPU driver is the only Linux DMA driver that
> > calls it.
>
> More accurately x86 supports both unprivileged clflush instructions
> to invalidate one cacheline and a privileged wbinvd instruction to
> invalidate the entire cache. Replacing 'this is done' with 'this may
> be done' is clearer.
>
> >
> > The problem is that KVM on x86 will normally disable the wbinvd
> > instruction in the guest and render it a NOP. As the driver running in the
> > guest is not aware the wbinvd doesn't work it may still cause the device
> > to set the no-snoop bit and the platform will bypass the CPU cache.
> > Without a working wbinvd there is no way to re-synchronize the CPU cache
> > and the driver in the VM has data corruption.
> >
> > Thus, we see a general direction on x86 that the IOMMU HW is able to block
> > the no-snoop bit in the TLP. This NOP's the optimization and allows KVM
> > to NOP the wbinvd without causing any data corruption.
> >
> > This control for Intel IOMMU was exposed by using IOMMU_CACHE and
> > IOMMU_CAP_CACHE_COHERENCY, however these two values now have multiple
> > meanings and usages beyond blocking no-snoop and the whole thing has
> > become confused.
>
> Also point out your finding about AMD IOMMU?
Done, thanks
Jason