Message-ID: <20240521102123.7baaf85a.alex.williamson@redhat.com>
Date: Tue, 21 May 2024 10:21:23 -0600
From: Alex Williamson <alex.williamson@...hat.com>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: "Tian, Kevin" <kevin.tian@...el.com>, "Vetter, Daniel"
<daniel.vetter@...el.com>, "Zhao, Yan Y" <yan.y.zhao@...el.com>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "x86@...nel.org" <x86@...nel.org>,
"iommu@...ts.linux.dev" <iommu@...ts.linux.dev>, "pbonzini@...hat.com"
<pbonzini@...hat.com>, "seanjc@...gle.com" <seanjc@...gle.com>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"luto@...nel.org" <luto@...nel.org>, "peterz@...radead.org"
<peterz@...radead.org>, "tglx@...utronix.de" <tglx@...utronix.de>,
"mingo@...hat.com" <mingo@...hat.com>, "bp@...en8.de" <bp@...en8.de>,
"hpa@...or.com" <hpa@...or.com>, "corbet@....net" <corbet@....net>,
"joro@...tes.org" <joro@...tes.org>, "will@...nel.org" <will@...nel.org>,
"robin.murphy@....com" <robin.murphy@....com>, "baolu.lu@...ux.intel.com"
<baolu.lu@...ux.intel.com>, "Liu, Yi L" <yi.l.liu@...el.com>
Subject: Re: [PATCH 4/5] vfio/type1: Flush CPU caches on DMA pages in
non-coherent domains

On Tue, 21 May 2024 13:07:14 -0300
Jason Gunthorpe <jgg@...dia.com> wrote:

> On Mon, May 20, 2024 at 02:52:43AM +0000, Tian, Kevin wrote:
> > +Daniel
> >
> > > From: Jason Gunthorpe <jgg@...dia.com>
> > > Sent: Saturday, May 18, 2024 1:11 AM
> > >
> > > On Thu, May 16, 2024 at 02:31:59PM -0600, Alex Williamson wrote:
> > >
> > > > Yes, exactly. Zero'ing the page would obviously reestablish the
> > > > coherency, but the page could be reallocated without being zero'd and as
> > > > you describe the owner of that page could then get inconsistent
> > > > results.
> > >
> > > I think if we care about the performance of this stuff enough to try
> > > and remove flushes we'd be better off figuring out how to disable no
> > > snoop in PCI config space and trust the device not to use it and avoid
> > > these flushes.
> > >
> > > iommu enforcement is nice, but at least ARM has been assuming that the
> > > PCI config space bit is sufficient.
> > >
> > > Intel/AMD are probably fine here as they will only flush for weird GPU
> > > cases, but I expect ARM is going to be unhappy.
> > >
> >
> > My impression was that Intel GPU is not usable w/o non-coherent DMA,
> > but I don't remember whether it's unusable being a functional breakage
> > or a user experience breakage. e.g. I vaguely recalled that the display
> > engine cannot afford high resolution/high refresh rate using the snoop
> > way so the IOMMU dedicated for the GPU doesn't implement the force
> > snoop capability.
> >
> > Daniel, can you help explain the behavior of Intel GPU in case nosnoop
> > is disabled in the PCI config space?
> >
> > Overall it sounds that we are talking about different requirements. For
> > Intel GPU nosnoop is a must but it is not currently done securely so we
> > need add proper flush to fix it, while for ARM looks you don't have a
> > case which relies on nosnoop so finding a way to disable it is more
> > straightforward?
>
> Intel GPU weirdness should not leak into making other devices
> insecure/slow. If necessary Intel GPU only should get some variant
> override to keep no snoop working.
>
> It would make a lot of good sense if VFIO made the default to disable
> no-snoop via the config space.

We can certainly virtualize the config space no-snoop enable bit, but
I'm not sure what it actually accomplishes. We'd then be relying on
the device to honor the bit and not have any backdoors to twiddle the
bit otherwise (where we know that GPUs often have multiple paths to get
to config space). We'd then also have to ask whether the device
functions correctly with no-snoop disabled. The more secure approach
might be to do these cache flushes for any IOMMU that doesn't maintain
coherency, even for no-snoop transactions. Thanks,

Alex
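
For illustration only, and not taken from the patch series: a minimal
sketch of what "virtualizing the no-snoop enable bit" could look like,
assuming the bit in question is Enable No Snoop (bit 11, 0x0800) of the
PCIe Device Control register, which Linux names
PCI_EXP_DEVCTL_NOSNOOP_EN. The struct and function names here
(vdevctl, vdevctl_write, vdevctl_read, needs_cache_flush) are
hypothetical and do not correspond to existing VFIO code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Enable No Snoop, bit 11 of the PCIe Device Control register
 * (same value as Linux's PCI_EXP_DEVCTL_NOSNOOP_EN). */
#define DEVCTL_NOSNOOP_EN 0x0800u

/* Hypothetical shadow state for one device's Device Control register. */
struct vdevctl {
	uint16_t virt;	/* value the guest/user reads back */
	uint16_t phys;	/* value actually programmed into hardware */
};

/* Guest write: remember what was written, but never let the
 * no-snoop enable bit reach the physical register. */
static void vdevctl_write(struct vdevctl *d, uint16_t val)
{
	d->virt = val;
	d->phys = (uint16_t)(val & ~DEVCTL_NOSNOOP_EN);
}

/* Guest read: return the virtual view, so the bit appears set
 * if the guest set it. */
static uint16_t vdevctl_read(const struct vdevctl *d)
{
	return d->virt;
}

/* The more conservative policy from the reply above: the flush
 * decision depends only on whether the IOMMU enforces snooping,
 * not on the config space bit, since the device may not honor it. */
static bool needs_cache_flush(bool iommu_enforces_snoop)
{
	return !iommu_enforces_snoop;
}
```

Note that the masking in vdevctl_write() only helps if the device
honors the bit and has no other path to config space, which is exactly
the trust problem described above; needs_cache_flush() deliberately
ignores the bit for that reason.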