Message-ID: <6892562356e53_55f0910010@dwillia2-xfh.jf.intel.com.notmuch>
Date: Tue, 5 Aug 2025 12:06:11 -0700
From: <dan.j.williams@...el.com>
To: Jason Gunthorpe <jgg@...pe.ca>, <dan.j.williams@...el.com>
CC: Aneesh Kumar K.V <aneesh.kumar@...nel.org>, <linux-coco@...ts.linux.dev>,
<kvmarm@...ts.linux.dev>, <linux-pci@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <aik@....com>, <lukas@...ner.de>, "Samuel
Ortiz" <sameo@...osinc.com>, Xu Yilun <yilun.xu@...ux.intel.com>, "Suzuki K
Poulose" <Suzuki.Poulose@....com>, Steven Price <steven.price@....com>,
Catalin Marinas <catalin.marinas@....com>, Marc Zyngier <maz@...nel.org>,
Will Deacon <will@...nel.org>, Oliver Upton <oliver.upton@...ux.dev>
Subject: Re: [RFC PATCH v1 00/38] ARM CCA Device Assignment support
Jason Gunthorpe wrote:
> On Tue, Aug 05, 2025 at 11:27:36AM -0700, dan.j.williams@...el.com wrote:
> > > > Clearing any of the following bits causes the TDI hosted
> > > > by the Function to transition to ERROR:
> > > >
> > > > • Memory Space Enable
> > > > • Bus Master Enable
> > >
> > > Oh that's nice, yeah!
> >
> > That is useful, but an unmodified PCI driver is going to make separate
> > calls to pci_set_master() and pci_enable_device() so it should still be
> > the case that those need to be trapped out of the concern that
> > writing back zero for a read-modify-write also trips the error state on
> > some device that fails the Robustness Principle.
>
> I hope we don't RMW BME and MSE in some weird way like that :(
Yeah, I would like to say, "device, you get to keep the pieces if you
transition to ERROR state on re-writing an already-zeroed bit."
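
To make the distinction concrete, here is a minimal sketch, not a proposed
implementation. It separates a write that really clears Memory Space Enable
or Bus Master Enable (1 -> 0) from a read-modify-write that merely writes
back a bit that was already zero. The two bit values match the PCI Command
register layout; the helper names and the "keep the bits set while in RUN"
trap policy are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* PCI Command register bits (same values as PCI_COMMAND_MEMORY and
 * PCI_COMMAND_MASTER in Linux's pci_regs.h). */
#define CMD_MEMORY 0x2u /* Memory Space Enable */
#define CMD_MASTER 0x4u /* Bus Master Enable */

/* Hypothetical helper: true only if the write takes MSE or BME from 1 to 0.
 * Writing 0 to a bit that is already 0 does not count as clearing. */
bool cmd_write_clears_enable(uint16_t old_cmd, uint16_t new_cmd)
{
    uint16_t guarded = CMD_MEMORY | CMD_MASTER;

    return (old_cmd & ~new_cmd & guarded) != 0;
}

/* Hypothetical trap policy: while the TDI is in RUN, a write that would
 * clear MSE/BME is forwarded with those bits kept as they were, i.e. it
 * becomes a NOP for the two guarded bits. Other writes pass through. */
uint16_t filter_cmd_write(uint16_t old_cmd, uint16_t new_cmd, bool tdi_in_run)
{
    if (tdi_in_run && cmd_write_clears_enable(old_cmd, new_cmd))
        return new_cmd | (old_cmd & (CMD_MEMORY | CMD_MASTER));
    return new_cmd;
}

/* Usage: a driver that sets MSE and BME in two separate RMW cycles. */
void example_usage(void)
{
    /* First RMW sets MSE and writes back BME=0, which was already 0. */
    assert(!cmd_write_clears_enable(0x0, CMD_MEMORY));
    /* Second RMW sets BME. */
    assert(!cmd_write_clears_enable(CMD_MEMORY, CMD_MEMORY | CMD_MASTER));
    /* A later write that drops BME is a real clear and gets filtered. */
    assert(filter_cmd_write(CMD_MEMORY | CMD_MASTER, CMD_MEMORY, true) ==
           (CMD_MEMORY | CMD_MASTER));
}
```

Only a device that enters ERROR on the first case in example_usage() would
be the one that "gets to keep the pieces".
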
> > > Here is where I feel the VMM should be trapping this and NOPing it, or
> > > failing that the guest PCI Core should NOP it.
> >
> > At this point (vfio shutdown path) the VMM is committed to stopping guest
> > operations with the device. So it is ok not to NOP in this specific path,
> > right?
>
> What I said in my other mail was that the T=1 state should have nothing
> to do with driver binding.
Guest driver unbind, agree.
> So unbinding vfio should leave the device in the RUN state just fine.
Perhaps my vfio inexperience is showing, but at the point where the VMM
is unbinding vfio it is committed to destroying the guest's assigned
device context, no? So should that not be the point where continuing to
maintain the RUN state ends?
> > > With the ideal version being the TSM and VMM would be able to block
> > > the iommu as a functional stand in for BME.
> >
> > The TSM block for BME is the LOCKED or ERROR state. That would be in
> > conflict with the proposal that the device stays in the RUN state on
> > guest driver unbind.
>
> This is a different thing. Leaving RUN says the OS (especially
> userspace) does not trust the device.
>
> Disabling DMA, on explicit trusted request from the cVM, is entirely
> fine to do inside the T=1 state. PCI made it so the only way to do
> this is with the IOMMU, oh well, so be it.
>
> > I feel like either the device stays in RUN state and BME leaks, or the
> > device is returned to LOCKED on driver unbind.
>
> Stay in RUN is my vote. I can't really defend the other choice from a
> linux driver model perspective.
>
> > Otherwise a functional stand-in for BME that also keeps the device
> > in RUN state feels like a TSM feature request for a "RUN but
> > BLOCKED" state.
>
> Yes, and probably not necessary, more of a defence-in-depth against
> bugs kind of request. For Linux we would like it if the device can be
> in RUN and have DMA blocked off during all times when no driver is
> attached.
Ok, defense in depth, but in the meantime rely on unbound driver == DMA
unmapped and device should be quiescent. Combined with the fact that
userspace PCI drivers should be disabled in cVMs, that should mean the
guest can expect that an unbound TDI in the RUN state will remain quiet.
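
As a rough model of that expectation (purely illustrative; the struct, the
enum and the function names below are hypothetical and are not kernel or
TSM interfaces), driver binding and the TDI state can be tracked as
independent fields, so that an unbind drops the DMA mappings without
leaving RUN:

```c
#include <assert.h>
#include <stdbool.h>

/* TDI states as named in this thread (names shortened, hypothetical enum). */
enum tdi_state { TDI_UNLOCKED, TDI_LOCKED, TDI_RUN, TDI_ERROR };

/* Hypothetical guest-side view of one assigned TDI. */
struct tdi_model {
    enum tdi_state state; /* changed only by explicit trust decisions */
    bool driver_bound;    /* a guest kernel driver is attached */
    bool dma_mapped;      /* DMA mappings exist for the device */
};

/* Binding a driver sets up DMA mappings; the TDI state is untouched. */
void model_driver_bind(struct tdi_model *t)
{
    t->driver_bound = true;
    t->dma_mapped = true;
}

/* Unbinding tears down DMA mappings; the TDI deliberately stays in
 * whatever state it was in (RUN stays RUN). */
void model_driver_unbind(struct tdi_model *t)
{
    t->driver_bound = false;
    t->dma_mapped = false;
}

/* The interim expectation: an unbound TDI in RUN has nothing mapped
 * and should therefore stay quiet. */
bool model_expect_quiet(const struct tdi_model *t)
{
    return t->state == TDI_RUN && !t->driver_bound && !t->dma_mapped;
}
```

A "RUN but BLOCKED" state in the TSM would turn the last check from an
expectation into something enforced.
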