Message-ID: <68924d18a68d4_55f091004d@dwillia2-xfh.jf.intel.com.notmuch>
Date: Tue, 5 Aug 2025 11:27:36 -0700
From: <dan.j.williams@...el.com>
To: Jason Gunthorpe <jgg@...pe.ca>, Aneesh Kumar K.V <aneesh.kumar@...nel.org>
CC: <dan.j.williams@...el.com>, <linux-coco@...ts.linux.dev>,
<kvmarm@...ts.linux.dev>, <linux-pci@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <aik@....com>, <lukas@...ner.de>, "Samuel
Ortiz" <sameo@...osinc.com>, Xu Yilun <yilun.xu@...ux.intel.com>, "Suzuki K
Poulose" <Suzuki.Poulose@....com>, Steven Price <steven.price@....com>,
Catalin Marinas <catalin.marinas@....com>, Marc Zyngier <maz@...nel.org>,
Will Deacon <will@...nel.org>, Oliver Upton <oliver.upton@...ux.dev>
Subject: Re: [RFC PATCH v1 00/38] ARM CCA Device Assignment support
Jason Gunthorpe wrote:
> On Tue, Aug 05, 2025 at 10:37:01AM +0530, Aneesh Kumar K.V wrote:
> > > To me it is an unfortunate PCI specification wrinkle that writing to the
> > > command register drops the device from RUN to ERROR. So you can LOCK
> > > without setting BME, but then no DMA.
> >
> > This is only w.r.t clearing BME isn't ?
> >
> > According to section 11.2.6 DSM Tracking and Handling of Locked TDI Configurations
> >
> > Clearing any of the following bits causes the TDI hosted
> > by the Function to transition to ERROR:
> >
> > • Memory Space Enable
> > • Bus Master Enable
>
> Oh that's nice, yeah!
That is useful, but an unmodified PCI driver is going to make separate
calls to pci_set_master() and pci_enable_device(), so those writes still
need to be trapped out of concern that writing back zero in a
read-modify-write also trips the error state on some device that fails
the Robustness Principle.

I guess we could wait to solve that problem until we encounter the
first device that trips ERROR when writing zero to an already-zeroed
bit.
> > Which implies the flow described in the cover-letter where driver enable the BME works?
> > However clearing BME may be problematic? I did have a FIXME!!/comment in [1]
> >
> > vfio_pci_core_close_device():
> >
> > #if 0
> > /*
> > * destroy vdevice which involves tsm unbind before we disable pci disable
> > * A MSE/BME clear will transition the device to error state.
> > */
> > if (core_vdev->iommufd_device)
> > iommufd_device_tombstone_vdevice(core_vdev->iommufd_device);
> > #endif
> >
> > vfio_pci_core_disable(vdev);
>
> Here is where I feel the VMM should be trapping this and NOPing it, or
> failing that the guest PCI Core should NOP it.
At this point (the vfio shutdown path) the VMM is committed to stopping
guest operations with the device. So it is ok not to NOP in this
specific path, right?
> With the ideal version being the TSM and VMM would be able to block
> the iommu as a functional stand in for BME.
The TSM block for BME is the LOCKED or ERROR state. That would be in
conflict with the proposal that the device stays in the RUN state on
guest driver unbind.
I feel like either the device stays in RUN state and BME leaks, or the
device is returned to LOCKED on driver unbind. Otherwise a functional
stand-in for BME that also keeps the device in RUN state feels like a
TSM feature request for a "RUN but BLOCKED" state.