Message-ID: <20240108135132.GI50406@nvidia.com>
Date: Mon, 8 Jan 2024 09:51:32 -0400
From: Jason Gunthorpe <jgg@...dia.com>
To: "Tian, Kevin" <kevin.tian@...el.com>
Cc: "Liu, Yi L" <yi.l.liu@...el.com>, "joro@...tes.org" <joro@...tes.org>,
"alex.williamson@...hat.com" <alex.williamson@...hat.com>,
"robin.murphy@....com" <robin.murphy@....com>,
"baolu.lu@...ux.intel.com" <baolu.lu@...ux.intel.com>,
"cohuck@...hat.com" <cohuck@...hat.com>,
"eric.auger@...hat.com" <eric.auger@...hat.com>,
"nicolinc@...dia.com" <nicolinc@...dia.com>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"mjrosato@...ux.ibm.com" <mjrosato@...ux.ibm.com>,
"chao.p.peng@...ux.intel.com" <chao.p.peng@...ux.intel.com>,
"yi.y.sun@...ux.intel.com" <yi.y.sun@...ux.intel.com>,
"peterx@...hat.com" <peterx@...hat.com>,
"jasowang@...hat.com" <jasowang@...hat.com>,
"shameerali.kolothum.thodi@...wei.com" <shameerali.kolothum.thodi@...wei.com>,
"lulu@...hat.com" <lulu@...hat.com>,
"suravee.suthikulpanit@....com" <suravee.suthikulpanit@....com>,
"iommu@...ts.linux.dev" <iommu@...ts.linux.dev>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-kselftest@...r.kernel.org" <linux-kselftest@...r.kernel.org>,
"Duan, Zhenzhong" <zhenzhong.duan@...el.com>,
"joao.m.martins@...cle.com" <joao.m.martins@...cle.com>,
"Zeng, Xin" <xin.zeng@...el.com>,
"Zhao, Yan Y" <yan.y.zhao@...el.com>
Subject: Re: [PATCH v7 1/3] iommufd: Add data structure for Intel VT-d
stage-1 cache invalidation
On Mon, Jan 08, 2024 at 04:07:12AM +0000, Tian, Kevin wrote:
> > > In concept w/o vSVA it's still possible to assign sibling vdev's to
> > > a same VM as each vdev is allocated with a unique pasid to mark vRID
> > > so can be differentiated from each other in the fault/error path.
> >
> > I thought the SIOV plan was that each "vdev" ie vpci function would
> > get a slice of the pRID's PASID space statically selected at creation?
> >
> > So SVA/etc doesn't matter, you reliably get a disjoint set of pRID &
> > pPASID into each VM.
> >
> > From that view you can't identify the iommufd dev_id without knowing
> > both the pRID and pPASID which will disambiguate the different SIOV
> > iommufd dev_id instances sharing a rid.
>
> true when assigning those instances to different VMs.
>
> Here I was talking about assigning them to a same VM being a problem.
> with rid sharing plus same ENQCMD pPASID potentially used on both
> instances there'd be ambiguity in vSVA e.g. iopf to identify dev_id.

Oh, you imagine sharing the pPASID if things have the same translation?

I guess I can see why, but given where things are overall I'd say just
don't do that.

Indeed we can't do that because it makes the vRID unknowable.

(again I continue to think that vt-d cache design is messed up, using
the PASID for the cache tag is a *terrible* design, and causes exactly
these kinds of problems)

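
As a rough illustration of the ambiguity discussed above, here is a minimal sketch of a lookup keyed on (pRID, pPASID). The struct, the function name and the ID values are all hypothetical and are not taken from iommufd or the VT-d driver; the only point is that the lookup is unambiguous while sibling instances have disjoint pPASIDs and stops being so once two instances behind the same pRID share one.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: one SIOV vdev instance as the host sees it. */
struct vdev_entry {
	uint16_t prid;   /* physical requester ID, shared by sibling vdevs */
	uint32_t ppasid; /* physical PASID used by this instance */
	uint32_t dev_id; /* illustrative device ID, not a real iommufd value */
};

/*
 * Count the entries matching (prid, ppasid) and store the dev_id of the
 * first match in *dev_id. A return value of 1 means the pair identifies
 * exactly one instance; a value above 1 means the fault cannot be
 * attributed to a single dev_id.
 */
static size_t lookup_dev_id(const struct vdev_entry *tbl, size_t n,
			    uint16_t prid, uint32_t ppasid, uint32_t *dev_id)
{
	size_t matches = 0;

	for (size_t i = 0; i < n; i++) {
		if (tbl[i].prid == prid && tbl[i].ppasid == ppasid) {
			if (matches == 0)
				*dev_id = tbl[i].dev_id;
			matches++;
		}
	}
	return matches;
}
```

With two siblings on the same pRID using different pPASIDs the function returns 1; if both use the same pPASID it returns 2, which is the case being argued against here.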
> for errors related to descriptor fetch the driver can tell the command
> by looking at the head pointer of the invalidation queue.
>
> command completion is indirectly detected by inserting a wait descriptor
> as fence. completion timeout error is reported in an error register. but
> this register doesn't record pasid, nor does the command location. if there
> are multiple pending devtlb invalidation commands upon timeout
> error the spec suggests the driver to treat all of them timeout as the
> register can only record one rid.
Makes sense, or at least you have to re-issue them one by one.
> this is kind of moot. If the driver submits only one command (plus wait)
> at a time it doesn't need hw's help to identify the timeout command.
> If the driver batches invalidation commands it must treat all of them
> as timed out if a timeout error is reported.
Yes
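
The two policies above (treat the whole batch as timed out, or re-issue the commands one by one) can be sketched roughly as follows. This is a simulation under stated assumptions, not driver code: `struct inv_cmd`, `submit_fn`, `submit_batch()` and the `fake_hw` model are all made-up names, and the "hardware" here just reports a timeout whenever a batch contains one particular RID, without saying which command it was.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical devTLB invalidation command. */
struct inv_cmd {
	uint16_t rid;
	bool timed_out;
};

/*
 * Hypothetical stand-in for "queue these commands plus a wait descriptor".
 * Returns true if a completion timeout was reported for the batch; it does
 * not say which command timed out.
 */
typedef bool (*submit_fn)(const struct inv_cmd *cmds, size_t n, void *ctx);

/* Simulated hardware: any command targeting bad_rid times out. */
struct fake_hw {
	uint16_t bad_rid;
};

static bool fake_submit(const struct inv_cmd *cmds, size_t n, void *ctx)
{
	const struct fake_hw *hw = ctx;

	for (size_t i = 0; i < n; i++)
		if (cmds[i].rid == hw->bad_rid)
			return true;
	return false;
}

/*
 * Submit a batch and return how many commands are considered timed out.
 * With reissue == false every command in a failed batch is marked; with
 * reissue == true each command is resubmitted alone to find the culprit.
 */
static size_t submit_batch(struct inv_cmd *cmds, size_t n, submit_fn submit,
			   void *ctx, bool reissue)
{
	size_t failed = 0;

	for (size_t i = 0; i < n; i++)
		cmds[i].timed_out = false;

	if (!submit(cmds, n, ctx))
		return 0;

	for (size_t i = 0; i < n; i++) {
		/* A single command, or no re-issue: the batch result is all we have. */
		if (n == 1 || !reissue || submit(&cmds[i], 1, ctx)) {
			cmds[i].timed_out = true;
			failed++;
		}
	}
	return failed;
}
```

For a three-command batch where only one RID is unresponsive, the first policy reports all three as timed out while the second narrows it down to one, at the cost of extra submissions.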
> from this angle whether to record pasid doesn't really matter.
At least for error handling..
Jason