Message-ID: <20231213124055.GR3014157@nvidia.com>
Date: Wed, 13 Dec 2023 08:40:55 -0400
From: Jason Gunthorpe <jgg@...dia.com>
To: Nicolin Chen <nicolinc@...dia.com>
Cc: Yi Liu <yi.l.liu@...el.com>,
"Giani, Dhaval" <Dhaval.Giani@....com>,
Vasant Hegde <vasant.hegde@....com>,
Suravee Suthikulpanit <suravee.suthikulpanit@....com>,
joro@...tes.org, alex.williamson@...hat.com, kevin.tian@...el.com,
robin.murphy@....com, baolu.lu@...ux.intel.com, cohuck@...hat.com,
eric.auger@...hat.com, kvm@...r.kernel.org, mjrosato@...ux.ibm.com,
chao.p.peng@...ux.intel.com, yi.y.sun@...ux.intel.com,
peterx@...hat.com, jasowang@...hat.com,
shameerali.kolothum.thodi@...wei.com, lulu@...hat.com,
iommu@...ts.linux.dev, linux-kernel@...r.kernel.org,
linux-kselftest@...r.kernel.org, zhenzhong.duan@...el.com,
joao.m.martins@...cle.com, xin.zeng@...el.com, yan.y.zhao@...el.com
Subject: Re: [PATCH v6 0/6] iommufd: Add nesting infrastructure (part 2/2)
On Tue, Dec 12, 2023 at 12:05:41PM -0800, Nicolin Chen wrote:
> > > // iommufd_private.h
> > >
> > > enum iommufd_object_type {
> > > 	...
> > > +	IOMMUFD_OBJ_VIOMMU,
> > > 	...
> > > };
> > >
> > > +struct iommufd_viommu {
> > > +	struct iommufd_object obj;
> > > +	struct iommufd_hwpt_paging *hwpt;
> > > +	struct xarray devices;
> > > +};
> > >
> > > struct iommufd_hwpt_paging {
> > > 	...
> > > +	struct list_head viommu_list;
> > > 	...
> > > };
> >
> > I'd probably first try to go backwards and link the hwpt to the
> > viommu.
>
> I think a VM should have only one hwpt_paging object but one or
> more viommu objects, so we could do only viommu->hwpt_paging
> and hwpt_paging->viommu_list. How would we go backwards?
That is probably how things would work, but I don't know if it makes
sense to enforce it in the kernel logic.

Point the S2 at a list of the viommu objects it is linked to.
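Roughly something like this (just a sketch; the field names are made
up and not necessarily what the series would use):

struct iommufd_viommu {
	struct iommufd_object obj;
	struct iommufd_hwpt_paging *hwpt;	/* the S2 it is built on */
	struct list_head hwpt_item;		/* entry in hwpt->viommu_list */
	struct xarray devices;			/* vSID -> device/pSID */
};

/* at viommu allocation, after taking a reference on the S2: */
list_add_tail(&viommu->hwpt_item, &hwpt_paging->viommu_list);

That way the S2 can walk every viommu built on top of it without the
kernel having to assume there is only one.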
> > > struct iommufd_group {
> > > 	...
> > > +	struct iommufd_viommu *viommu; // should we attach to viommu instead of hwpt?
> > > 	...
> > > };
> >
> > No. Attach is a statement of translation so you still attach to the HWPT.
>
> OK. It's probably not necessary since we know which piommu the
> device is behind. And we only need to link viommu and piommu,
> right?
Yes
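i.e. the viommu would additionally just record which physical IOMMU it
was created against, something like (the field name is illustrative,
assuming the physical instance is represented by struct iommu_device):

struct iommufd_viommu {
	...
	struct iommu_device *iommu_dev;	/* the piommu backing this viommu */
	...
};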
> > In the second version maybe we have the xarray, or maybe we just push
> > the xarray to the eventual viommu series.
>
> I think I still don't get the purpose of the xarray here.
> It was needed previously because a cache invalidate per hwpt
> didn't know which device it was for. Now IOMMUFD_DEV_INVALIDATE knows.
>
> Maybe it's related to that narrative "logically we could have
> multiple mappings per iommufd" that you mentioned above. Mind
> elaborating a bit?
>
> In my mind, a viommu is allocated by the VMM per piommu, by detecting
> the piommu_id via hw_info. In that case, a viommu can only have
> one unique device list. If IOMMUFD_DEV_INVALIDATE passes in the
> dev_id, we don't really need a mapping of vRID to pRID in a multi-
> viommu case either? In other words, the VMM already has a mapping
> from vRID to dev_id, so it could just call the DEV_INVALIDATE ioctl
> directly?
The xarray exists to optimize the invalidation flow.
For SW you can imagine issuing an invalidation against the viommu
itself, and *all* commands, be they ASID or ATC invalidations, could be
processed in one shot. The xarray allows converting the vSID to the
pSID to process the ATC invalidations, and the viommu object forces a
single VMID to handle them. Whether we want to do this, I don't know.

For HW, the xarray holds the vSID to pSID mapping that must be
programmed into the HW that operates the dedicated invalidation queue.
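Just to illustrate the SW idea (the entry struct, field and helper
names below are all made up):

struct viommu_vdev {
	u32 psid;			/* physical stream ID */
	/* ... */
};

/*
 * Convert a guest-visible stream ID to the physical one via the
 * per-viommu xarray, e.g. while processing an ATC invalidation.
 */
static int viommu_vsid_to_psid(struct iommufd_viommu *viommu, u32 vsid,
			       u32 *psid)
{
	struct viommu_vdev *vdev = xa_load(&viommu->devices, vsid);

	if (!vdev)
		return -ENOENT;
	*psid = vdev->psid;
	return 0;
}

The entries would presumably be populated when the VMM adds a device
to the viommu along with its vSID.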
Jason