[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220317112304.61cf8953@jacob-builder>
Date: Thu, 17 Mar 2022 11:23:04 -0700
From: Jacob Pan <jacob.jun.pan@...ux.intel.com>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: iommu@...ts.linux-foundation.org,
LKML <linux-kernel@...r.kernel.org>,
Joerg Roedel <joro@...tes.org>,
Christoph Hellwig <hch@...radead.org>,
Lu Baolu <baolu.lu@...ux.intel.com>,
Jean-Philippe Brucker <jean-philippe@...aro.com>,
Jacob Pan <jacob.jun.pan@...el.com>,
Raj Ashok <ashok.raj@...el.com>,
"Kumar, Sanjay K" <sanjay.k.kumar@...el.com>,
Dave Jiang <dave.jiang@...el.com>,
Tony Luck <tony.luck@...el.com>,
"Zanussi, Tom" <tom.zanussi@...el.com>,
Dan Williams <dan.j.williams@...el.com>,
"Tian, Kevin" <kevin.tian@...el.com>, Yi Liu <yi.l.liu@...el.com>,
jacob.jun.pan@...ux.intel.com
Subject: Re: [PATCH v2 3/8] iommu/vt-d: Implement device_pasid domain attach
ops
Hi Jason,
On Thu, 17 Mar 2022 10:23:08 -0300, Jason Gunthorpe <jgg@...dia.com> wrote:
> On Wed, Mar 16, 2022 at 05:49:59PM -0700, Jacob Pan wrote:
>
> > > I would expect real applications will try to use the same PASID for
> > > the same IOVA map to optimize IOTLB caching.
> > >
> > > Is there a use case for that I'm missing?
> > >
> > Yes. it would be more efficient for PASID selective domain TLB flush.
> > But on VT-d IOTLB is also tagged by domain ID, domain flush can use DID
> > if there are many PASIDs. Not sure about other archs. Agree that
> > optimizing PASIDs for TLB flush should be a common goal.
>
> If you sort the list of (device, pasid) tuples can something like VT-d
> collapse all the same devices and just issue one DID invalidation:
>
> list_for_each()
> if (itm->device == last_invalidated_device)
> continue;
> invalidate(itm->device);
> last_invalidated_device = itm->device;
>
I assume this is for devTLB since IOMMU's IOTLB flush doesn't care about
device. I think it works for device-wide invalidation.
> While something that was per-pasid could issue per-pasid invalidations
> from the same data structure?
>
yes. we can use the same data structure for PASID selective devTLB but
list_for_each()
if (itm->pasid == pasid_to_be_invalidated;
invalidate(itm->device, pasid);
For IOMMU's IOTLB, we also have two granularities
1. domain-wide
2. pasid-wide
For #1, we just use DID to invalidate w/o traverse the list.
For #2, we just need to sanity check the pasid is indeed attached by going
through the list.
Seems to work!
> > > Otherwise your explanation is what I was imagining as well.
> > >
> > > I would also think about expanding your struct so that the device
> > > driver can track per-device per-domain data as well, that seems
> > > useful IIRC?
> > >
> > yes, at least both VT-d and FSL drivers have struct device_domain_info.
> >
> > > ie put a 'sizeof_iommu_dev_pasid_data' in the domain->ops and
> > > allocate that much memory so the driver can use the trailer space for
> > > its own purpose.
> > >
> > That sounds great to have but not sure i understood correctly how to do
> > it.
> >
> > Do you mean for each vendor driver's struct device_domain_info (or
> > equivalent), we carve out sizeof_iommu_dev_pasid_data as common data,
> > then the rest of the space is vendor specific? I don't feel I get your
> > point, could you elaborate?
>
> I've seen it done two ways..
>
> With a flex array:
>
> struct iommu_device_data {
> struct list_head list
> ioasid_t pasid;
> struct device *dev;
> [..]
> u64 device_data[];
> }
>
> struct intel_device_data {
> [..]
> }
> struct iommu_device_data *dev_data;
> struct intel_device_data *intel_data = (void *)&dev_data->device_data;
>
> Or with container of:
>
> struct iommu_device_data {
> struct list_head list
> ioasid_t pasid;
> struct device *dev;
> [..]
> }
>
> struct intel_device_data {
> struct iommu_device_data iommu; // must be first
> [...]
> }
> struct iommu_device_data *dev_data;
> struct intel_device_data *intel_data = container_of(dev_data, struct
> intel_device_data, iommu);
>
> In either case you'd add a size_t to the domain->ops specifying how
> much extra memory for the core code to allocate when it manages the
> datastructure. The first case allocates based on struct_size, the
> second case allocates what is specified.
>
> Look at INIT_RDMA_OBJ_SIZE() for some more complicated example how the
> latter can work. I like it because it has the nice container_of
> pattern in drivers, the downside is it requires a BUILD_BUG_ON to
> check that the driver ordered its struct properly.
>
> The point is to consolidate all the code for allocating and walking
> the data structure without having to force two allocations and extra
> pointer indirections on performance paths.
Make sense, very neat. Vendor driver would not need to do allocations. Let
me give that a try. Seems #2 has better type safety.
Thank you so much for the thorough explanation!
Jacob
Powered by blists - more mailing lists