Message-ID: <20220317132308.GV11336@nvidia.com>
Date:   Thu, 17 Mar 2022 10:23:08 -0300
From:   Jason Gunthorpe <jgg@...dia.com>
To:     Jacob Pan <jacob.jun.pan@...ux.intel.com>
Cc:     iommu@...ts.linux-foundation.org,
        LKML <linux-kernel@...r.kernel.org>,
        Joerg Roedel <joro@...tes.org>,
        Christoph Hellwig <hch@...radead.org>,
        Lu Baolu <baolu.lu@...ux.intel.com>,
        Jean-Philippe Brucker <jean-philippe@...aro.com>,
        Jacob Pan <jacob.jun.pan@...el.com>,
        Raj Ashok <ashok.raj@...el.com>,
        "Kumar, Sanjay K" <sanjay.k.kumar@...el.com>,
        Dave Jiang <dave.jiang@...el.com>,
        Tony Luck <tony.luck@...el.com>,
        "Zanussi, Tom" <tom.zanussi@...el.com>,
        Dan Williams <dan.j.williams@...el.com>,
        "Tian, Kevin" <kevin.tian@...el.com>, Yi Liu <yi.l.liu@...el.com>
Subject: Re: [PATCH v2 3/8] iommu/vt-d: Implement device_pasid domain attach
 ops

On Wed, Mar 16, 2022 at 05:49:59PM -0700, Jacob Pan wrote:

> > I would expect real applications will try to use the same PASID for
> > the same IOVA map to optimize IOTLB caching.
> > 
> > Is there a use case for that I'm missing?
> > 
> Yes, it would be more efficient for PASID-selective domain TLB flush. But
> on VT-d the IOTLB is also tagged by domain ID, so a domain flush can use
> the DID if there are many PASIDs. Not sure about other archs. Agree that
> optimizing PASIDs for TLB flush should be a common goal.

If you sort the list of (device, pasid) tuples, could something like VT-d
collapse all the identical devices and issue just one DID invalidation:

 struct device *last_invalidated_device = NULL;

 /* sorted by (device, pasid); head and member names illustrative */
 list_for_each_entry(itm, &domain->devices, list) {
     if (itm->device == last_invalidated_device)
         continue;
     invalidate(itm->device);
     last_invalidated_device = itm->device;
 }

While a per-pasid implementation could issue per-pasid invalidations
from the same data structure?
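
For instance, the per-pasid flavour could be (sketch only, and
invalidate_pasid() is a made-up helper here):

 /* same sorted list, one invalidation per (device, pasid) entry */
 list_for_each_entry(itm, &domain->devices, list)
     invalidate_pasid(itm->device, itm->pasid);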

> > Otherwise your explanation is what I was imagining as well.
> > 
> > I would also think about expanding your struct so that the device
> > driver can track per-device, per-domain data as well; that seems
> > useful, IIRC?
> > 
> Yes, at least the VT-d and FSL drivers both have struct device_domain_info.
> 
> > ie put a 'sizeof_iommu_dev_pasid_data' in the domain->ops and
> > allocate that much memory so the driver can use the trailer space for
> > its own purpose.
> > 
> That sounds great to have, but I'm not sure I understood correctly how to do it.
> 
> Do you mean that for each vendor driver's struct device_domain_info (or
> equivalent) we carve out sizeof_iommu_dev_pasid_data as common data, and
> the rest of the space is vendor specific? I don't feel I get your point;
> could you elaborate?

I've seen it done two ways.

With a flex array:

 struct iommu_device_data {
     struct list_head list;
     ioasid_t pasid;
     struct device *dev;
     [..]
     u64 device_data[];
 };

 struct intel_device_data {
      [..]
 };

 struct iommu_device_data *dev_data;
 struct intel_device_data *intel_data = (void *)&dev_data->device_data;

Or with container of:

 struct iommu_device_data {
     struct list_head list;
     ioasid_t pasid;
     struct device *dev;
     [..]
 };

 struct intel_device_data {
     struct iommu_device_data iommu; // must be first
     [...]
 };

 struct iommu_device_data *dev_data;
 struct intel_device_data *intel_data =
     container_of(dev_data, struct intel_device_data, iommu);

In either case you'd add a size_t to the domain->ops specifying how much
extra memory the core code should allocate when it manages the data
structure. The first case allocates based on struct_size(); the second
case allocates exactly the size specified.
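
Roughly, in the core code (untested sketch, reusing the op name from
above):

 /* flex array case: the op holds the driver's extra bytes, which land
    in the device_data[] trailer */
 dev_data = kzalloc(sizeof(*dev_data) + ops->sizeof_iommu_dev_pasid_data,
                    GFP_KERNEL);

 /* container_of case: the op already holds the full
    sizeof(struct intel_device_data), including the embedded
    struct iommu_device_data at offset 0 */
 dev_data = kzalloc(ops->sizeof_iommu_dev_pasid_data, GFP_KERNEL);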

Look at INIT_RDMA_OBJ_SIZE() for a more complicated example of how the
latter can work. I like it because it gives drivers the nice container_of
pattern; the downside is that it requires a BUILD_BUG_ON to check that
the driver ordered its struct properly.
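
ie something along these lines (paraphrasing; the iommu-side names are
made up):

 /* evaluates to the driver struct size; BUILD_BUG_ON_ZERO() contributes
    0 but breaks the build if the embedded member is not at offset 0 */
 #define IOMMU_DEV_DATA_SIZE(drv_struct, member)               \
     (sizeof(struct drv_struct) +                               \
      BUILD_BUG_ON_ZERO(offsetof(struct drv_struct, member)))

 /* in the driver's ops: */
 .sizeof_iommu_dev_pasid_data =
     IOMMU_DEV_DATA_SIZE(intel_device_data, iommu),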

The point is to consolidate all the code for allocating and walking
the data structure without having to force two allocations and extra
pointer indirections on performance paths.

Jason
