linux-kernel - Re: [RFC PATCHES 00/17] IOMMUFD: Deliver IO page faults to user space

Message-ID: <26b97776-7ce5-51f6-77b7-0ce837aa7cca@linux.intel.com>
Date:   Wed, 28 Jun 2023 10:00:56 +0800
From:   Baolu Lu <baolu.lu@...ux.intel.com>
To:     Jason Gunthorpe <jgg@...pe.ca>
Cc:     baolu.lu@...ux.intel.com, Nicolin Chen <nicolinc@...dia.com>,
        Kevin Tian <kevin.tian@...el.com>,
        Joerg Roedel <joro@...tes.org>, Will Deacon <will@...nel.org>,
        Robin Murphy <robin.murphy@....com>,
        Jean-Philippe Brucker <jean-philippe@...aro.org>,
        Yi Liu <yi.l.liu@...el.com>,
        Jacob Pan <jacob.jun.pan@...ux.intel.com>,
        iommu@...ts.linux.dev, linux-kselftest@...r.kernel.org,
        virtualization@...ts.linux-foundation.org,
        linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCHES 00/17] IOMMUFD: Deliver IO page faults to user space

On 2023/6/27 2:33, Jason Gunthorpe wrote:
> On Sun, Jun 25, 2023 at 02:30:46PM +0800, Baolu Lu wrote:
> 
>> Agreed. We should avoid workqueue in sva iopf framework. Perhaps we
>> could go ahead with below code? It will be registered to device with
>> iommu_register_device_fault_handler() in IOMMU_DEV_FEAT_IOPF enabling
>> path. Un-registering in the disable path of course.
> 
> This maze needs to be undone as well.
> 
> It makes no sense that all the drivers are calling
> 
>   iommu_register_device_fault_handler(dev, iommu_queue_iopf, dev);
> 
> The driver should RX a PRI fault and deliver it to some core code
> function, this looks like a good start:
> 
>> static int io_pgfault_handler(struct iommu_fault *fault, void *cookie)
>> {
>>          ioasid_t pasid = fault->prm.pasid;
>>          struct device *dev = cookie;
>>          struct iommu_domain *domain;
>>
>>          if (fault->type != IOMMU_FAULT_PAGE_REQ)
>>                  return -EOPNOTSUPP;
>>
>>          if (fault->prm.flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID)
>>                  domain = iommu_get_domain_for_dev_pasid(dev, pasid, 0);
>>          else
>>                  domain = iommu_get_domain_for_dev(dev);
>>
>>          if (!domain || !domain->iopf_handler)
>>                  return -ENODEV;
>>
>>          if (domain->type == IOMMU_DOMAIN_SVA)
>>                  return iommu_queue_iopf(fault, cookie);
>>
>>          return domain->iopf_handler(fault, dev, domain->fault_data);
> 
> Then we find the domain that owns the translation and invoke its
> domain->ops->iopf_handler()

Agreed. iommu_register_device_fault_handler() should only be called by
device drivers that want to handle DMA faults and IO page faults
themselves in some special way.

By default, the faults should be dispatched to domain->iopf_handler in
generic core code.
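
A rough user-space sketch of that default dispatch, only to make the
idea concrete. All sketch_* names and the simplified structs are
hypothetical stand-ins, not the real kernel definitions, and the
per-PASID array is just a placeholder for the real domain lookup:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
#define SKETCH_FAULT_PAGE_REQ	1
#define SKETCH_PASID_VALID	(1u << 0)
#define SKETCH_MAX_PASID	4

struct sketch_device;

struct sketch_fault {
	int type;
	unsigned int flags;
	unsigned int pasid;
};

struct sketch_domain {
	int (*iopf_handler)(struct sketch_fault *fault,
			    struct sketch_device *dev, void *data);
	void *fault_data;
};

struct sketch_device {
	struct sketch_domain *rid_domain;
	struct sketch_domain *pasid_domain[SKETCH_MAX_PASID];
	/* Optional override for drivers that handle faults themselves. */
	int (*driver_handler)(struct sketch_fault *fault, void *cookie);
	void *driver_cookie;
};

/* Core entry point that an IOMMU driver would call on receiving a fault. */
int sketch_report_fault(struct sketch_device *dev, struct sketch_fault *fault)
{
	struct sketch_domain *domain;

	/* A driver that registered its own handler gets the fault first. */
	if (dev->driver_handler)
		return dev->driver_handler(fault, dev->driver_cookie);

	if (fault->type != SKETCH_FAULT_PAGE_REQ)
		return -EOPNOTSUPP;

	/* Find the domain that owns the translation. */
	if (fault->flags & SKETCH_PASID_VALID) {
		if (fault->pasid >= SKETCH_MAX_PASID)
			return -ENODEV;
		domain = dev->pasid_domain[fault->pasid];
	} else {
		domain = dev->rid_domain;
	}

	if (!domain || !domain->iopf_handler)
		return -ENODEV;

	return domain->iopf_handler(fault, dev, domain->fault_data);
}

/* Example domain handler: counts faults in the int that data points to. */
int sketch_count_handler(struct sketch_fault *fault,
			 struct sketch_device *dev, void *data)
{
	(void)fault;
	(void)dev;
	++*(int *)data;
	return 0;
}
```

Note there is no domain type check here: an SVA domain would simply
install its own handler in the domain.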

> 
> If the driver created a SVA domain then the op should point to some
> generic 'handle sva fault' function. There shouldn't be weird SVA
> stuff in the core code.
> 
> The weird SVA stuff is really just a generic per-device workqueue
> dispatcher, so if we think that is valuable then it should be
> integrated into the iommu_domain (domain->ops->use_iopf_workqueue =
> true for instance). Then it could route the fault through the
> workqueue and still invoke domain->ops->iopf_handler.
> 
> The word "SVA" should not appear in any of this.

Yes. We should make it generic. The domain->use_iopf_workqueue flag
indicates that the page faults of a fault group should be collected and
then handled and responded to in a workqueue. Otherwise, the page fault
is dispatched to domain->iopf_handler directly.
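
To illustrate the routing, here is a minimal user-space sketch. It is
an assumption-laden model, not kernel code: the sketch_* names and
structs are hypothetical, and the workqueue is modeled by a "ready"
buffer that sketch_run_work() drains where the kernel would run a work
item:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_GROUP_MAX 8

/* Hypothetical, simplified fault: only the grouping fields are modeled. */
struct sketch_fault {
	unsigned int grpid;
	bool last;		/* last page request of its fault group */
};

struct sketch_domain {
	bool use_iopf_workqueue;
	int (*iopf_handler)(const struct sketch_fault *fault, void *data);
	void *fault_data;
};

/* Per-device queue state; stands in for the workqueue machinery. */
struct sketch_iopf_queue {
	size_t nr_partial;
	size_t nr_ready;
	struct sketch_fault partial[SKETCH_GROUP_MAX];
	struct sketch_fault ready[SKETCH_GROUP_MAX + 1];
};

int sketch_handle_fault(struct sketch_domain *domain,
			struct sketch_iopf_queue *q,
			const struct sketch_fault *fault)
{
	size_t i, kept = 0;

	/* Flag clear: dispatch directly to the domain's handler. */
	if (!domain->use_iopf_workqueue)
		return domain->iopf_handler(fault, domain->fault_data);

	/* Flag set: hold partial faults until the group's last one arrives. */
	if (!fault->last) {
		if (q->nr_partial == SKETCH_GROUP_MAX)
			return -ENOMEM;
		q->partial[q->nr_partial++] = *fault;
		return 0;
	}

	if (q->nr_ready)
		return -EBUSY;	/* previous group not yet processed */

	/* Move this group's partial faults, then the last one, to "ready". */
	for (i = 0; i < q->nr_partial; i++) {
		if (q->partial[i].grpid == fault->grpid)
			q->ready[q->nr_ready++] = q->partial[i];
		else
			q->partial[kept++] = q->partial[i];
	}
	q->nr_partial = kept;
	q->ready[q->nr_ready++] = *fault;
	return 0;
}

/* Models the work item: hand every fault of the ready group to the domain. */
int sketch_run_work(struct sketch_domain *domain, struct sketch_iopf_queue *q)
{
	size_t i;
	int ret = 0;

	for (i = 0; i < q->nr_ready; i++) {
		int r = domain->iopf_handler(&q->ready[i], domain->fault_data);

		if (r && !ret)
			ret = r;
	}
	q->nr_ready = 0;
	return ret;
}

/* Example handler: counts faults in the int that data points to. */
int sketch_count_handler(const struct sketch_fault *fault, void *data)
{
	(void)fault;
	++*(int *)data;
	return 0;
}
```

Either way the same domain->iopf_handler ends up being invoked; the
flag only decides whether that happens immediately or after the group
is complete.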

> 
> Not sure what iommu_register_device_fault_handler() has to do with all
> of this.. Setting up the dev_iommu stuff to allow for the workqueue
> should happen dynamically during domain attach, ideally in the core
> code before calling to the driver.

There are two pointers under struct dev_iommu for fault handling.

/**
  * struct dev_iommu - Collection of per-device IOMMU data
  *
  * @fault_param: IOMMU detected device fault reporting data
  * @iopf_param:  I/O Page Fault queue and data

[...]

struct dev_iommu {
         struct mutex lock;
         struct iommu_fault_param        *fault_param;
         struct iopf_device_param        *iopf_param;

My understanding is that @fault_param is a placeholder for generic
things, while @iopf_param is workqueue-specific. Perhaps we could make
@fault_param static and initialize it during iommu device_probe, since
IOMMU faults are generic to every device managed by an IOMMU.

@iopf_param could be allocated on demand (perhaps renamed to something
more meaningful?). That would happen before a domain with the
use_iopf_workqueue flag set attaches to a device. iopf_param stays alive
until device_release.
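
As a rough user-space illustration of that lifetime, assuming "static"
means embedding fault_param in the per-device structure. The sketch_*
names and struct contents are hypothetical, and calloc()/free() stand
in for the kernel allocators:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical contents; the real structures carry much more state. */
struct sketch_fault_param { int nr_reported; };	/* generic fault data */
struct sketch_iopf_param { int nr_partial; };	/* workqueue-specific */

struct sketch_dev_iommu {
	struct sketch_fault_param fault_param;	/* embedded, set up at probe */
	struct sketch_iopf_param *iopf_param;	/* allocated on demand */
};

struct sketch_domain {
	bool use_iopf_workqueue;
};

/* Probe: the generic fault data is always available. */
void sketch_probe(struct sketch_dev_iommu *param)
{
	param->fault_param.nr_reported = 0;
	param->iopf_param = NULL;
}

/* Attach: allocate the queue data only for domains that ask for it. */
int sketch_attach(struct sketch_dev_iommu *param,
		  const struct sketch_domain *domain)
{
	if (!domain->use_iopf_workqueue || param->iopf_param)
		return 0;

	param->iopf_param = calloc(1, sizeof(*param->iopf_param));
	return param->iopf_param ? 0 : -ENOMEM;
}

/* Release: the queue data lives until the device goes away. */
void sketch_release(struct sketch_dev_iommu *param)
{
	free(param->iopf_param);
	param->iopf_param = NULL;
}
```

A second attach of a workqueue domain reuses the existing allocation,
so detach/attach cycles do not churn memory.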

> 
> Also, I can understand there is a need to turn on PRI support really
> early, and it can make sense to have some IOMMU_DEV_FEAT_IOPF/SVA to
> ask to turn it on.. But that should really only be needed if the HW
> cannot turn it on dynamically during domain attach of a PRI enabled
> domain.
> 
> It needs cleaning up..

Yes. I can put this and other cleanup things that we've discussed in a
preparation series and send it out for review after the next rc1 is
released.

> 
> Jason

Best regards,
baolu