Message-ID: <20240215175534.GD1299735@ziepe.ca>
Date: Thu, 15 Feb 2024 13:55:34 -0400
From: Jason Gunthorpe <jgg@...pe.ca>
To: Lu Baolu <baolu.lu@...ux.intel.com>
Cc: Joerg Roedel <joro@...tes.org>, Will Deacon <will@...nel.org>,
	Robin Murphy <robin.murphy@....com>,
	Kevin Tian <kevin.tian@...el.com>,
	Huang Jiaqing <jiaqing.huang@...el.com>,
	Ethan Zhao <haifeng.zhao@...ux.intel.com>, iommu@...ts.linux.dev,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] iommu/vt-d: Use device rbtree in iopf reporting path

On Thu, Feb 15, 2024 at 03:22:49PM +0800, Lu Baolu wrote:
> The existing IO page fault handler currently locates the PCI device by
> calling pci_get_domain_bus_and_slot(). This function searches the list
> of all PCI devices until the desired device is found. To improve lookup
> efficiency, a helper function named device_rbtree_find() is introduced
> to search for the device within the rbtree. Replace
> pci_get_domain_bus_and_slot() in the IO page fault handling path.
> 
> Co-developed-by: Huang Jiaqing <jiaqing.huang@...el.com>
> Signed-off-by: Huang Jiaqing <jiaqing.huang@...el.com>
> Signed-off-by: Lu Baolu <baolu.lu@...ux.intel.com>
> ---
>  drivers/iommu/intel/iommu.h |  1 +
>  drivers/iommu/intel/iommu.c | 29 +++++++++++++++++++++++++++++
>  drivers/iommu/intel/svm.c   | 14 ++++++--------
>  3 files changed, 36 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
> index 54eeaa8e35a9..f13c228924f8 100644
> --- a/drivers/iommu/intel/iommu.h
> +++ b/drivers/iommu/intel/iommu.h
> @@ -1081,6 +1081,7 @@ void free_pgtable_page(void *vaddr);
>  void iommu_flush_write_buffer(struct intel_iommu *iommu);
>  struct iommu_domain *intel_nested_domain_alloc(struct iommu_domain *parent,
>  					       const struct iommu_user_data *user_data);
> +struct device *device_rbtree_find(struct intel_iommu *iommu, u16 rid);
>  
>  #ifdef CONFIG_INTEL_IOMMU_SVM
>  void intel_svm_check(struct intel_iommu *iommu);
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 09009d96e553..d92c680bcc96 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -120,6 +120,35 @@ static int device_rid_cmp(struct rb_node *lhs, const struct rb_node *rhs)
>  	return device_rid_cmp_key(&key, rhs);
>  }
>  
> +/*
> + * Looks up an IOMMU-probed device using its source ID.
> + *
> + * If the device is found:
> + *  - Increments its reference count.
> + *  - Returns a pointer to the device.
> + *  - The caller must call put_device() after using the pointer.
> + *
> + * If the device is not found, returns NULL.
> + */
> +struct device *device_rbtree_find(struct intel_iommu *iommu, u16 rid)
> +{
> +	struct device_domain_info *info;
> +	struct device *dev = NULL;
> +	struct rb_node *node;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&iommu->device_rbtree_lock, flags);
> +	node = rb_find(&rid, &iommu->device_rbtree, device_rid_cmp_key);
> +	if (node) {
> +		info = rb_entry(node, struct device_domain_info, node);
> +		dev = info->dev;
> +		get_device(dev);

This get_device() is a bit troubling. It eventually calls into
iommu_report_device_fault() which does:

	struct dev_iommu *param = dev->iommu;

Which is going to explode if the iommu driver release has already
happened, which is a precondition to getting to an unref'd struct
device.
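
To spell it out (hand-waved sketch of the ordering, not the exact
call chain):

	iopf path                           device teardown
	---------                           ---------------
	dev = device_rbtree_find(..);
	get_device(dev);                    iommu release path runs
	                                    (dev->iommu is freed here)
	iommu_report_device_fault(dev, ..);
	  param = dev->iommu;    <-- use after free; the reference on
	                             the struct device does nothing to
	                             keep dev->iommu alive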

The driver needs to do something to fence these events during its
release function.
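
Something along these lines in the release path, ordered before the
rest of the teardown (rough sketch, I'm guessing at the exact hooks -
release_device, dev_iommu_priv_get(), where the PRQ drain sits - the
ordering is the point):

	static void intel_iommu_release_device(struct device *dev)
	{
		struct device_domain_info *info = dev_iommu_priv_get(dev);
		struct intel_iommu *iommu = info->iommu;
		unsigned long flags;

		/*
		 * Fence the iopf path first: once the node is out of
		 * the rbtree, device_rbtree_find() can't return this
		 * device anymore, and draining the PRQ guarantees no
		 * fault handler is still using a pointer it looked up
		 * before the removal.
		 */
		spin_lock_irqsave(&iommu->device_rbtree_lock, flags);
		rb_erase(&info->node, &iommu->device_rbtree);
		spin_unlock_irqrestore(&iommu->device_rbtree_lock, flags);

		/* drain outstanding page requests for this device here */

		/* ... then the existing teardown, after which dev->iommu
		   may go away ... */
	}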

If we are already doing that then I'd suggest to drop the get_device
and add a big fat comment explaining the special rules about lifetime
that are in effect here.
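
i.e. keep the lookup as in the hunk above but without the reference,
something like (sketch, assuming the fence in the release path is in
place):

	/*
	 * Find the device in the rbtree by source ID.
	 *
	 * LIFETIME: no reference is taken.  The returned pointer stays
	 * valid only because the driver's release path removes the
	 * device from the rbtree and drains in-flight page faults
	 * before dev->iommu (and the device itself) can go away, so it
	 * must only be used from the fault handling path covered by
	 * that fence.
	 */
	struct device *device_rbtree_find(struct intel_iommu *iommu, u16 rid)
	{
		struct device_domain_info *info;
		struct device *dev = NULL;
		struct rb_node *node;
		unsigned long flags;

		spin_lock_irqsave(&iommu->device_rbtree_lock, flags);
		node = rb_find(&rid, &iommu->device_rbtree, device_rid_cmp_key);
		if (node) {
			info = rb_entry(node, struct device_domain_info, node);
			dev = info->dev;
		}
		spin_unlock_irqrestore(&iommu->device_rbtree_lock, flags);

		return dev;
	}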

Otherwise you need to add that kind of barrier, or rethink the way
the locking works..

Aside from that, this looks like a great improvement to me

Thanks,
Jason
