linux-kernel - Re: [PATCH 2/2] iommu/vt-d: Use device rbtree in iopf reporting path

Message-ID: <2049ec79-b081-4ae2-80d2-50899d461339@linux.intel.com>
Date: Wed, 21 Feb 2024 15:04:11 +0800
From: Ethan Zhao <haifeng.zhao@...ux.intel.com>
To: Jason Gunthorpe <jgg@...pe.ca>, Lu Baolu <baolu.lu@...ux.intel.com>
Cc: Joerg Roedel <joro@...tes.org>, Will Deacon <will@...nel.org>,
 Robin Murphy <robin.murphy@....com>, Kevin Tian <kevin.tian@...el.com>,
 Huang Jiaqing <jiaqing.huang@...el.com>, iommu@...ts.linux.dev,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] iommu/vt-d: Use device rbtree in iopf reporting path

On 2/16/2024 1:55 AM, Jason Gunthorpe wrote:
> On Thu, Feb 15, 2024 at 03:22:49PM +0800, Lu Baolu wrote:
>> The existing IO page fault handler currently locates the PCI device by
>> calling pci_get_domain_bus_and_slot(). This function searches the list
>> of all PCI devices until the desired device is found. To improve lookup
>> efficiency, a helper function named device_rbtree_find() is introduced
>> to search for the device within the rbtree. Replace
>> pci_get_domain_bus_and_slot() in the IO page fault handling path.
>>
>> Co-developed-by: Huang Jiaqing <jiaqing.huang@...el.com>
>> Signed-off-by: Huang Jiaqing <jiaqing.huang@...el.com>
>> Signed-off-by: Lu Baolu <baolu.lu@...ux.intel.com>
>> ---
>>   drivers/iommu/intel/iommu.h |  1 +
>>   drivers/iommu/intel/iommu.c | 29 +++++++++++++++++++++++++++++
>>   drivers/iommu/intel/svm.c   | 14 ++++++--------
>>   3 files changed, 36 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
>> index 54eeaa8e35a9..f13c228924f8 100644
>> --- a/drivers/iommu/intel/iommu.h
>> +++ b/drivers/iommu/intel/iommu.h
>> @@ -1081,6 +1081,7 @@ void free_pgtable_page(void *vaddr);
>>   void iommu_flush_write_buffer(struct intel_iommu *iommu);
>>   struct iommu_domain *intel_nested_domain_alloc(struct iommu_domain *parent,
>>   					       const struct iommu_user_data *user_data);
>> +struct device *device_rbtree_find(struct intel_iommu *iommu, u16 rid);
>>   
>>   #ifdef CONFIG_INTEL_IOMMU_SVM
>>   void intel_svm_check(struct intel_iommu *iommu);
>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
>> index 09009d96e553..d92c680bcc96 100644
>> --- a/drivers/iommu/intel/iommu.c
>> +++ b/drivers/iommu/intel/iommu.c
>> @@ -120,6 +120,35 @@ static int device_rid_cmp(struct rb_node *lhs, const struct rb_node *rhs)
>>   	return device_rid_cmp_key(&key, rhs);
>>   }
>>   
>> +/*
>> + * Looks up an IOMMU-probed device using its source ID.
>> + *
>> + * If the device is found:
>> + *  - Increments its reference count.
>> + *  - Returns a pointer to the device.
>> + *  - The caller must call put_device() after using the pointer.
>> + *
>> + * If the device is not found, returns NULL.
>> + */
>> +struct device *device_rbtree_find(struct intel_iommu *iommu, u16 rid)
>> +{
>> +	struct device_domain_info *info;
>> +	struct device *dev = NULL;
>> +	struct rb_node *node;
>> +	unsigned long flags;
>> +
>> +	spin_lock_irqsave(&iommu->device_rbtree_lock, flags);
>> +	node = rb_find(&rid, &iommu->device_rbtree, device_rid_cmp_key);
>> +	if (node) {
>> +		info = rb_entry(node, struct device_domain_info, node);
>> +		dev = info->dev;
>> +		get_device(dev);
> This get_device() is a bit troubling. It eventually calls into
> iommu_report_device_fault() which does:
>
> 	struct dev_iommu *param = dev->iommu;

So far there is no protection for accesses to the dev->iommu structure in
the generic iommu layer between different threads, such as the hot-removal
interrupt and the iopf handling thread. Should we enhance that in the
generic iommu code?

Thanks,
Ethan

>
> Which is going to explode if the iommu driver release has already
> happened, which is a precondition to getting to a unref'd struct
> device.
>
> The driver needs to do something to fence these events during its
> release function.
>
> If we are already doing that then I'd suggest to drop the get_device
> and add a big fat comment explaining the special rules about lifetime
> that are in effect here.
>
> Otherwise you need to do that barrier, or rethink the way the locking
> works.
>
> Aside from that this looks like a great improvement to me
>
> Thanks,
> Jason
>
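
To make the lifetime concern above concrete, here is a minimal userspace
sketch, not kernel code. Every model_* name is hypothetical, a pthread
mutex stands in for device_rbtree_lock, and a small array stands in for the
rbtree. It assumes that release clears dev->iommu under the same lock the
lookup takes, which is only one possible way to fence the two paths and is
not taken from the patch. model_find_get() mirrors the get_device() pattern
being discussed: the reference keeps the device structure alive but says
nothing about dev->iommu. model_report_fault() does the lookup and the
dev->iommu access in one critical section instead.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_DEVS 8

/* Hypothetical stand-ins for struct dev_iommu, struct device, struct intel_iommu. */
struct model_iommu_param { int faults_reported; };

struct model_device {
	uint16_t rid;
	int refcount;                     /* models get_device()/put_device() */
	struct model_iommu_param *iommu;  /* models dev->iommu */
};

struct model_iommu {
	pthread_mutex_t lock;                       /* models device_rbtree_lock */
	struct model_device *devs[MODEL_MAX_DEVS];  /* models the device rbtree */
};

/* Caller must hold iommu->lock. */
static struct model_device *model_lookup_locked(struct model_iommu *iommu,
						uint16_t rid)
{
	for (size_t i = 0; i < MODEL_MAX_DEVS; i++)
		if (iommu->devs[i] && iommu->devs[i]->rid == rid)
			return iommu->devs[i];
	return NULL;
}

static int model_probe(struct model_iommu *iommu, struct model_device *dev)
{
	int ret = -ENOSPC;

	assert(dev != NULL);
	pthread_mutex_lock(&iommu->lock);
	for (size_t i = 0; i < MODEL_MAX_DEVS; i++) {
		if (!iommu->devs[i]) {
			iommu->devs[i] = dev;
			ret = 0;
			break;
		}
	}
	pthread_mutex_unlock(&iommu->lock);
	return ret;
}

/* Release path: unlink the device and clear dev->iommu under the lock. */
static void model_release(struct model_iommu *iommu, struct model_device *dev)
{
	assert(dev != NULL);
	pthread_mutex_lock(&iommu->lock);
	for (size_t i = 0; i < MODEL_MAX_DEVS; i++)
		if (iommu->devs[i] == dev)
			iommu->devs[i] = NULL;
	dev->iommu = NULL;
	pthread_mutex_unlock(&iommu->lock);
}

/* Pattern under discussion: take a reference, then drop the lock. */
static struct model_device *model_find_get(struct model_iommu *iommu,
					   uint16_t rid)
{
	struct model_device *dev;

	pthread_mutex_lock(&iommu->lock);
	dev = model_lookup_locked(iommu, rid);
	if (dev)
		dev->refcount++;
	pthread_mutex_unlock(&iommu->lock);
	return dev;  /* dev stays valid, but dev->iommu may be cleared later */
}

/* Fenced variant: lookup and dev->iommu access share one critical section. */
static int model_report_fault(struct model_iommu *iommu, uint16_t rid)
{
	struct model_device *dev;
	int ret = -ENODEV;

	pthread_mutex_lock(&iommu->lock);
	dev = model_lookup_locked(iommu, rid);
	if (dev && dev->iommu) {
		dev->iommu->faults_reported++;
		ret = 0;
	}
	pthread_mutex_unlock(&iommu->lock);
	return ret;
}
```

In this model, a caller that obtained a pointer from model_find_get() and
then raced with model_release() would still hold a valid model_device, but
its iommu member would be NULL. A reference count by itself does not protect
that access. The real driver would need its own fencing in the release path,
and whether the rbtree lock is the right tool for that is left open here.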