Message-ID: <59998dcc-9452-4efd-be69-d95754217633@linux.intel.com>
Date: Tue, 18 Feb 2025 10:57:30 +0800
From: Baolu Lu <baolu.lu@...ux.intel.com>
To: Zhangfei Gao <zhangfei.gao@...aro.org>
Cc: Jason Gunthorpe <jgg@...pe.ca>, Joerg Roedel <joro@...tes.org>,
Will Deacon <will@...nel.org>, Robin Murphy <robin.murphy@....com>,
Kevin Tian <kevin.tian@...el.com>, Fenghua Yu <fenghua.yu@...el.com>,
Dave Jiang <dave.jiang@...el.com>, Vinod Koul <vkoul@...nel.org>,
Zhou Wang <wangzhou1@...ilicon.com>, iommu@...ts.linux.dev,
linux-kernel@...r.kernel.org,
Shameerali Kolothum Thodi <shameerali.kolothum.thodi@...wei.com>
Subject: Re: [PATCH 00/12] iommu: Remove IOMMU_DEV_FEAT_SVA/_IOPF
On 2/15/25 19:35, Zhangfei Gao wrote:
> On Sat, 15 Feb 2025 at 18:09, Baolu Lu<baolu.lu@...ux.intel.com> wrote:
>> On 2/15/25 16:11, Zhangfei Gao wrote:
>>> It is not related to multiple devices; it also happens with a single
>>> device when a user page fault triggers.
>>>
>>> iopf_queue_remove_device is called.
>>> rcu_assign_pointer(param->fault_param, NULL);
>>>
>>> call trace
>>> [ 304.961312] Call trace:
>>> [ 304.961314] show_stack+0x20/0x38 (C)
>>> [ 304.961319] dump_stack_lvl+0xc0/0xd0
>>> [ 304.961324] dump_stack+0x18/0x28
>>> [ 304.961327] iopf_queue_remove_device+0xb0/0x1f0
>>> [ 304.961331] arm_smmu_remove_master_domain+0x204/0x250
>>> [ 304.961336] arm_smmu_attach_commit+0x64/0x100
>>> [ 304.961338] arm_smmu_attach_dev_nested+0x104/0x1a8
>>> [ 304.961340] __iommu_attach_device+0x2c/0x110
>>> [ 304.961343] __iommu_device_set_domain.isra.0+0x78/0xe0
>>> [ 304.961345] __iommu_group_set_domain_internal+0x78/0x160
>>> [ 304.961347] iommu_replace_group_handle+0x9c/0x150
>>> [ 304.961350] iommufd_fault_domain_replace_dev+0x88/0x120
>>> [ 304.961353] iommufd_device_do_replace+0x190/0x3c0
>>> [ 304.961355] iommufd_device_change_pt+0x270/0x688
>>> [ 304.961357] iommufd_device_replace+0x20/0x38
>>> [ 304.961359] vfio_iommufd_physical_attach_ioas+0x30/0x78
>>> [ 304.961363] vfio_df_ioctl_attach_pt+0xa8/0x188
>>> [ 304.961366] vfio_device_fops_unl_ioctl+0x310/0x990
>>>
>>>
>>> When page fault triggers:
>>>
>>> [ 1016.383578] ------------[ cut here ]-----------
>>> [ 1016.388184] WARNING: CPU: 35 PID: 717 at
>>> drivers/iommu/io-pgfault.c:231 iommu_report_device_fault+0x2c8/0x470
>> It's likely that iopf_queue_add_device() was not called for this device.
> iopf_queue_add_device is called, but quickly iopf_queue_remove_device
> is called during guest bootup.
> Then fault_param is set to NULL.
>
> arm_smmu_attach_commit
> arm_smmu_remove_master_domain
> // newly added in the first patch
> if (master_domain) {
> if (master_domain->using_iopf)
It seems the above check is incorrect. We only need to disable iopf when
an iopf-capable domain is about to be removed. Will the following
additional change make any difference?

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 28e67a9e3861..9b9ef738d070 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2822,7 +2822,7 @@ static void arm_smmu_remove_master_domain(struct arm_smmu_master *master,
 	spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
 	if (master_domain) {
-		if (master_domain->using_iopf)
+		if (domain->iopf_handler)
 			arm_smmu_disable_iopf(master);
 		kfree(master_domain);
 	}

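
To make the intent of the proposed check concrete, here is a minimal userspace sketch. It is not the driver code: every toy_* name is hypothetical, and the bool only stands in for "fault_param is non-NULL". It models one rule, namely that iopf is disabled on removal only if the domain being removed has a fault handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's structures. */
struct toy_domain {
	void (*iopf_handler)(void);	/* non-NULL means iopf-capable */
};

struct toy_master {
	bool iopf_enabled;		/* models fault_param != NULL */
};

void toy_fault_handler(void)
{
}

/*
 * Detach the master from the domain being removed. iopf is torn down
 * only if that domain was the one handling faults.
 */
void toy_remove_master_domain(struct toy_master *master,
			      const struct toy_domain *removed)
{
	assert(master != NULL && removed != NULL);

	if (removed->iopf_handler)
		master->iopf_enabled = false;
}

/* Returns 0 if both cases behave as described above, 1 otherwise. */
int toy_demo(void)
{
	struct toy_master master = { .iopf_enabled = true };
	struct toy_domain plain = { .iopf_handler = NULL };
	struct toy_domain faulting = { .iopf_handler = toy_fault_handler };

	/* Removing a domain without a handler must leave iopf alone. */
	toy_remove_master_domain(&master, &plain);
	if (!master.iopf_enabled)
		return 1;

	/* Removing the iopf-capable domain disables iopf. */
	toy_remove_master_domain(&master, &faulting);
	return master.iopf_enabled ? 1 : 0;
}
```

Calling toy_demo() returns 0 under this model. Whether the real attach path behaves the same way depends on ordering and state that the sketch leaves out, so it would still need to be confirmed on your setup.
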
> arm_smmu_disable_iopf(master); ->
> iopf_queue_remove_device
> kfree(master_domain);
> }
>
> As a comparison, without this patchset only iopf_queue_add_device is
> called; iopf_queue_remove_device is never called.
Thanks,
baolu