Message-ID: <7d5a8b86-6f0d-50ef-1b2f-9907e447c9fc@huawei.com>
Date: Wed, 24 Jul 2024 17:22:59 +0800
From: Kunkun Jiang <jiangkunkun@...wei.com>
To: Lu Baolu <baolu.lu@...ux.intel.com>, Will Deacon <will@...nel.org>, Robin
Murphy <robin.murphy@....com>, Joerg Roedel <joro@...tes.org>, Jason
Gunthorpe <jgg@...pe.ca>, Nicolin Chen <nicolinc@...dia.com>, Michael Shavit
<mshavit@...gle.com>, Mostafa Saleh <smostafa@...gle.com>
CC: "moderated list:ARM SMMU DRIVERS" <linux-arm-kernel@...ts.infradead.org>,
<iommu@...ts.linux.dev>, <linux-kernel@...r.kernel.org>,
<wanghaibin.wang@...wei.com>, <yuzenghui@...wei.com>,
<tangnianyao@...wei.com>
Subject: Re: [bug report] iommu/arm-smmu-v3: Event cannot be printed in some
scenarios
Hi all,
On 2024/7/24 9:42, Kunkun Jiang wrote:
> Hi all,
>
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>     while (!queue_remove_raw(q, evt)) {
>         u8 id = FIELD_GET(EVTQ_0_ID, evt[0]);
>
>         ret = arm_smmu_handle_evt(smmu, evt);
>         if (!ret || !__ratelimit(&rs))
>             continue;
>
>         dev_info(smmu->dev, "event 0x%02x received:\n", id);
>         for (i = 0; i < ARRAY_SIZE(evt); ++i)
>             dev_info(smmu->dev, "\t0x%016llx\n",
>                      (unsigned long long)evt[i]);
>
>         cond_resched();
>     }
>
> The smmu-v3 driver cannot print event information when "ret" is 0.
> Unfortunately due to commit 3dfa64aecbaf
> ("iommu: Make iommu_report_device_fault() return void"), the default
> return value in arm_smmu_handle_evt() is 0. Maybe a trace should
> be added here?
Some additional explanation. Background:
1. A device (VF) is passed through to a VM via VFIO-PCI.
2. The SMMU supports the stall feature.
3. The guest device driver was modified to generate an event.
The event is handled as follows:
arm_smmu_evtq_thread
    ret = arm_smmu_handle_evt
        iommu_report_device_fault
            iopf_param = iopf_get_dev_fault_param(dev);
            // iopf is not enabled.
            // No RESUME will be sent!
            if (WARN_ON(!iopf_param))
                return;
    if (!ret || !__ratelimit(&rs))
        continue;
In this scenario, the I/O page fault (IOPF) capability is not enabled.
This causes two problems:
1. The event information is not printed.
2. The entire device (PF level) is stalled, not just the current
VF. This affects other VFs that are working normally.
The same problems also exist in the bare-metal scenario.
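To make the flow above easier to follow, here is a rough user-space model of
it. This is only a sketch, not kernel code: every model_* name is a
hypothetical stand-in, the rate limit is left out, and the RESUME is modeled
as being sent immediately when IOPF is enabled, which is a simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the per-device state relevant to this report. */
struct model_dev {
    bool iopf_enabled;   /* models iopf_get_dev_fault_param() != NULL */
    bool stalled;        /* transaction stalled, waiting for a RESUME */
    int resumes_sent;    /* number of RESUME commands issued */
    int events_printed;  /* number of events dumped by the evtq loop */
};

/* Models iommu_report_device_fault(): it returns void, so the caller
 * cannot tell that the fault was dropped. */
static void model_report_device_fault(struct model_dev *dev)
{
    if (!dev->iopf_enabled)
        return;              /* early return: no RESUME is sent */
    dev->resumes_sent++;     /* simplification: resume right away */
    dev->stalled = false;
}

/* Models arm_smmu_handle_evt() for a stall event: ret stays 0. */
static int model_handle_evt(struct model_dev *dev)
{
    int ret = 0;

    dev->stalled = true;     /* the SMMU stalled the faulting access */
    model_report_device_fault(dev);
    return ret;
}

/* Models one iteration of the loop in arm_smmu_evtq_thread(). */
static void model_evtq_iteration(struct model_dev *dev)
{
    int ret = model_handle_evt(dev);

    if (!ret)
        return;              /* the "continue": the print is skipped */
    printf("event received\n");
    dev->events_printed++;
}

/* Usage: the scenario from this report, with IOPF not enabled. */
void model_demo_no_iopf(void)
{
    struct model_dev dev = { .iopf_enabled = false };

    model_evtq_iteration(&dev);
    assert(dev.events_printed == 0);  /* problem 1: nothing printed */
    assert(dev.stalled);              /* problem 2: still stalled */
    assert(dev.resumes_sent == 0);
}
```

In the model, both problems come from the same place: the fault report
returns early without any way to signal it, so the caller sees ret == 0 and
treats the event as handled.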
Thanks,
Kunkun Jiang