Message-ID: <c2f6163e-47f0-4dce-b077-7751816be62f@linux.intel.com>
Date: Mon, 29 Jul 2024 13:29:13 +0800
From: Baolu Lu <baolu.lu@...ux.intel.com>
To: Will Deacon <will@...nel.org>, Kunkun Jiang <jiangkunkun@...wei.com>
Cc: baolu.lu@...ux.intel.com, Robin Murphy <robin.murphy@....com>,
Joerg Roedel <joro@...tes.org>, Jason Gunthorpe <jgg@...pe.ca>,
Nicolin Chen <nicolinc@...dia.com>, Michael Shavit <mshavit@...gle.com>,
Mostafa Saleh <smostafa@...gle.com>,
"moderated list:ARM SMMU DRIVERS" <linux-arm-kernel@...ts.infradead.org>,
iommu@...ts.linux.dev, linux-kernel@...r.kernel.org,
wanghaibin.wang@...wei.com, yuzenghui@...wei.com, tangnianyao@...wei.com
Subject: Re: [bug report] iommu/arm-smmu-v3: Event cannot be printed in some
scenarios
On 2024/7/24 18:24, Will Deacon wrote:
> On Wed, Jul 24, 2024 at 05:22:59PM +0800, Kunkun Jiang wrote:
>> On 2024/7/24 9:42, Kunkun Jiang wrote:
>>> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>>     while (!queue_remove_raw(q, evt)) {
>>>         u8 id = FIELD_GET(EVTQ_0_ID, evt[0]);
>>>
>>>         ret = arm_smmu_handle_evt(smmu, evt);
>>>         if (!ret || !__ratelimit(&rs))
>>>             continue;
>>>
>>>         dev_info(smmu->dev, "event 0x%02x received:\n", id);
>>>         for (i = 0; i < ARRAY_SIZE(evt); ++i)
>>>             dev_info(smmu->dev, "\t0x%016llx\n",
>>>                      (unsigned long long)evt[i]);
>>>
>>>         cond_resched();
>>>     }
>>>
>>> The smmu-v3 driver cannot print event information when "ret" is 0.
>>> Unfortunately, due to commit 3dfa64aecbaf
>>> ("iommu: Make iommu_report_device_fault() return void"), the default
>>> return value in arm_smmu_handle_evt() is 0. Maybe a trace should
>>> be added here?
>> Additional explanation and background:
>> 1. A device (VF) is passed through (VFIO-PCI) to a VM.
>> 2. The SMMU has the stall feature.
>> 3. The guest device driver is modified to generate an event.
>>
>> This event handling process is as follows:
>> arm_smmu_evtq_thread
>>     ret = arm_smmu_handle_evt
>>         iommu_report_device_fault
>>             iopf_param = iopf_get_dev_fault_param(dev);
>>             // iopf is not enabled.
>>             // No RESUME will be sent!
>>             if (WARN_ON(!iopf_param))
>>                 return;
>>     if (!ret || !__ratelimit(&rs))
>>         continue;
>>
>> In this scenario, the I/O page-fault capability is not enabled.
>> There are two problems here:
>> 1. The event information is not printed.
>> 2. The entire device (PF level) is stalled, not just the current
>> VF. This affects other normal VFs.
> Oh, so that stall is probably also due to b554e396e51c ("iommu: Make
> iopf_group_response() return void"). I agree that we need a way to
> propagate error handling back to the driver in the case that
> 'iopf_param' is NULL, otherwise we're making the unexpected fault
> considerably more problematic than it needs to be.
>
> Lu -- can we add the -ENODEV return back in the case that
> iommu_report_device_fault() doesn't even find a 'iommu_fault_param' for
> the device?
Yes, of course. Commit b554e396e51c was added to consolidate the
drivers' auto-response code in the core, on the assumption that a driver
only needs to call iommu_report_device_fault() to report an iopf.
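
For illustration only, here is a rough user-space sketch of the control
flow being asked for: the report function returns -ENODEV when the device
has no fault parameter, the driver's handler propagates that, and the
event loop then dumps the event instead of silently continuing. This is
not kernel code and not a proposed patch. All model_* names and the
has_fault_param flag are made up for this example, and the event ID mask
(low 8 bits of the first word) and the 4-word record size are assumptions
stated in the comments.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: an event record is modelled as 4 x 64-bit words. */
#define MODEL_EVT_DWORDS 4

/* Hypothetical stand-in for a device: only tracks whether IOPF is set up. */
struct model_dev {
	int has_fault_param;
};

/*
 * Model of the core report path. With a void return the caller cannot
 * tell these two cases apart; returning -ENODEV lets it react.
 */
static int model_report_device_fault(const struct model_dev *dev)
{
	assert(dev != NULL);
	if (!dev->has_fault_param)
		return -ENODEV;	/* nobody is going to handle this fault */
	return 0;		/* handed over to an IOPF handler */
}

/* Model of the driver's event handler: pass the core's verdict upward. */
static int model_handle_evt(const struct model_dev *dev, const uint64_t *evt)
{
	(void)evt;	/* decoding is omitted in this sketch */
	return model_report_device_fault(dev);
}

/*
 * Model of one iteration of the event loop.
 * Returns 1 if the event was dumped, 0 if it was consumed silently.
 */
static int model_process_evt(const struct model_dev *dev,
			     const uint64_t evt[MODEL_EVT_DWORDS])
{
	size_t i;
	int ret = model_handle_evt(dev, evt);

	if (!ret)
		return 0;

	/*
	 * Error path: a real driver would also need to terminate the
	 * stalled transaction here; this model only prints the record.
	 * Assumption: the event ID lives in bits [7:0] of word 0.
	 */
	printf("event 0x%02x received:\n", (unsigned int)(evt[0] & 0xff));
	for (i = 0; i < MODEL_EVT_DWORDS; ++i)
		printf("\t0x%016llx\n", (unsigned long long)evt[i]);
	return 1;
}
```

In this model, calling model_process_evt() for a device with
has_fault_param == 0 dumps the event, while a device with IOPF set up is
handled quietly, which is the distinction the driver loses when the
report function returns void.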
Thanks,
baolu