Message-ID: <Zrx_qO1iHwbr4ctO@google.com>
Date: Wed, 14 Aug 2024 09:58:00 +0000
From: Mostafa Saleh <smostafa@...gle.com>
To: Jason Gunthorpe <jgg@...pe.ca>
Cc: linux-kernel@...r.kernel.org, iommu@...ts.linux.dev,
linux-arm-kernel@...ts.infradead.org, will@...nel.org,
robin.murphy@....com, joro@...tes.org, nicolinc@...dia.com,
mshavit@...gle.com
Subject: Re: [PATCH 2/2] iommu/arm-smmu-v3: Report stalled S2 events
On Tue, Aug 13, 2024 at 02:51:55PM -0300, Jason Gunthorpe wrote:
> On Mon, Aug 12, 2024 at 08:52:55PM +0000, Mostafa Saleh wrote:
> > Previously, S2 stall was disabled and in case there was an event it
> > wouldn't be reported on the assumption that it's always pinned by VFIO.
> >
> > However, now since we can enable stall, devices that use S2 outside
> > VFIO should be able to report the stalls similar to S1.
> >
> > Also, to keep the old behaviour where S2 events from nested domains were
> > not reported as they are pinned (from VFIO), add a new flag to track this.
>
> I'm not entirely clear on every detail of this stall feature...
>
> But from a core perspective device fault reporting should only ever be
> turned on in the STE/CD if the attached domain->iopf_handler is not NULL.
>
> If it is NULL then any access to a non-present address should trigger
> some kind of device error failure automatically.
>
> This is new core functionality since this code would have been
> originally written. Now it is all handled transparently by the core
> code. The driver should just deliver all fault events to
> iommu_report_device_fault() and it will sort it out.
>
I agree. As there is no iopf handler in this case, we should just report
the event to the IOMMU core and let it reject it, instead of tracking this
in the driver. And as all the “enable_nesting” stuff is going away soon
anyway, it’s not worth adding extra code for it (plus, we shouldn’t
really assume the intention of the caller).
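
To make the intended split concrete, here is a minimal userspace sketch of
the flow discussed above. It is only a model under stated assumptions: every
name prefixed with model_ or MODEL_ is hypothetical, the flag bit positions
are illustrative rather than the real EVTQ layout, and the real
iommu_report_device_fault() takes different arguments. It only shows that
the driver forwards every stalled event, S2 included, and that the decision
to abort is made in one place, based on whether the domain has an
iopf_handler.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event flags; NOT the real EVTQ_1_STALL / EVTQ_1_S2 encoding. */
#define MODEL_EVT_STALL (1u << 0)
#define MODEL_EVT_S2    (1u << 1)

/* Hypothetical response codes, loosely modelled on page response results. */
enum model_resp { MODEL_RESP_SUCCESS, MODEL_RESP_FAILURE };

/* Hypothetical stand-in for a domain: only the handler pointer matters here. */
struct model_domain {
	enum model_resp (*iopf_handler)(void);
};

/* Example handler that pretends to resolve the fault. */
enum model_resp model_resolve_fault(void)
{
	return MODEL_RESP_SUCCESS;
}

/*
 * Models the core side: a domain without an iopf_handler (for example a
 * nesting domain) gets its fault aborted, with no driver involvement.
 */
enum model_resp model_report_device_fault(const struct model_domain *domain)
{
	if (!domain || !domain->iopf_handler)
		return MODEL_RESP_FAILURE;
	return domain->iopf_handler();
}

/*
 * Models the driver side after dropping the S2 check: non-stalled events
 * are still unsupported, and every stalled event is forwarded to the core
 * regardless of the translation stage.
 */
int model_handle_evt(uint32_t evt_flags, const struct model_domain *domain,
		     enum model_resp *resp)
{
	assert(resp != NULL);

	if (!(evt_flags & MODEL_EVT_STALL))
		return -EOPNOTSUPP;

	*resp = model_report_device_fault(domain);
	return 0;
}
```

In this model a stalled S2 event on a domain with a NULL handler is still
delivered and ends up with a failure response, which is the behaviour the
driver-side flag was trying to reproduce by hand.
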
> > Signed-off-by: Mostafa Saleh <smostafa@...gle.com>
> > ---
> > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 18 +++++++++++++-----
> > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 2 ++
> > 2 files changed, 15 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > index 8d573d9ca93c..ffa865529d73 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > @@ -1733,6 +1733,7 @@ static int arm_smmu_handle_evt(struct arm_smmu_device *smmu, u64 *evt)
> > u32 sid = FIELD_GET(EVTQ_0_SID, evt[0]);
> > struct iopf_fault fault_evt = { };
> > struct iommu_fault *flt = &fault_evt.fault;
> > + struct arm_smmu_domain *smmu_domain;
> >
> > switch (FIELD_GET(EVTQ_0_ID, evt[0])) {
> > case EVT_ID_TRANSLATION_FAULT:
> > @@ -1744,10 +1745,6 @@ static int arm_smmu_handle_evt(struct arm_smmu_device *smmu, u64 *evt)
> > return -EOPNOTSUPP;
> > }
> >
> > - /* Stage-2 is always pinned at the moment */
> > - if (evt[1] & EVTQ_1_S2)
> > - return -EFAULT;
> > -
>
> This makes sense at first blush since the domain mode shouldn't define
> if events should be processed or not, and the events should be failed
> anyhow right? If someone did turn on fault reporting in the STE then
> it should always be processed to conclusion.
>
> > if (!(evt[1] & EVTQ_1_STALL))
> > return -EOPNOTSUPP;
> >
> > @@ -1782,6 +1779,15 @@ static int arm_smmu_handle_evt(struct arm_smmu_device *smmu, u64 *evt)
> > goto out_unlock;
> > }
> >
> > + /* It is guaranteed that smmu_domain exists as EVTQ_1_STALL is checked. */
> > + smmu_domain = to_smmu_domain(iommu_get_domain_for_dev(master->dev));
>
> Strongly discouraging drivers from calling iommu_get_domain_for_dev()
> in async paths like this. The locking is tricky and the core code does...
>
> > + /* nesting domain is always pinned at the moment */
> > + if (smmu_domain->enable_nesting) {
>
> This is not necessary - a nesting domain will never have an
> iopf_handler set.
>
> It immediately calls iommu_report_device_fault() which will reject it
> because of:
>
> if (!group->attach_handle->domain->iopf_handler)
> goto err_abort;
>
> Which after the rework will end up in find_fault_handler() at the top
> of the function:
>
> https://lore.kernel.org/r/ZrTNGepJXbmfuKBK@google.com
>
> So I think these parts are not necessary.
Yes, I will drop this patch and squash in the removal of the code that
ignores S2 events.
Thanks,
Mostafa
>
> Though arguably we should be rejecting domains with iopf_handler set
> in some of the attach calls..
>
> Jason