Message-ID: <20240813175155.GN1985367@ziepe.ca>
Date: Tue, 13 Aug 2024 14:51:55 -0300
From: Jason Gunthorpe <jgg@...pe.ca>
To: Mostafa Saleh <smostafa@...gle.com>
Cc: linux-kernel@...r.kernel.org, iommu@...ts.linux.dev,
	linux-arm-kernel@...ts.infradead.org, will@...nel.org,
	robin.murphy@....com, joro@...tes.org, nicolinc@...dia.com,
	mshavit@...gle.com
Subject: Re: [PATCH 2/2] iommu/arm-smmu-v3: Report stalled S2 events

On Mon, Aug 12, 2024 at 08:52:55PM +0000, Mostafa Saleh wrote:
> Previously, S2 stall was disabled and in case there was an event it
> wouldn't be reported on the assumption that it's always pinned by VFIO.
> 
> However, now since we can enable stall, devices that use S2 outside
> VFIO should be able to report the stalls similar to S1.
> 
> Also, to keep the old behaviour where S2 events from nested domains were
> not reported as they are pinned (from VFIO), add a new flag to track this.

I'm not entirely clear on every detail of this stall feature...

But from a core perspective device fault reporting should only ever be
turned on in the STE/CD if the attached domain->iopf_handler is not NULL.

If it is NULL, then any access to a non-present address should
automatically fail with some kind of device error.

This core functionality is newer than this driver code, which was
originally written before it existed. Fault routing is now handled
transparently by the core code. The driver should just deliver all
fault events to iommu_report_device_fault() and it will sort them out.
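To illustrate the rule above, here is a minimal userspace sketch, not
kernel code. struct model_domain, model_want_fault_reporting() and
model_handler() are hypothetical names made up for this example; only
the iopf_handler field name mirrors the real domain->iopf_handler. It
assumes the only input to the decision is whether a handler is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an IOMMU domain; not the kernel's struct. */
struct model_domain {
	/* Stand-in for domain->iopf_handler; NULL means no fault handler. */
	int (*iopf_handler)(unsigned long iova);
};

/*
 * Decide whether fault reporting (stall) should be enabled in the
 * STE/CD for this attachment: only when the domain can handle faults.
 */
static bool model_want_fault_reporting(const struct model_domain *domain)
{
	assert(domain != NULL);
	return domain->iopf_handler != NULL;
}

/* Example handler that accepts every fault (illustration only). */
static int model_handler(unsigned long iova)
{
	(void)iova;
	return 0;
}
```

In this model a pinned domain (no handler) never gets stall enabled, so
a bad access is terminated by the hardware instead of being queued for
software.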

> Signed-off-by: Mostafa Saleh <smostafa@...gle.com>
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 18 +++++++++++++-----
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  2 ++
>  2 files changed, 15 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 8d573d9ca93c..ffa865529d73 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -1733,6 +1733,7 @@ static int arm_smmu_handle_evt(struct arm_smmu_device *smmu, u64 *evt)
>  	u32 sid = FIELD_GET(EVTQ_0_SID, evt[0]);
>  	struct iopf_fault fault_evt = { };
>  	struct iommu_fault *flt = &fault_evt.fault;
> +	struct arm_smmu_domain *smmu_domain;
>  
>  	switch (FIELD_GET(EVTQ_0_ID, evt[0])) {
>  	case EVT_ID_TRANSLATION_FAULT:
> @@ -1744,10 +1745,6 @@ static int arm_smmu_handle_evt(struct arm_smmu_device *smmu, u64 *evt)
>  		return -EOPNOTSUPP;
>  	}
>  
> -	/* Stage-2 is always pinned at the moment */
> -	if (evt[1] & EVTQ_1_S2)
> -		return -EFAULT;
> -

This makes sense at first blush, since the domain mode shouldn't define
whether events are processed or not, and the events should be failed
anyhow, right? If someone did turn on fault reporting in the STE then
the event should always be processed to conclusion.

>  	if (!(evt[1] & EVTQ_1_STALL))
>  		return -EOPNOTSUPP;
>  
> @@ -1782,6 +1779,15 @@ static int arm_smmu_handle_evt(struct arm_smmu_device *smmu, u64 *evt)
>  		goto out_unlock;
>  	}
>  
> +	/* It is guaranteed that smmu_domain exists as EVTQ_1_STALL is checked. */
> +	smmu_domain = to_smmu_domain(iommu_get_domain_for_dev(master->dev));

I strongly discourage drivers from calling iommu_get_domain_for_dev()
in async paths like this. The locking is tricky and the core code does...

> +	/* nesting domain is always pinned at the moment */
> +	if (smmu_domain->enable_nesting) {

This is not necessary - a nesting domain will never have an
iopf_handler set.

The event handler immediately calls iommu_report_device_fault(), which
will reject the fault because of:

	if (!group->attach_handle->domain->iopf_handler)
		goto err_abort;

Which after the rework will end up in find_fault_handler() at the top
of the function:

 https://lore.kernel.org/r/ZrTNGepJXbmfuKBK@google.com

So I think these parts are not necessary.
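As a rough sketch of why the driver-side check is redundant, here is a
small userspace model, not kernel code. struct fault_domain,
model_report_fault(), model_handle_stall_event() and
model_accept_fault() are hypothetical names, and -EINVAL is an
arbitrary error value chosen for the model, not necessarily what the
core returns. The only part taken from the real code is the NULL check
on iopf_handler.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of a domain; is_nesting is informational only. */
struct fault_domain {
	int (*iopf_handler)(unsigned long iova);
	int is_nesting;
};

/* Stand-in for the core check: no handler means the fault is aborted. */
static int model_report_fault(const struct fault_domain *domain,
			      unsigned long iova)
{
	assert(domain != NULL);
	if (!domain->iopf_handler)
		return -EINVAL; /* model of the err_abort path */
	return domain->iopf_handler(iova);
}

/*
 * Model of the driver event path: forward every stalled event and
 * never look at the domain type.
 */
static int model_handle_stall_event(const struct fault_domain *domain,
				    unsigned long iova)
{
	return model_report_fault(domain, iova);
}

/* Example handler that accepts every fault (illustration only). */
static int model_accept_fault(unsigned long iova)
{
	(void)iova;
	return 0;
}
```

A nesting domain in this model has no handler, so its events are
rejected by the core-side check without the driver ever testing
is_nesting.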

Though arguably we should be rejecting domains with iopf_handler set
in some of the attach calls.

Jason
