Date:   Thu, 11 Nov 2021 17:23:40 +0000
From:   Daniel Thompson <daniel.thompson@...aro.org>
To:     laurentiu.tudor@....com
Cc:     gregkh@...uxfoundation.org, linux-kernel@...r.kernel.org,
        diana.craciun@....com, ioana.ciornei@....com, jon@...id-run.com,
        leoyang.li@....com
Subject: Re: [PATCH 2/8] bus: fsl-mc: handle DMA config deferral in ACPI case

Hi Laurentiu

On Thu, Jul 15, 2021 at 05:07:12PM +0300, laurentiu.tudor@....com wrote:
> From: Laurentiu Tudor <laurentiu.tudor@....com>
> 
> ACPI DMA configure API may return a defer status code, so handle it.
> On top of this, move the MC firmware resume after the DMA setup
> is completed to avoid crashing due to DMA setup not being done yet or
> being deferred.
> 
> Signed-off-by: Laurentiu Tudor <laurentiu.tudor@....com>

I saw regressions on my Honeycomb LX2 (NXP LX2060A) when I switched to
v5.15. The kernel reports so many SMMU errors that the system cannot
function correctly: there is only about a 75% chance that it boots to
the GUI, and even when it does, it hangs soon afterwards.

Bisect took me up a couple of blind alleys (mostly due to unrelated boot
problems in v5.14-rc2) but eventually led me to this patch as the cause.
Applying/unapplying this patch to a v5.14-rc3 tree will provoke/fix the
problem and reverting it against v5.15 also resolves the problem.
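
For reference, the only behavioural difference between the two trees is
the order of the MC resume and the ACPI DMA configuration, plus the early
return on deferral. The user-space sketch below models just that control
flow. Every sim_* name is hypothetical and the bit positions and errno
value are illustrative stand-ins rather than values read from the driver
headers. It does not touch hardware and does not explain why the SMMU
faults occur.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins; the real definitions live in the fsl-mc driver. */
#define SIM_GCR1_P1_STOP (1u << 31)
#define SIM_GCR1_P2_STOP (1u << 30)
#define SIM_EPROBE_DEFER 517

struct sim_mc {
	uint32_t gcr1;   /* simulated FSL_MC_GCR1 register contents */
	int dma_result;  /* value the simulated DMA configure call returns */
};

/* Nonzero while either simulated stop bit is still set. */
int sim_mc_is_paused(const struct sim_mc *mc)
{
	return (mc->gcr1 & (SIM_GCR1_P1_STOP | SIM_GCR1_P2_STOP)) != 0;
}

/* Clear both stop bits, mirroring the read-modify-write in the probe. */
void sim_resume_mc(struct sim_mc *mc)
{
	mc->gcr1 &= ~(SIM_GCR1_P1_STOP | SIM_GCR1_P2_STOP);
	assert(!sim_mc_is_paused(mc));
}

/* Ordering with the revert applied: resume first, DMA errors only warn. */
int sim_probe_resume_first(struct sim_mc *mc)
{
	sim_resume_mc(mc);
	(void)mc->dma_result; /* a failure here is logged, probe continues */
	return 0;
}

/* Ordering in the reverted commit: a deferral returns before the resume. */
int sim_probe_dma_first(struct sim_mc *mc)
{
	if (mc->dma_result == -SIM_EPROBE_DEFER)
		return mc->dma_result;
	sim_resume_mc(mc);
	return 0;
}
```

With the reverted commit's ordering, a deferred DMA configuration leaves
the simulated MC paused until a later probe attempt; with the revert the
MC is always resumed on the first pass.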

Is there some specific firmware version required for this patch to work
correctly?


Daniel.


PS: Below is the revert I applied to the v5.15 kernel (after
    a fairly simple merge conflict fix)

From 4162b64e4f361a6a773e065b592dbc5493202524 Mon Sep 17 00:00:00 2001
From: Daniel Thompson <daniel.thompson@...aro.org>
Date: Thu, 11 Nov 2021 16:50:25 +0000
Subject: [PATCH] Revert "bus: fsl-mc: handle DMA config deferral in ACPI case"

This reverts commit d31e7fe20a2251f87adc6ecefbdaf25e6961ce74 because
it was causing regressions on my Honeycomb LX2 (NXP LX2060A).

All kernels where the problem manifests (as either a boot hang or a desktop
hang) issue the following messages in vast numbers:

~~~
arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
arm_smmu_context_fault: 1697259 callbacks suppressed
~~~

Signed-off-by: Daniel Thompson <daniel.thompson@...aro.org>
---
 drivers/bus/fsl-mc/fsl-mc-bus.c | 26 ++++++++++++--------------
 1 file changed, 12 insertions(+), 14 deletions(-)

diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
index 8fd4a356a86e..429bacc7de20 100644
--- a/drivers/bus/fsl-mc/fsl-mc-bus.c
+++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
@@ -1130,6 +1130,18 @@ static int fsl_mc_bus_probe(struct platform_device *pdev)
 	}
 
 	if (mc->fsl_mc_regs) {
+		/*
+		 * Some bootloaders pause the MC firmware before booting the
+		 * kernel so that MC will not cause faults as soon as the
+		 * SMMU probes due to the fact that there's no configuration
+		 * in place for MC.
+		 * At this point MC should have all its SMMU setup done so make
+		 * sure it is resumed.
+		 */
+		writel(readl(mc->fsl_mc_regs + FSL_MC_GCR1) &
+			     (~(GCR1_P1_STOP | GCR1_P2_STOP)),
+		       mc->fsl_mc_regs + FSL_MC_GCR1);
+
 		if (IS_ENABLED(CONFIG_ACPI) && !dev_of_node(&pdev->dev)) {
 			mc_stream_id = readl(mc->fsl_mc_regs + FSL_MC_FAPR);
 			/*
@@ -1143,25 +1155,11 @@ static int fsl_mc_bus_probe(struct platform_device *pdev)
 			error = acpi_dma_configure_id(&pdev->dev,
 						      DEV_DMA_COHERENT,
 						      &mc_stream_id);
-			if (error == -EPROBE_DEFER)
-				return error;
 			if (error)
 				dev_warn(&pdev->dev,
 					 "failed to configure dma: %d.\n",
 					 error);
 		}
-
-		/*
-		 * Some bootloaders pause the MC firmware before booting the
-		 * kernel so that MC will not cause faults as soon as the
-		 * SMMU probes due to the fact that there's no configuration
-		 * in place for MC.
-		 * At this point MC should have all its SMMU setup done so make
-		 * sure it is resumed.
-		 */
-		writel(readl(mc->fsl_mc_regs + FSL_MC_GCR1) &
-			     (~(GCR1_P1_STOP | GCR1_P2_STOP)),
-		       mc->fsl_mc_regs + FSL_MC_GCR1);
 	}
 
 	/*
-- 
2.33.0
