Message-ID: <20240221160849.GB7362@willie-the-truck>
Date: Wed, 21 Feb 2024 16:08:50 +0000
From: Will Deacon <will@...nel.org>
To: "ni.liqiang" <niliqiang.io@...il.com>
Cc: danielmentz@...gle.com, iommu@...ts.linux.dev, jin.qi@....com.cn,
joro@...tes.org, linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org, robin.murphy@....com,
ni.liqiang@....com.cn
Subject: Re: [PATCH] drivers/iommu: Ensure that the queue base address is
successfully written during SMMU initialization.
On Wed, Feb 21, 2024 at 11:26:29PM +0800, ni.liqiang wrote:
> >> The SMMU registers are accessed using Device-nGnRE attributes. It is
> >> my understanding that, for Device-nGnRE, the Arm architecture requires
> >> that writes to the same peripheral arrive at the endpoint in program
> >> order.
> >
> > Yup, that's correct. The "nR" part means "non-Reordering", so something
> > else is going on here.
>
> Yes, the SMMU registers are accessed using Device-nGnRE attributes.
>
> One additional point to note is: in cases where there is a failure writing
> to the CMDQ base address register, the testing environment was a
> multi-die, multi-socket server. This issue has not been observed on a
> single-die server. I apologize for omitting this information in my initial
> patch submission.

Uh-oh, smells like a hardware issue ;p

I wonder if Device-nGnRnE behaves any differently?

> Over the past few days, I have referenced the kernel source code and
> ported the SMMU register initialization process. Through multiple stress
> tests, I have attempted to reproduce the CMDQ base address register write
> failure issue. The summarized results of my experiments are as follows:
> 1. When testing with one CPU core bound using taskset, the initialization
> process was executed 300,000 times without encountering the CMDQ base
> address register write failure issue. However, when not binding CPU using
> taskset, the issue was reproduced around 1,000 iterations into the test.
> 2. Without CPU binding, I inserted a memory barrier between accesses to
> the CMDQ_BASE register and CMDQEN register, similar to the modification
> made in the patch. After executing the initialization process 300,000
> times, the CMDQ base address register write failure issue did not occur.
>
> Based on these observations and joint analysis with CMN colleagues, we
> speculate that in the SMMU register initialization process, if the CPU
> core changes, and these CPUs are located on different dies, the underlying
> 4 CCG ports are utilized to perform die-to-die accesses. However, in our
> current strategy, these 4 CCG ports cannot guarantee ordering, resulting
> in the completion of CMDQEN writing before the completion of CMDQ base
> address writing.

(Disclaimer: I don't know what a CCG port is)

Hmmm. The part that doesn't make sense to me here is that migrating between
CPUs implies context-switching, and we have a DSB on that path in
__switch_to(). So why would adding barriers to the driver help? Maybe it
just changes the timing?

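
For reference, the ordering under discussion can be sketched in user space
roughly as follows. This is a minimal model, not the driver code: the
struct, the bit position and every `fake_` name are hypothetical, ordinary
memory stands in for the MMIO region, and a C11 fence stands in for the
barrier the patch places between the two writes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the two registers discussed in this thread.
 * This is NOT the real SMMU register map; it is plain memory. */
struct fake_smmu_regs {
	volatile uint64_t cmdq_base; /* models the CMDQ base address register */
	volatile uint32_t cr0;       /* models the register holding CMDQEN */
};

/* Illustrative bit position only (assumption for this sketch). */
#define FAKE_CR0_CMDQEN (1u << 3)

/* Stand-in for the barrier added by the patch; on arm64 the kernel would
 * use its own barrier primitives rather than a C11 fence. */
static inline void fake_barrier(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

/* Program the queue base, optionally fence, then set the enable bit. */
static void fake_cmdq_enable(struct fake_smmu_regs *regs, uint64_t q_base,
			     int use_barrier)
{
	regs->cmdq_base = q_base;       /* write 1: queue base address */
	if (use_barrier)
		fake_barrier();         /* order write 1 before write 2 */
	regs->cr0 |= FAKE_CR0_CMDQEN;   /* write 2: enable the queue */
}

/* Returns 1 if the queue is enabled AND the base holds the expected value,
 * i.e. the failure reported in this thread did not occur in the model. */
static int fake_cmdq_ready(const struct fake_smmu_regs *regs,
			   uint64_t expected_base)
{
	return (regs->cr0 & FAKE_CR0_CMDQEN) &&
	       regs->cmdq_base == expected_base;
}
```

With Device-nGnRE mappings the fence ought to be redundant for two writes to
the same peripheral, which is why the reported behaviour looks suspicious.
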
> From the analysis above, it seems that modifying the die-to-die access
> strategy to achieve ordering of Device-nGnRE memory might be a better
> solution compared to adding a memory barrier?

I'm not sure what you're proposing, but I don't think Linux should be
changed to accommodate this.

Will