[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1084959304.73550.1658244469031.JavaMail.zimbra@efficios.com>
Date: Tue, 19 Jul 2022 11:27:49 -0400 (EDT)
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Arnd Bergmann <arnd@...db.de>
Cc: James Morse <james.morse@....com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Catalin Marinas <catalin.marinas@....com>,
"Russell King, ARM Linux" <linux@...linux.org.uk>,
Mark Rutland <mark.rutland@....com>,
Will Deacon <will@...nel.org>, Shawn Guo <shawnguo@...nel.org>,
Sascha Hauer <s.hauer@...gutronix.de>,
Michael Jeanson <mjeanson@...icios.com>,
Linux ARM <linux-arm-kernel@...ts.infradead.org>
Subject: Re: [RFC PATCH] arm: i.MX6 Cortex-A9: Fix memory ordering
inconsistency by disabling prefetch instructions
----- On Jul 19, 2022, at 10:33 AM, Arnd Bergmann arnd@...db.de wrote:
> On Mon, Jul 18, 2022 at 4:51 PM Mathieu Desnoyers
> <mathieu.desnoyers@...icios.com> wrote:
>
>> Request for Feedback
>> ====================
>>
>> This fix targets all i.MX configurations, but it is likely too broad (or
>> too narrow). It would be great if people with access to different
>> Freescale i.MX test boards, and test boards from other vendors, could try
>> to reproduce the issue to figure out what would be the right scope for
>> this fix.
>>
>> It would also be great if people with knowledge of the ARM CPU internals
>> could help understanding whether this fix really fixes an issue between
>> prefetch and memory barriers, or just happens to hide the issue. It
>> would be good to understand whether this issue only affects PLDW or if
>> it also affects the PLD instruction.
>
> I don't have any relevant hardware at hand, but looked at this for a few
> hours today, unfortunately without any notable success. Just documenting
> what I did here:
>
> - looked at the errata lists for cortex-a9 r2, for pl310 and for
> imxq6q to see if
> anything stuck out. I assume you've already done the same, but I can confirm
> that the errata that would match the symptom are listed as fixed in r2p10
> or earlier.
Yes, I've spent some quality time reading through those errata in the past 2 weeks,
and did not find anything relevant for the r2p10.
>
> - looked at objdump output from
> linux-image-5.18.0-0.bpo.1-armmp_5.18.2-1~bpo11+1_armhf.deb
> (not the same version, but hopefully be close enough), and compared that
> to v5.18.2 built with the same config using gcc-7.5 and gcc-10.3 to
> see if I could tell what is different. The output looks very similar, though
> my own gcc-10 apparently fails to inline arch_futex_atomic_op_inuser()
> and futex_atomic_op_inuser(). This looks like something we may want
> to force-inline in principle, but it seems unrelated to the bug you found
> since the debian vmlinux has these functions inlined and I don't think
> they are actually part of the broken code path.
Indeed, those op_inuser did not appear to be used in FUTEX_WAKE, FUTEX_WAIT
AFAIR, so I don't think the delta is relevant here.
>
> - looked for other quad-core Cortex-A9 SoCs to find someone with a
> similar revision to check if they have the same bug. The closest I
> can think of is the OMAP4 that uses an A9 r1p2.
Good to know.
>
> - Looked at the disabled errata handling in arch/arm/Kconfig.
> Unfortunately a couple of the workarounds we have there are
> now always disabled because of a dependency on
> ARCH_MULTIPLATFORM. It's a long shot, but you could try
> removing the dependencies and enabling all the Cortex-A9
> fixes like ARM_ERRATA_742230, ARM_ERRATA_742231,
> ARM_ERRATA_743622, ARM_ERRATA_751472, and
> ARM_ERRATA_754327.
I already attempted this, but ended up understanding that
handling of those errata workarounds were simply moved to U-Boot,
so it can set the relevant bits in the Diagnostic Control Register
at boot-time when allowed by the current privilege level, before
loading a secure boot Linux kernel. That being said, my test
system does not use secure boot.
U-Boot 2021.01+dfsg-4 has:
/usr/share/doc/u-boot-imx/configs/config.wandboard.gz :
CONFIG_ARM_ERRATA_743622=y
CONFIG_ARM_ERRATA_751472=y
CONFIG_ARM_ERRATA_761320=y
CONFIG_ARM_ERRATA_794072=y
CONFIG_ARM_ERRATA_845369=y
About errata 742230 ("ARM errata: DMB operation may be faulty"), it only
applies to Cortex-A9 r1p0..r2p2, which explains why the wandboard U-Boot
config has it =n. Nevertheless, I attempted modifying the Linux kernel code
to explicitly change the implementation of smp_mb() from dmb to dsb, but
it did not solve the issue.
About errata 754327 ("ARM errata: no automatic Store Buffer drain"), it
applies prior to r2p0, and is enabled in my Debian kernel configuration
already, because it does not depend on !ARCH_MULTIPLATFORM. The issue
reproduces with this work-around enabled.
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
Powered by blists - more mailing lists