Message-ID: <086f60d6ef4395db5da7ee22c4f352d5c901d396.camel@mengyan1223.wang>
Date:   Mon, 30 Aug 2021 20:28:09 +0800
From:   Xi Ruoyao <xry111@...gyan1223.wang>
To:     Jiaxun Yang <jiaxun.yang@...goat.com>, linux-mips@...r.kernel.org
Cc:     Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
        linux-kernel@...r.kernel.org, Huacai Chen <chenhuacai@...nel.org>
Subject: Re: [PATCH] mips: remove reference to "newer Loongson-3"

On Mon, 2021-08-30 at 10:32 +0800, Jiaxun Yang wrote:
> 
> On 2021/8/29 20:49, Xi Ruoyao wrote:
> > Newest Loongson-3 processors have moved to use LoongArch
> > architecture.
> > Sadly, the LL/SC issue is still existing on both latest Loongson-3
> > processors using MIPS64 (Loongson-3A4000) and LoongArch
> > (Loongson-3A5000).
> LLSC is fixed on Loongson-3A4000 as per CPUCFG report.

If I don't enable the LL/SC fix, GCC libgomp tests fail on both 3A4000
and 3A5000 (using github.com/loongson/gcc for the latter) with "invalid
access to 0x00000049" or "0x00000005".  This is a race condition: it
does not happen at all with OMP_NUM_THREADS=1, happens with about 10%
probability with OMP_NUM_THREADS=2, and with about 90% probability with
OMP_NUM_THREADS=4 (on 3A5000; on 3A4000 the probability is lower).
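
For anyone wanting to check this outside the libgomp testsuite, a
standalone stress test could look roughly like the sketch below.  It is
not taken from libgomp; the names (cas_increment, run_cas_stress) are
made up for illustration, and it assumes GCC's
__atomic_compare_exchange_n builtin and POSIX threads.  Each thread
increments a shared counter through a compare-exchange retry loop, so
any lost update means a compare-exchange was not atomic.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is not code from libgomp. */
struct cas_stress_arg {
    uint64_t *counter;   /* shared word that every thread updates */
    long iterations;     /* increments performed by each thread */
};

/* Increment *counter once using a compare-exchange retry loop. */
static void cas_increment(uint64_t *counter)
{
    uint64_t seen = __atomic_load_n(counter, __ATOMIC_RELAXED);

    /* On failure the builtin refreshes `seen` with the current value,
       so the next attempt retries with an up-to-date expectation. */
    while (!__atomic_compare_exchange_n(counter, &seen, seen + 1,
                                        0 /* strong variant */,
                                        __ATOMIC_SEQ_CST,
                                        __ATOMIC_RELAXED))
        ;
}

static void *cas_stress_worker(void *p)
{
    struct cas_stress_arg *arg = p;

    for (long i = 0; i < arg->iterations; i++)
        cas_increment(arg->counter);
    return NULL;
}

/* Returns the number of lost updates (0 if every compare-exchange
   behaved atomically), or -1 if the threads could not be started. */
long run_cas_stress(int nthreads, long iterations)
{
    assert(nthreads > 0 && iterations >= 0);

    uint64_t counter = 0;
    struct cas_stress_arg arg = { &counter, iterations };
    pthread_t *tids = malloc(sizeof(*tids) * (size_t)nthreads);
    int started = 0;

    if (!tids)
        return -1;
    for (; started < nthreads; started++)
        if (pthread_create(&tids[started], NULL,
                           cas_stress_worker, &arg) != 0)
            break;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    free(tids);
    if (started != nthreads)
        return -1;

    /* pthread_join orders the workers' writes before this read. */
    return (long)((uint64_t)nthreads * (uint64_t)iterations - counter);
}
```

On hardware where the operation is atomic, run_cas_stress(4, 1000000)
should return 0.  Whether this particular loop is enough to trigger the
failures described above is untested.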

My investigation suggests that a GCC intrinsic,
__atomic_compare_exchange_n, is not really atomic as it should be.

If this is not a hardware issue in the GS464V/LA464 uarch, then it would
be a very improbable coincidence: two unrelated code-generation bugs
for __atomic_compare_exchange_n (the LA port has borrowed some code from
the MIPS port, but the intrinsics are of course newly coded).  And I've
carefully inspected the libgomp and GCC code around
__atomic_compare_exchange_n and spotted nothing wrong, except that
LoongArch GCC supports "-mfix-loongson3-llsc", which adds a "dbar 0"
(like "sync" on MIPS) instruction after SC (only for intrinsics).
Enabling this fixes the libgomp failures.  Likewise,
"-Wa,-mfix-loongson3-llsc" fixes it on 3A4000.
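
For reference, the contract that the builtin is expected to honour can
be written down as a small check.  This is only a sketch: try_swap and
check_cas_contract are hypothetical helpers, and a single-threaded
check like this says nothing about the race itself; it just spells out
what "atomic compare-exchange" promises to its callers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: try to change *word from `from` to `to`.
   Returns true on success.  In both cases *observed receives the
   value that was found in *word at the time of the attempt. */
bool try_swap(uint32_t *word, uint32_t from, uint32_t to,
              uint32_t *observed)
{
    uint32_t expected = from;
    bool ok = __atomic_compare_exchange_n(word, &expected, to,
                                          false /* strong variant */,
                                          __ATOMIC_SEQ_CST,
                                          __ATOMIC_SEQ_CST);

    /* On failure the builtin has overwritten `expected` with the
       current contents of *word; on success it still equals `from`. */
    *observed = expected;
    return ok;
}

/* Exercise both outcomes once, single-threaded. */
void check_cas_contract(void)
{
    uint32_t word = 5, seen = 0;

    /* Expectation matches: the store happens. */
    assert(try_swap(&word, 5, 9, &seen));
    assert(word == 9 && seen == 5);

    /* Expectation is stale: no store, and the real value is reported. */
    assert(!try_swap(&word, 5, 7, &seen));
    assert(word == 9 && seen == 9);
}
```

The failures described above would correspond to two threads both
seeing "success" for the same expected value, which this contract
forbids.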

libgomp code has been verified with thread sanitizer on other
architectures (unfortunately libtsan is not available on MIPS or
LoongArch yet), so it's very unlikely to be a coding error leading to
the race.

And the LL/SC fix is still in Huacai's 3A5000 kernel.  In a mail on
linux-arch, Huacai said it's "not so easy to be fixed".

Or are these two different errata that I have mistaken for the same
one?
