Message-ID: <20170731122238.GA277@x4>
Date:   Mon, 31 Jul 2017 14:22:38 +0200
From:   Markus Trippelsdorf <markus@...ppelsdorf.de>
To:     Alan Cox <gnomes@...rguk.ukuu.org.uk>
Cc:     Satoru Takeuchi <satoru.takeuchi@...il.com>,
        LKML <linux-kernel@...r.kernel.org>, x86@...nel.org
Subject: Re: [FYI] GCC segfaults under heavy multithreaded compilation with
 AMD Ryzen

On 2017.07.31 at 13:04 +0100, Alan Cox wrote:
> On Wed, 26 Jul 2017 06:54:01 +0900
> Satoru Takeuchi <satoru.takeuchi@...il.com> wrote:
> 
> > # I'm a LKML subscriber, but not a x86 list subscriber
> > 
> > I found the following new linux kernel bugzilla about Ryzen related problem.
> > Since many developers don't check this bugzilla and I've also
> > encountered this problem,
> > I decided to introduce this problem here.
> 
> Historically we've seen exactly these symptoms on all kinds of systems
> where the memory is at fault, even in cases where memtest86 passes.
> Whether there's a specific problem on some Ryzen boards is a question for
> AMD, but if I saw this without knowing the CPU I'd suspect memory
> firstly. GCC it turns out is by accident an amazingly effective memory
> testing tool.
> 
> If it is memory corruption problems then no - the kernel cannot work
> around that level of hardware failure. The BIOS may be able to if it is a
> board or compatibility problem as memory tuning is usually done by the
> BIOS.

People are seeing these segfaults even with ECC memory (and EDAC
enabled). There are no ECC related MCEs in their logs.

Also, for some people the segfaults went away after they RMAed their CPU.
Others are not so lucky and still see segfaults after the RMA.

To me it looks like a chip binning issue.

-- 
Markus
