Date:   Thu, 8 Mar 2018 09:49:18 -0800
From:   Randy Dunlap <rdunlap@...radead.org>
To:     Brian Rak <brak@...eservers.com>, linux-kernel@...r.kernel.org
Subject: Re: Hang while booting 4.15.7

On 03/08/2018 08:21 AM, Brian Rak wrote:
> We have some Dell servers running Intel Gold 6126 processors. Some of them hang on boot under 4.15.7,  but work fine on 4.14.14.  When they hang, we see the following on console:
> 
> Error parsing PCC subspaces from PCCT
> watchdog: BUG: soft lockup - CPU #16 stuck for 23s! [swapper/0:1]
> 
> We see that PCC subspaces error under 4.14 as well, but it doesn't cause the machine to hang.
> 
> So far we haven't been able to correlate these hangs with anything in particular.  Some machines will hang, some machines will boot.  They're otherwise identical as far as hardware and firmware goes.
> 
> I've tried pcie_aspm=off, since that seems to be the next bit of code that's being executed.  This resulted in the machine booting a little further, but then oopsing somewhere in acpi_os_purge_cache. I'm not able to get a full trace there, as I don't have serial access easily available.
> 
> Any suggestions?
> 

Hi,

The first thing that I would do is boot with:
  ignore_loglevel initcall_debug
on the kernel boot command line.

That will add lots of messages and maybe give us a stronger hint about where
the hang is actually happening.
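
Once you have captured that output somewhere (netconsole, serial, or typed in
from a photo), a small script can point at the initcall that never finished.
The following is only a rough sketch: it assumes the usual initcall_debug line
shapes ("calling  <fn> @ <pid>" and "initcall <fn> returned <rc> after <n>
usecs"), and the function name unfinished_initcalls and the sample log lines
are made up for illustration, not taken from this machine.

```python
import re
import sys

# Assumed initcall_debug line shapes (an assumption, not taken from this thread):
#   calling  some_init+0x0/0x50 @ 1
#   initcall some_init+0x0/0x50 returned 0 after 12 usecs
CALL_RE = re.compile(r"calling\s+(\S+)\s+@\s+\d+")
DONE_RE = re.compile(r"initcall\s+(\S+)\s+returned\s+-?\d+\s+after\s+\d+\s+usecs")


def unfinished_initcalls(lines):
    """Return initcalls that started but never reported a return, in start order."""
    pending = []
    for line in lines:
        done = DONE_RE.search(line)
        if done:
            # This initcall completed, so it is not where the boot got stuck.
            if done.group(1) in pending:
                pending.remove(done.group(1))
            continue
        call = CALL_RE.search(line)
        if call:
            pending.append(call.group(1))
    return pending


if __name__ == "__main__":
    if len(sys.argv) > 1:
        # Hypothetical usage: python3 initcalls.py boot.log
        with open(sys.argv[1], errors="replace") as fh:
            log_lines = fh.readlines()
    else:
        # Made-up sample lines, for illustration only.
        log_lines = [
            "[    1.000000] calling  example_a_init+0x0/0x10 @ 1",
            "[    1.000100] initcall example_a_init+0x0/0x10 returned 0 after 5 usecs",
            "[    1.000200] calling  example_b_init+0x0/0x20 @ 1",
        ]
    for name in unfinished_initcalls(log_lines):
        print("started but never returned:", name)
```

The last entry printed is usually the most interesting one, since it is the
initcall that was running when the log stopped.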

In the worst case (no boot log via serial console or netconsole), take a
photo of the screen showing the oops messages.

And if you are fairly certain that it's an ACPI issue, also write to the
linux-acpi@...r.kernel.org mailing list.

-- 
~Randy
