lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87o911vktx.fsf@concordia.ellerman.id.au>
Date:   Wed, 07 Aug 2019 11:13:46 +1000
From:   Michael Ellerman <mpe@...erman.id.au>
To:     Chris Packham <Chris.Packham@...iedtelesis.co.nz>,
        "linuxppc-dev\@lists.ozlabs.org" <linuxppc-dev@...ts.ozlabs.org>,
        "christophe.leroy\@c-s.fr" <christophe.leroy@....fr>,
        "npiggin\@gmail.com" <npiggin@...il.com>
Cc:     "linux-kernel\@vger.kernel.org" <linux-kernel@...r.kernel.org>,
        Grant McEwan <grant.mcewan@...iedtelesis.co.nz>
Subject: Re: SMP lockup at boot on Freescale/NXP T2080 (powerpc 64)

Chris Packham <Chris.Packham@...iedtelesis.co.nz> writes:
> On Tue, 2019-08-06 at 21:32 +1000, Michael Ellerman wrote:
>> Chris Packham <Chris.Packham@...iedtelesis.co.nz> writes:
>> > On Mon, 2019-08-05 at 14:06 +1200, Chris Packham wrote:
>> > > 
>> > > Hi All,
>> > > 
>> > > I have a custom board that uses the Freescale/NXP T2080 SoC.
>> > > 
>> > > The board boots fine using v4.19.60 but when I use v5.1.21 it
>> > > locks
>> > > up
>> > > waiting for the other CPUs to come online (earlyprintk output
>> > > below).
>> > > If I set maxcpus=0 then the system boots all the way through to
>> > > userland. The same thing happens with 5.3-rc2.
>> > > 
>> > > The defconfig I'm using is 
>> > > https://gist.github.com/cpackham/f24d0b426f3
>> > > de0eaaba17b82c3528a9d it was updated from the working v4.19.60
>> > > defconfig using make olddefconfig.
>> > > 
>> > > Does this ring any bells for anyone?
>> > > 
>> > > I haven't dug into the differences between the working an non-
>> > > working
>> > > versions yet. I'll start looking now.
>> > I've bisected this to the following commit
>> Thanks that's super helpful.
>> 
>> > 
>> > commit ed1cd6deb013a11959d17a94e35ce159197632da
>> > Author: Christophe Leroy <christophe.leroy@....fr>
>> > Date:   Thu Jan 31 10:08:58 2019 +0000
>> > 
>> >     powerpc: Activate CONFIG_THREAD_INFO_IN_TASK
>> >     
>> >     This patch activates CONFIG_THREAD_INFO_IN_TASK which
>> >     moves the thread_info into task_struct.
>> > 
>> > I'll be the first to admit this is well beyond my area of knowledge
>> > so
>> > I'm unsure what about this patch is problematic but I can be fairly
>> > sure that a build immediately before this patch works while a build
>> > with this patch hangs.
>> It makes a pretty fundamental change to the way the kernel stores
>> some
>> information about each task, moving it off the stack and into the
>> task
>> struct.
>> 
>> It definitely has the potential to break things, but I thought we had
>> reasonable test coverage of the Book3E platforms, I have a p5020ds
>> (e5500) that I boot as part of my CI.
>> 
>> Aha. If I take your config and try to boot it on my p5020ds I get the
>> same behaviour, stuck at SMP bringup. So it seems it's something in
>> your
>> config vs corenet64_smp_defconfig that is triggering the bug.
>> 
>> Can you try bisecting what in the config triggers it?
>> 
>> To do that you checkout ed1cd6deb013a11959d17a94e35ce159197632da,
>> then
>> you build/boot with corenet64_smp_defconfig to confirm it works. Then
>> you use tools/testing/ktest/config-bisect.pl to bisect the changes in
>> the .config.
>> 
>
> The difference between a working and non working defconfig is
> CONFIG_PREEMPT specifically CONFIG_PREEMPT=y makes my system hang at
> boot.
>
> Is that now intentionally prohibited on 64-bit powerpc?

It's not prohibitied, but it probably should be because no one really
tests it properly. I have a handful of IBM machines where I boot a
PREEMPT kernel but that's about it.

The corenet configs don't have PREEMPT enabled, which suggests it was
never really supported on those machines.

But maybe someone from NXP can tell me otherwise.

cheers

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ