Message-ID: <CAMj1kXH-BKvykS0wL5BCv5Eh4FWMZxHmM6nHV8MeRACUbWjCPw@mail.gmail.com>
Date:   Mon, 16 Nov 2020 13:20:22 +0100
From:   Ard Biesheuvel <ardb@...nel.org>
To:     Russell King - ARM Linux admin <linux@...linux.org.uk>
Cc:     Guillaume Tucker <guillaume.tucker@...labora.com>,
        Nicolas Pitre <nico@...xnic.net>,
        Linus Walleij <linus.walleij@...aro.org>,
        kernelci-results@...ups.io,
        Linux ARM <linux-arm-kernel@...ts.infradead.org>,
        Olof Johansson <olof@...om.net>,
        Mike Rapoport <rppt@...nel.org>, Marc Zyngier <maz@...nel.org>,
        Anshuman Khandual <anshuman.khandual@....com>,
        Arvind Sankar <nivedita@...m.mit.edu>,
        Linux Doc Mailing List <linux-doc@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Miguel Ojeda <miguel.ojeda.sandonis@...il.com>,
        Jonathan Corbet <corbet@....net>,
        Collabora Kernel ML <kernel@...labora.com>
Subject: Re: rmk/for-next bisection: baseline.login on bcm2836-rpi-2-b

On Mon, 16 Nov 2020 at 12:20, Ard Biesheuvel <ardb@...nel.org> wrote:
>
> On Sun, 15 Nov 2020 at 15:11, Ard Biesheuvel <ardb@...nel.org> wrote:
> >
> > On Fri, 13 Nov 2020 at 17:25, Ard Biesheuvel <ardb@...nel.org> wrote:
> > >
> > > On Fri, 13 Nov 2020 at 17:15, Ard Biesheuvel <ardb@...nel.org> wrote:
> > > >
> > > > On Fri, 13 Nov 2020 at 16:58, Russell King - ARM Linux admin
> > > > <linux@...linux.org.uk> wrote:
> > > > >
> > > > > On Fri, Nov 13, 2020 at 03:43:27PM +0000, Guillaume Tucker wrote:
> > > > > > On 13/11/2020 10:35, Ard Biesheuvel wrote:
> > > > > > > On Fri, 13 Nov 2020 at 11:31, Guillaume Tucker
> > > > > > > <guillaume.tucker@...labora.com> wrote:
> > > > > > >>
> > > > > > >> Hi Ard,
> > > > > > >>
> > > > > > >> Please see the bisection report below about a boot failure on
> > > > > > >> RPi-2b.
> > > > > > >>
> > > > > > >> Reports aren't automatically sent to the public while we're
> > > > > > >> trialing new bisection features on kernelci.org but this one
> > > > > > >> looks valid.
> > > > > > >>
> > > > > > >> There's nothing in the serial console log, probably because it's
> > > > > > >> crashing too early during boot.  I'm not sure if other platforms
> > > > > > >> on kernelci.org were hit by this in the same way, but there
> > > > > > >> doesn't seem to be any.
> > > > > > >>
> > > > > > > >> The same regression can be seen on rmk's for-next branch as well
> > > > > > >> as in linux-next.  It happens with both bcm2835_defconfig and
> > > > > > >> multi_v7_defconfig.
> > > > > > >>
> > > > > > >> Some more details can be found here:
> > > > > > >>
> > > > > > >>   https://kernelci.org/test/case/id/5fae44823818ee918adb8864/
> > > > > > >>
> > > > > > >> If this looks like a real issue but you don't have a platform at
> > > > > > >> hand to reproduce it, please let us know if you would like the
> > > > > > >> KernelCI test to be re-run with earlyprintk or some debug config
> > > > > > >> turned on, or if you have a fix to try.
> > > > > > >>
> > > > > > >> Best wishes,
> > > > > > >> Guillaume
> > > > > > >>
> > > > > > >
> > > > > > > Hello Guillaume,
> > > > > > >
> > > > > > > That patch did have an issue, but it was already fixed by
> > > > > > >
> > > > > > > https://www.armlinux.org.uk/developer/patches/viewpatch.php?id=9020/1
> > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=fc2933c133744305236793025b00c2f7d258b687
> > > > > > >
> > > > > > > Could you please double check whether cherry-picking that on top of
> > > > > > > the first bad commit fixes the problem?
> > > > > >
> > > > > > Sadly this doesn't appear to be fixing the issue.  I've
> > > > > > cherry-picked your patch on top of the commit found by the
> > > > > > > bisection but it still didn't boot. Here's the git log:
> > > > > >
> > > > > > cbb9656e83ca ARM: 9020/1: mm: use correct section size macro to describe the FDT virtual address
> > > > > > 7a1be318f579 ARM: 9012/1: move device tree mapping out of linear region
> > > > > > e9a2f8b599d0 ARM: 9011/1: centralize phys-to-virt conversion of DT/ATAGS address
> > > > > > 3650b228f83a Linux 5.10-rc1
> > > > > >
> > > > > > Test log: https://people.collabora.com/~gtucker/lava/boot/rpi-2-b/v5.10-rc1-3-gcbb9656e83ca/
> > > > > >
> > > > > > There's no output so it's hard to tell what is going on, but
> > > > > > > reverting the bad commit does make the board boot (that's
> > > > > > what "revert: PASS" means in the bisect report).  So it's
> > > > > > unlikely that there is another issue causing the boot failure.
> > > > >
> > > > > These silent boot failures are precisely what the DEBUG_LL stuff (and
> > > > > early_printk) is supposed to help with - getting the kernel messages
> > > > > out when there is an oops before the serial console is initialised.
> > > > >
> > > >
> > > > If this is indeed related to the FDT mapping, I would assume
> > > > earlycon=... to be usable here.
> > > >
> > > > I will try to reproduce this on a RPi3 but I don't have a RPi2 at
> > > > hand, unfortunately.
> > > >
> > > > Would you mind quickly checking whether you can reproduce this on
> > > > QEMU, using the raspi2 machine model? If so, that would be a *lot*
> > > > easier to diagnose.
> > >
> > > Also, please have a go with 'earlycon=pl011,0x3f201000' added to the
> > > kernel command line.
> >
> > I cannot reproduce this - I don't have the exact same hardware, but
> > for booting the kernel, I think RPi2 and RPi3 should be sufficiently
> > similar, and I can boot on RPi3 using a u-boot built for rpi2 using
> > your provided dtb for RPi2.
> >
> > What puzzles me is that u-boot reports itself as
> >
> > U-Boot 2016.03-rc1-00131-g39af3d8-dirty
> >
> > RPI Model B+ (0x10)
> >
> > which is the ARMv6 model, not the ARMv7 one, but then the kernel reports
> >
> > CPU: ARMv7 Processor [410fc075] revision 5 (ARMv7), cr=10c53c7d
> >
>
> Another thing I noticed is that the bootloader on these boards loads
> the FDT at address 0x100, which is described by the FDT itself as
> reserved memory, and which typically holds the spin tables used for
> SMP boot.
>
> Could you try loading the DT elsewhere, and see if that changes anything?

I think I narrowed this down to the early DT mapping code, which
treats any DT address that falls inside the first section (i.e., any
address that rounds down to section base 0, such as the 0x100 used
here) as 'no DT', and then relies on the first section mapping of the
decompressed kernel to cover it instead.

Could you please try the following change?


diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 28687fd1240a..7f62c5eccdf3 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -265,10 +265,10 @@ __create_page_tables:
         * We map 2 sections in case the ATAGs/DTB crosses a section boundary.
         */
        mov     r0, r2, lsr #SECTION_SHIFT
-       movs    r0, r0, lsl #SECTION_SHIFT
+       cmp     r2, #0
        ldrne   r3, =FDT_FIXED_BASE >> (SECTION_SHIFT - PMD_ORDER)
        addne   r3, r3, r4
-       orrne   r6, r7, r0
+       orrne   r6, r7, r0, lsl #SECTION_SHIFT
        strne   r6, [r3], #1 << PMD_ORDER
        addne   r6, r6, #1 << SECTION_SHIFT
        strne   r6, [r3]
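
To illustrate what the change does, here is a rough Python model of the
two checks. It is only a sketch: SECTION_SHIFT = 20 (1 MiB sections,
non-LPAE) is an assumption, the function names old_check/new_check are
made up for this example, and the MMU flags that the real code ORs in
from r7 are left out.

```python
SECTION_SHIFT = 20       # assumption: 1 MiB sections (non-LPAE page tables)
MASK32 = 0xFFFFFFFF      # registers are 32 bits wide


def old_check(r2):
    """Model of the removed 'movs r0, r0, lsl #SECTION_SHIFT' test.

    Returns the section base that would be mapped, or None when the
    address is treated as 'no DT'.
    """
    r0 = r2 >> SECTION_SHIFT
    r0 = (r0 << SECTION_SHIFT) & MASK32   # movs: Z flag set when this is 0
    return r0 if r0 != 0 else None


def new_check(r2):
    """Model of 'cmp r2, #0' plus the shifted operand of orrne."""
    r0 = r2 >> SECTION_SHIFT
    if r2 == 0:                           # only a NULL pointer means 'no DT'
        return None
    return (r0 << SECTION_SHIFT) & MASK32


# 0x100 is the load address reported earlier in this thread; the other
# values are arbitrary examples.
for addr in (0x0, 0x100, 0x08000100):
    print(hex(addr), old_check(addr), new_check(addr))
```

With the old test, a DT at 0x100 is indistinguishable from a NULL DT
pointer because both round down to section base 0. With the new test,
only r2 == 0 is skipped, and section base 0 gets mapped like any other.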
