[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aa8d59de-7ff4-153d-082f-f5eb400cd6b6@roeck-us.net>
Date: Thu, 8 Sep 2022 06:47:06 -0700
From: Guenter Roeck <linux@...ck-us.net>
To: Christophe Leroy <christophe.leroy@...roup.eu>,
Oliver O'Halloran <oohall@...il.com>
Cc: linuxppc-dev <linuxppc-dev@...ts.ozlabs.org>,
Paul Mackerras <paulus@...ba.org>,
Geert Uytterhoeven <geert@...ux-m68k.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Michael Ellerman <mpe@...erman.id.au>
Subject: Re: [PATCH] powerpc/chrp: Revert "Move PHB discovery" and "Make
hydra_init() static"
On 9/7/22 23:23, Christophe Leroy wrote:
> Hi All,
>
> Le 18/07/2021 à 18:09, Oliver O'Halloran a écrit :
>> On Sun, Jul 18, 2021 at 2:24 AM Guenter Roeck <linux@...ck-us.net> wrote:
>>>
>>> On Sat, Jul 17, 2021 at 05:57:50PM +0200, Christophe Leroy wrote:
>>>> Guenter Roeck <linux@...ck-us.net> a écrit :
>>>>
>>>>> This patch reverts commit 407d418f2fd4 ("powerpc/chrp: Move PHB
>>>>> discovery") and commit 9634afa67bfd ("powerpc/chrp: Make hydra_init()
>>>>> static").
>>>>>
>>>>> Running the upstream kernel on Qemu's brand new "pegasos2" emulation
>>>>> results in a variety of backtraces such as
>>>>>
>>>>> Kernel attempted to write user page (a1) - exploit attempt? (uid: 0)
>>>>> ------------[ cut here ]------------
>>>>> Bug: Write fault blocked by KUAP!
>>>>> WARNING: CPU: 0 PID: 0 at arch/powerpc/mm/fault.c:230
>>>>> do_page_fault+0x4f4/0x920
>>>>> CPU: 0 PID: 0 Comm: swapper Not tainted 5.13.2 #40
>>>>> NIP: c0021824 LR: c0021824 CTR: 00000000
>>>>> REGS: c1085d50 TRAP: 0700 Not tainted (5.13.2)
>>>>> MSR: 00021032 <ME,IR,DR,RI> CR: 24042254 XER: 00000000
>>>>>
>>>>> GPR00: c0021824 c1085e10 c0f8c520 00000021 3fffefff c1085c60 c1085c58
>>>>> 00000000
>>>>> GPR08: 00001032 00000000 00000000 c0ffb3ec 44042254 00000000 00000000
>>>>> 00000004
>>>>> GPR16: 00000000 ffffffff 000000c4 000000d0 0188c6e0 01006000 00000001
>>>>> 40b14000
>>>>> GPR24: c0ec000c 00000300 02000000 00000000 42000000 000000a1 00000000
>>>>> c1085e60
>>>>> NIP [c0021824] do_page_fault+0x4f4/0x920
>>>>> LR [c0021824] do_page_fault+0x4f4/0x920
>>>>> Call Trace:
>>>>> [c1085e10] [c0021824] do_page_fault+0x4f4/0x920 (unreliable)
>>>>> [c1085e50] [c0004254] DataAccess_virt+0xd4/0xe4
>>>>>
>>>>> and the system fails to boot. Bisect points to commit 407d418f2fd4
>>>>> ("powerpc/chrp: Move PHB discovery"). Reverting this patch together with
>>>>> commit 9634afa67bfd ("powerpc/chrp: Make hydra_init() static") fixes
>>>>> the problem.
>>>>
>>>> Isn't there more than that in the backtrace ? If there is a fault blocked by
>>>> Kuap, it means there is a fault. It should be visible in the traces.
>>>>
>>>> Should we fix the problem instead of reverting the commit that made the
>>>> problem visible ?
>>>>
>>>
>>> I do not think the patch reverted here made the problem visible. I am
>>> quite sure that it introduced it. AFAIS the problem is that the new code
>>> initializes and remaps PCI much later, after it is being used.
>>
>> Right. The bug is that on 32bit platforms the PHB setup also maps one
>> of the PHB's IO space as "ISA IO space" as a side effect. There's a
>> handful of platforms (pegasos2 is one) which use an i8259 interrupt
>> controller and configuring that requires access to IO / ISA space. The
>> KUAP faults we're setting are because isa_io_base is still set to zero
>> so outb() and friends are accessing the zero page.
>>
>> I don't think there's any real reason why we need to have PCI fully
>> set up to handle that situation. A few platforms already have early
>> fixup code which parses the DT directly rather than using the fields
>> of pci_controller (which are parsed from the DT anyway) and I'm pretty
>> sure we can do something similar.
>>
>>> Also, the
>>> patch introducing the problem was never tested on real hardware (it even
>>> says so in the patch comments). That by itself seems to be quite
>>> problematic for such an invasive patch, and makes me wonder if some of
>>> the other PHB discovery related patches introduced similar problems.
>>
>> The legacy platforms are maintained on a best-effort basis. Ellerman's
>> CI farm covers most of the powerpc CPU types, but there's no real way
>> to test the bulk of the platforms in the tree since most of the
>> hardware is currently in landfill.
>>
>>> Anyway, I do not use or have that hardware. I was just playing with the
>>> latest version of qemu and ended up tracking down why its brand new
>>> pegasos2 emulation no longer boots with the latest Linux kernel.
>>> I personally don't care too much if ppc/chrp support is broken in the
>>> upstream kernel or not. Please take this patch as problem report,
>>> and feel free to do with it whatever you like, including ignoring it.
>>
>> Problem reports are fine and appreciated. I'd be less cranky if you
>> included the kernel config you used in the initial report since I
>> wasted an hour of my saturday trying to replicate it with various
>> kernel configs that had SMP enabled since that's what the
>> chrp_defconfig uses.
>>
>
>
> What is the status now ? I see this revert patch is still hanging around in patchwork, is there still a problem ?
>
Presumably, but my understanding was "we are not going to fix this",
so I stopped testing with the pegasos2 emulation a long time ago.
Guenter
Powered by blists - more mailing lists