Message-ID: <87blnqib81.fsf@mpe.ellerman.id.au>
Date: Fri, 17 Apr 2020 21:49:02 +1000
From: Michael Ellerman <mpe@...erman.id.au>
To: "Naveen N. Rao" <naveen.n.rao@...ux.ibm.com>,
Qian Cai <cai@....pw>, Russell Currey <ruscur@...sell.cc>
Cc: LKML <linux-kernel@...r.kernel.org>,
linuxppc-dev <linuxppc-dev@...ts.ozlabs.org>,
Nicholas Piggin <npiggin@...il.com>,
Steven Rostedt <rostedt@...dmis.org>
Subject: Re: POWER9 crash due to STRICT_KERNEL_RWX (WAS: Re: Linux-next POWER9 NULL pointer NIP...)
"Naveen N. Rao" <naveen.n.rao@...ux.ibm.com> writes:
> Hi Qian,
>
> Qian Cai wrote:
>> OK, reverted the commit,
>>
>> c55d7b5e6426 (“powerpc: Remove STRICT_KERNEL_RWX incompatibility with RELOCATABLE”)
>>
>> or set STRICT_KERNEL_RWX=n fixed the crash below and also mentioned in this thread,
>>
>> https://lore.kernel.org/lkml/15AC5B0E-A221-4B8C-9039-FA96B8EF7C88@lca.pw/
>
> Do you see any errors logged in dmesg when you see the crash?
> STRICT_KERNEL_RWX changes how patch_instruction() works, so it would be
> interesting to see if there are any ftrace-related errors thrown before
> the crash.
I've been able to reproduce it with STRICT_KERNEL_RWX=y by concurrently
running:
# while true; do echo function > /sys/kernel/debug/tracing/current_tracer ; echo nop > /sys/kernel/debug/tracing/current_tracer ; done
and:
# while true; do find /lib/modules/$(uname -r) -name '*.ko' -printf "%f\n" | sed -e "s/\.ko//" | xargs -i modprobe -va {}; lsmod | awk '{print $1}' | xargs -i modprobe -vr {}; done
i.e. stressing module loading/unloading and ftrace at the same time.
It doesn't reproduce every time, but it usually does within 10-20 minutes.
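For convenience, the two loops above could be wrapped into one script, roughly as sketched below. This is only an illustration: the helper names and the TRACER_FILE variable are made up here, basename stands in for the -printf/sed pair, and the lsmod header line is skipped. It has not been run against the failing kernel.

```shell
#!/bin/sh
# Sketch only: helper names and TRACER_FILE are illustrative assumptions,
# not something taken from the kernel tree or from this thread.
TRACER_FILE=${TRACER_FILE:-/sys/kernel/debug/tracing/current_tracer}

# Switch the current tracer to "function" and back to "nop" once.
toggle_tracer_once() {
    echo function > "$TRACER_FILE" && echo nop > "$TRACER_FILE"
}

# Print the name of every *.ko file under directory $1, without the suffix.
list_module_names() {
    find "$1" -name '*.ko' -exec basename {} .ko \;
}

# Try to load every module found on disk, then try to unload every loaded one.
cycle_modules_once() {
    list_module_names "/lib/modules/$(uname -r)" | xargs -I{} modprobe -va {}
    lsmod | awk 'NR > 1 {print $1}' | xargs -I{} modprobe -vr {}
}

# Run both stress loops concurrently until interrupted (needs root).
run_stress() {
    while true; do toggle_tracer_once; done &
    tracer_pid=$!
    # Stop the background ftrace loop when this shell exits.
    trap 'kill "$tracer_pid" 2>/dev/null' EXIT
    trap 'exit 130' INT TERM
    while true; do cycle_modules_once; done
}
```

Nothing runs until run_stress is called explicitly, e.g. by appending a run_stress line to the script and starting it as root on a kernel built with STRICT_KERNEL_RWX=y.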
It looks like sometimes our __patch_instruction() fails, and then that
somehow leads to things getting further messed up. Presumably we have
some bad error handling somewhere.
cheers