Message-ID: <87o9pocvjq.fsf@concordia.ellerman.id.au>
Date: Tue, 03 Oct 2017 22:36:41 +1100
From: Michael Ellerman <mpe@...erman.id.au>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: LKML <linux-kernel@...r.kernel.org>,
Ingo Molnar <mingo@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Borislav Petkov <bp@...en8.de>,
Andrew Morton <akpm@...ux-foundation.org>,
Sebastian Siewior <bigeasy@...utronix.de>,
Nicholas Piggin <npiggin@...il.com>,
Don Zickus <dzickus@...hat.com>,
Chris Metcalf <cmetcalf@...lanox.com>,
Ulrich Obergfell <uobergfe@...hat.com>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
linuxppc-dev@...ts.ozlabs.org
Subject: Re: [patch V2 22/29] lockup_detector: Make watchdog_nmi_reconfigure() two stage

Thomas Gleixner <tglx@...utronix.de> writes:
> On Tue, 3 Oct 2017, Michael Ellerman wrote:
>> Hi Thomas,
>>
>> Unfortunately this is hitting the WARN_ON in start_wd_on_cpu() on powerpc
>> because we're calling it multiple times for the boot CPU.
>>
>> The first call is via:
>>
>> start_wd_on_cpu+0x80/0x2f0
>> watchdog_nmi_reconfigure+0x124/0x170
>> softlockup_reconfigure_threads+0x110/0x130
>> lockup_detector_init+0xbc/0xe0
>> kernel_init_freeable+0x18c/0x37c
>> kernel_init+0x2c/0x160
>> ret_from_kernel_thread+0x5c/0xbc
>>
>> And then again via the CPU hotplug registration:
>>
>> start_wd_on_cpu+0x80/0x2f0
>> cpuhp_invoke_callback+0x194/0x620
>> cpuhp_thread_fun+0x7c/0x1b0
>> smpboot_thread_fn+0x290/0x2a0
>> kthread+0x168/0x1b0
>> ret_from_kernel_thread+0x5c/0xbc
>>
>>
>> The first call is new because previously watchdog_nmi_reconfigure()
>> wasn't called from softlockup_reconfigure_threads().
>
> Hmm, don't you have the same problem with CPU hotplug or do you just get
> lucky because the hotplug callback in your code is ordered vs. the
> softlockup thread hotplug callback in a way that this does not hit?

I don't see it with CPU hotplug.

AFAICS that's because softlockup_reconfigure_threads() isn't called for
CPU hotplug. Unless there's a path I'm missing?
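
To make the double start concrete, here is a minimal userspace sketch of the
situation in the two backtraces above. It is not the powerpc watchdog code:
every name in it (wd_enabled_mask, wd_warn_count, sketch_start_wd(),
sketch_reconfigure(), sketch_hotplug_online()) is hypothetical, and the
"warning" is just a counter plus a message on stderr. The only assumption
taken from the thread is that the same start routine is reached once from the
init-time reconfigure path and once from the hotplug online callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define SKETCH_NR_CPUS 64	/* hypothetical limit: one mask bit per CPU */

/* Hypothetical stand-in for a "watchdog enabled on this CPU" mask. */
static unsigned long long wd_enabled_mask;

/* Counts how often the sketch reached the point where a WARN_ON would sit. */
static unsigned int wd_warn_count;

/*
 * Start the pretend watchdog on one CPU.
 * Returns true if this call enabled it, false if it was already enabled.
 */
static bool sketch_start_wd(unsigned int cpu)
{
	unsigned long long bit;

	assert(cpu < SKETCH_NR_CPUS);
	bit = 1ULL << cpu;

	if (wd_enabled_mask & bit) {
		/* Second start for the same CPU: count it and complain. */
		wd_warn_count++;
		fprintf(stderr, "sketch: watchdog already started on CPU %u\n", cpu);
		return false;
	}

	wd_enabled_mask |= bit;
	return true;
}

/* Path 1 (assumed): init-time reconfigure starts every CPU in online_mask. */
static void sketch_reconfigure(unsigned long long online_mask)
{
	for (unsigned int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		if (online_mask & (1ULL << cpu))
			sketch_start_wd(cpu);
}

/* Path 2 (assumed): the hotplug "online" callback starts a single CPU. */
static void sketch_hotplug_online(unsigned int cpu)
{
	sketch_start_wd(cpu);
}
```

Calling sketch_reconfigure() with only CPU 0 online and then
sketch_hotplug_online(0) bumps wd_warn_count once, while a later
sketch_hotplug_online(1) does not. Whether the real fix is to drop the
warning or to avoid the second call is the question under discussion here.
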
>> I'm not sure what the easiest fix is. One option would be to just drop
>> the WARN_ON, it's just there for paranoia AFAICS.
>
> The straightforward way is to make use of the new probe function. Patch
> below.

Thanks.

Hmm, I tried that patch, it makes the warning go away. But then I
triggered a deliberate hard lockup and got nothing.

Then I went back to the existing code (in linux-next), and I still get
no warning from a deliberate hard lockup.

So it seems there may be some more gremlins. Will test more in the morning.

cheers