Date:   Thu, 16 Jan 2020 15:11:55 +0000
From:   Robert Richter <rrichter@...vell.com>
To:     Thomas Gleixner <tglx@...utronix.de>
CC:     Waiman Long <longman@...hat.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Ingo Molnar <mingo@...nel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Mike Rapoport <rppt@...ux.ibm.com>,
        Kees Cook <keescook@...omium.org>,
        Catalin Marinas <catalin.marinas@....com>,
        Will Deacon <will@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH v2] watchdog: Fix possible soft lockup warning at bootup

Thanks Thomas for cc'ing me.

On 16.01.20 12:44:07, Thomas Gleixner wrote:
> Thomas Gleixner <tglx@...utronix.de> writes:
> 
> Added ARM64 and ThunderX folks 
> 
> > Waiman Long <longman@...hat.com> writes:
> >> By adding some instrumentation code, it was found that for cpu 14,
> >> watchdog_enable() was called early with a timestamp of 1. That activates
> >> the watchdog time checking logic. It was also found that the monotonic
> >> time measured during the smp_init() phase runs much slower than the
> >> real elapsed time as shown by the below debug printf output:
> >>
> >>   [    1.138522] run_queues, watchdog_timer_fn: now =  170000000
> >>   [   25.519391] run_queues, watchdog_timer_fn: now = 4170000000
> >>
> >> In this particular case, it took about 24.4s of elapsed time for the
> >> clock to advance 4s, which is the soft expiration time required to
> >> trigger the calling of watchdog_timer_fn(). That clock slowdown
> >> stopped once the smp_init() call was done, and the clock time ran at
> >> the same rate as the elapsed time afterward.
> 
> And looking at this with a more awake brain, the root cause is pretty
> obvious.
> 
> sched_clock() advances by 24 seconds, but clock MONOTONIC on which the
> watchdog timer is based does not. As the timestamps you printed have 7
> trailing zeros, it's pretty clear that timekeeping is still jiffies
> based at this point and HZ is set to 100.
> 
> So while bringing up the non-boot CPUs the boot CPU loses ~2000 timer
> interrupts. That needs to be fixed and not papered over.
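
Just to sanity check those numbers: a quick userspace calculation with
the two debug timestamps quoted above (and HZ=100, as inferred from the
10 ms resolution of the printed values) does come out at roughly 2000
lost ticks:

/* Back-of-the-envelope check of the "~2000 lost timer interrupts"
 * estimate; the two timestamps are taken from the printk output above.
 */
#include <stdio.h>

int main(void)
{
	double elapsed   = 25.519391 - 1.138522; /* wall time between the printks, ~24.4s */
	double monotonic = 4.170 - 0.170;        /* advance of 'now' (CLOCK_MONOTONIC), 4.0s */
	int    hz        = 100;                  /* jiffies-based timekeeping at this point */

	/* ticks that should have updated jiffies but did not, ~2038 */
	printf("lost ticks ~= %.0f\n", (elapsed - monotonic) * hz);
	return 0;
}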

I looked into this back in December and observed the following soft
lockups on cpus #15 to #155 (out of 224):

[   22.697242] watchdog: BUG: soft lockup - CPU#15 stuck for 20s! [swapper/15:0]
[   22.990042] watchdog: BUG: soft lockup - CPU#16 stuck for 20s! [swapper/16:0]
[   23.139546] watchdog: BUG: soft lockup - CPU#17 stuck for 20s! [swapper/17:0]
[   23.280614] watchdog: BUG: soft lockup - CPU#18 stuck for 20s! [swapper/18:0]
[   23.431378] watchdog: BUG: soft lockup - CPU#20 stuck for 20s! [swapper/20:0]
[   23.431747] watchdog: BUG: soft lockup - CPU#19 stuck for 20s! [swapper/19:0]
[   23.722134] watchdog: BUG: soft lockup - CPU#21 stuck for 21s! [swapper/21:0]
[   23.723440] watchdog: BUG: soft lockup - CPU#22 stuck for 21s! [swapper/22:0]
[   24.010887] watchdog: BUG: soft lockup - CPU#36 stuck for 20s! [swapper/36:0]
...
[   25.149322] watchdog: BUG: soft lockup - CPU#150 stuck for 11s! [swapper/150:0]
[   25.155822] watchdog: BUG: soft lockup - CPU#151 stuck for 11s! [swapper/151:0]
[   25.162345] watchdog: BUG: soft lockup - CPU#152 stuck for 11s! [swapper/152:0]
[   25.178463] watchdog: BUG: soft lockup - CPU#153 stuck for 11s! [swapper/153:0]
[   25.184929] watchdog: BUG: soft lockup - CPU#154 stuck for 11s! [swapper/154:0]
[   25.191382] watchdog: BUG: soft lockup - CPU#155 stuck for 11s! [swapper/155:0]

The reason it starts at #15 is that this is the first cpu where the
timer starts at 1 sec; cpus #1 to #14 have a value of 0, which is
handled specially (see watchdog_timer_fn(), if (touch_ts == 0) ...).
Otherwise those cpus would also show the soft lockup.
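
For reference, the check looks roughly like this (a condensed
paraphrase of watchdog_timer_fn() in kernel/watchdog.c, not the
verbatim code):

static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
{
	unsigned long touch_ts = __this_cpu_read(watchdog_touch_ts);

	/* ... kick the hardlockup detector, hand softlockup_fn() to the
	 * stopper thread, re-arm the hrtimer ... */

	if (touch_ts == 0) {
		/* Timestamp 0 is treated as "just touched", so cpus #1
		 * to #14 never reach the check below. */
		__touch_watchdog();
		return HRTIMER_RESTART;
	}

	/* Warn if clock MONOTONIC has advanced more than the soft
	 * lockup threshold past the last touch. */
	if (time_after(get_timestamp(), touch_ts + get_softlockup_thresh()))
		pr_emerg("watchdog: BUG: soft lockup - CPU#%d stuck ...\n",
			 smp_processor_id());

	return HRTIMER_RESTART;
}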

The reason it stops after cpu #155 is that by then the boot process has
advanced far enough (25 secs) that watchdog_touch_ts is now updated
properly.

During secondary cpu bringup I have seen the hrtimer callback
watchdog_timer_fn() running correctly, and get_timestamp() is correct
as well. But watchdog_touch_ts is not updated because softlockup_fn()
does not run for more than 23 secs; it still keeps the original time
from watchdog_enable() (e.g. 1 for cpu #15). So the high-prio stopper
thread which calls softlockup_fn() is not scheduled for some reason.
But it looks like scheduling starts around the time the primary cpu has
finished bringing all secondary cpus online (smp_init()).

[   23.431441] watchdog: __touch_watchdog:277: cpu 20, old: 1, new: 21, from: softlockup_fn+0x14/0x38, thresh: 5
[   23.877843] CPU223: Booted secondary processor 0x0000011b03 [0x431f0af2]
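
To spell out that refresh path (again a simplified paraphrase of
kernel/watchdog.c, details omitted): watchdog_touch_ts is only
refreshed by softlockup_fn(), which the hrtimer callback hands to the
per-cpu stopper thread, so if that thread never runs the timestamp
stays stale:

static void __touch_watchdog(void)
{
	__this_cpu_write(watchdog_touch_ts, get_timestamp());
}

/*
 * Runs in the per-cpu stopper thread ("migration/N"), queued from the
 * watchdog hrtimer callback via stop_one_cpu_nowait(). If the stopper
 * thread does not get to run, watchdog_touch_ts keeps the value
 * written by watchdog_enable() (e.g. 1 for cpu #15).
 */
static int softlockup_fn(void *data)
{
	__touch_watchdog();
	complete(this_cpu_ptr(&softlockup_completion));
	return 0;
}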

So your theory that the MONOTONIC clock runs differently/wrongly could
explain that (assuming it drives the sched clock). Though I am
wondering what state the scheduler is in up to the point where it
finally starts running, idle?

On the other hand, could it be that the scheduler starts running only
after the primary cpu finishes smp_init()? Meaning, if that takes
longer than the soft_thresh, we will see soft lockups?

Note: We have seen this triggered by a BIOS change that caused longer
delays when onlining cpus via the firmware call.

Thanks,

-Robert
