lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <YKKFOoMVfAZtDWqE@alley>
Date:   Mon, 17 May 2021 17:01:14 +0200
From:   Petr Mladek <pmladek@...e.com>
To:     Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Laurence Oberman <loberman@...hat.com>,
        Vincent Whitchurch <vincent.whitchurch@...s.com>,
        Michal Hocko <mhocko@...e.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 6/7] watchdog: Cleanup handling of false positives

On Sun 2021-05-16 19:46:21, Sergey Senozhatsky wrote:
> Hi,
> 
> // This was never in my inbox, so sorry if I mess up the "Reply-to"
> // Original message:  https://lore.kernel.org/lkml/20210311122130.6788-7-pmladek@suse.com/
> 
> 
> >@@ -375,7 +375,14 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
> > 	/* .. and repeat */
> > 	hrtimer_forward_now(hrtimer, ns_to_ktime(sample_period));
> >
> > -	/* Reset the interval when touched externally by a known slow code. */
> > +	/*
> > +	 * If a virtual machine is stopped by the host it can look to
> > +	 * the watchdog like a soft lockup. Check to see if the host
> > +	 * stopped the vm before we process the timestamps.
> > +	 */
> > +	kvm_check_and_clear_guest_paused();
> > +
> [..]
> >@@ -401,14 +405,6 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
> > 	 */
> > 	duration = is_softlockup(touch_ts, period_ts);
> > 	if (unlikely(duration)) {
> > -		/*
> > -		 * If a virtual machine is stopped by the host it can look to
> > -		 * the watchdog like a soft lockup, check to see if the host
> > -		 * stopped the vm before we issue the warning
> > -		 */
> > -		if (kvm_check_and_clear_guest_paused())
> > -			return HRTIMER_RESTART;
> 
> This looks racy to me. I believe kvm_check_and_clear_guest_paused()
> was in the right place.
> 
> VCPU can be scheduled out/preepmpted any time at any point; and then
> guest VM (or even the entire system) can be suspended. When we resume
> the VM we continue from where we were preempted (from VCPU POW).
> 
> So what the old code did
> 
> watchdog_timer_fn()
> {
> 	...
> 	<<!!>>
> 
> 	// Suppose we are suspended here. When we are getting resumed
> 	// jiffies jump forward, which may look like a soft lockup.
> 	duration = is_softlockup(touch_ts, period_ts);
> 	if (unlikely(duration)) {
> 		// And this is where kvm_check_and_clear_guest_paused()
> 		// jumps in. We know already that jiffies have jumped,
> 		// we don't know if jiffies jumped because the VM was
> 		// suspended. And this is what we figure out here and
> 		// bail out
> 		if (kvm_check_and_clear_guest_paused())
> 			return HRTIMER_RESTART;
> 	}
> }
> 
> The new code does the following
> 
> watchdog_timer_fn()
> {
> 	...
> 	kvm_check_and_clear_guest_paused(); // PVCLOCK_GUEST_STOPPED is not set
> 
> 	<<!!>>
> 
> 	// Suppose the VM got suspended at this point. PVCLOCK_GUEST_STOPPED
> 	// is set, but we don't check it. jiffies will jump and this will look
> 	// like a lockup, but we don't check if jiffies jumped because the VM
> 	// was suspended
> 	duration = is_softlockup(touch_ts, period_ts);
> 	if (unlikely(duration)) {
> 		// report the lockup and perhaps panic the system,
> 		// depending on the configuration
> 	}
> }
> 
> What am I missing?

Great catch! You have a point.

Well, I think that the entire code is racy. touch_ts and period_ts are
set by:

	unsigned long touch_ts = __this_cpu_read(watchdog_touch_ts);
	unsigned long period_ts = __this_cpu_read(watchdog_report_ts);

They are neither volatile not there are any barriers. It means that
period_ts might be re-read in these two checks:

	/* Reset the interval when touched by known problematic code. */
	if (period_ts == SOFTLOCKUP_DELAY_REPORT) {
		update_report_ts();
		return HRTIMER_RESTART;
	}

and

	duration = is_softlockup(touch_ts, period_ts);


where:

static int is_softlockup(unsigned long touch_ts, unsigned long period_ts)
{
	unsigned long now = get_timestamp();

	if ((watchdog_enabled & SOFT_WATCHDOG_ENABLED) && watchdog_thresh){
		/* Warn about unreasonable delays. */
		if (time_after(now, period_ts + get_softlockup_thresh()))
			return now - touch_ts;
	}
	return 0;
}

Now, if the watchdog is touched from NMI. period_ts might be
SOFTLOCKUP_DELAY_REPORT. It is ULONG_MAX.

As a result period_ts + get_softlockup_thresh() would overflow and
we could report softlockup even when there is none.

I probably does not happen because the per-CPU variable is read only
once. And watchdogs are not typically touched from NMI. Except that
show_regs() actually touch the watchdog.

That said. This patch most likely made things worse and should be
reverted. But even better solution would be to remove this race.
I mean to make the code safe and sane at the same time.

Best Regards,
Petr

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ