lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 5 Jan 2015 14:10:13 -0800
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Cyril Bur <cyrilbur@...il.com>
Cc:	linux-kernel@...r.kernel.org, mpe@...erman.id.au,
	drjones@...hat.com, dzickus@...hat.com, mingo@...nel.org,
	uobergfe@...hat.com, chaiw.fnst@...fujitsu.com, cl@...u.com,
	fabf@...net.be, atomlin@...hat.com, benzh@...omium.org
Subject: Re: [PATCH 2/2] powerpc: add running_clock for powerpc to prevent
 spurious softlockup warnings

On Mon, 22 Dec 2014 16:06:04 +1100 Cyril Bur <cyrilbur@...il.com> wrote:

> On POWER8 virtualised kernels the VTB register can be read to have a view of
> time that only increases while the guest is running. This will prevent guests
> from seeing time jump if a guest is paused for significant amounts of time.
> 
> On POWER7 and below virtualised kernels stolen time is subtracted from
> sched_clock as a best effort approximation. This will not eliminate spurious
> warnings in the case of a suspended guest but may reduce the occurance in the
> case of softlockups due to host over commit.
> 
> Bare metal kernels should avoid reading the VTB as KVM does not restore sane
> values when not executing. sched_clock is returned in this case.
> 
> --- a/arch/powerpc/kernel/time.c
> +++ b/arch/powerpc/kernel/time.c
> @@ -621,6 +621,30 @@ unsigned long long sched_clock(void)
>  	return mulhdu(get_tb() - boot_tb, tb_to_ns_scale) << tb_to_ns_shift;
>  }
>  
> +unsigned long long running_clock(void)

Non-kvm kernels don't need this code.  Is there some appropriate
"#ifdef CONFIG_foo" we can wrap this in?


> +{
> +	/*
> +	 * Don't read the VTB as a host since KVM does not switch in host timebase
> +	 * into the VTB when it takes a guest off the CPU, reading the VTB would
> +	 * result in reading 'last switched out' guest VTB.
> +	 */
> +
> +	if (firmware_has_feature(FW_FEATURE_LPAR)) {
> +		if (cpu_has_feature(CPU_FTR_ARCH_207S))
> +			return mulhdu(get_vtb() - boot_tb, tb_to_ns_scale) << tb_to_ns_shift;
> +
> +		/* This is a next best approximation without a VTB. */
> +		return sched_clock() - cputime_to_nsecs(kcpustat_this_cpu->cpustat[CPUTIME_STEAL]);

Why is this result dependent on FW_FEATURE_LPAR?  It's all generic code.

In fact the kernel/sched/clock.c default implementation of
running_clock() could use this expression.  Would that be good or bad? :)

> +	}
> +
> +	/*
> +	 * On a host which doesn't do any virtualisation TB *should* equal VTB so
> +	 * it makes no difference anyway.
> +	 */
> +
> +	return sched_clock();
> +}

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ