lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Thu, 24 Sep 2020 23:19:53 +0200
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Tom Hromatka <tom.hromatka@...cle.com>, tom.hromatka@...cle.com,
        linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        fweisbec@...il.com, mingo@...nel.org, adobriyan@...il.com
Subject: Re: [PATCH v2 2/2] /proc/stat: Simplify iowait and idle calculations when cpu is offline

On Tue, Sep 15 2020 at 13:36, Tom Hromatka wrote:
> Prior to this commit, the cpu idle and iowait data in /proc/stat used
> different data sources based upon whether the CPU was online or not.
> This would cause spikes in the cpu idle and iowait data.

This would not cause spikes. It _causes_ these times to go backwards and
start over from 0. That's something completely different than a spike.

Please describe problems precisely.

> This patch uses the same data source, get_cpu_{idle,iowait}_time_us(),
> whether the CPU is online or not.
>
> This patch and the preceding patch, "tick-sched: Do not clear the
> iowait and idle times", ensure that the cpu idle and iowait data
> are always increasing.

So now you have a mixture of 'This commit and this patch'. Oh well.

Aside of that the ordering of your changelog is backwards. Something
like this:

   The CPU idle and iowait times in /proc/stats are inconsistent accross
   CPU hotplug.

   The reason is that for NOHZ active systems the core accounting of CPU
   idle and iowait times used to be reset when a CPU was unplugged. The
   /proc/stat code tries to work around that by using the corresponding
   member of kernel_cpustat when the CPU is offline.

   This works as long as the CPU stays offline, but when it is onlined
   again then the accounting is taken from the NOHZ core data again
   which started over from 0 causing both times to go backwards.

   The HOHZ core has been fixed to preserve idle and iowait times
   accross CPU unplug, so the broken workaround is not longer required.

Hmm?

But...

> --- a/fs/proc/stat.c
> +++ b/fs/proc/stat.c
> @@ -47,34 +47,12 @@ static u64 get_iowait_time(struct kernel_cpustat *kcs, int cpu)
>  
>  static u64 get_idle_time(struct kernel_cpustat *kcs, int cpu)
>  {
> -	u64 idle, idle_usecs = -1ULL;
> -
> -	if (cpu_online(cpu))
> -		idle_usecs = get_cpu_idle_time_us(cpu, NULL);
> -
> -	if (idle_usecs == -1ULL)
> -		/* !NO_HZ or cpu offline so we can rely on cpustat.idle */
> -		idle = kcs->cpustat[CPUTIME_IDLE];
> -	else
> -		idle = idle_usecs * NSEC_PER_USEC;
> -
> -	return idle;
> +	return get_cpu_idle_time_us(cpu, NULL) * NSEC_PER_USEC;

Q: How is this supposed to work on !NO_HZ systems or in case that NOHZ
   has been disabled at boot time via command line option or lack of
   hardware?

A: Not at all.

Hint #1: You removed the following comment:

	/* !NO_HZ or cpu offline so we can rely on cpustat.idle */

Hint #2: There is more than one valid kernel configuration.
'
Hint #3: Command line options and hardware features have side effects

Hint #4: git grep 'get_cpu_.*_time_us' 

Thanks,

        tglx

Powered by blists - more mailing lists