Date:   Wed, 3 Jul 2019 07:06:38 -0700
From:   "Doug Smythies" <dsmythies@...us.net>
To:     "'Alan Jenkins'" <alan.christopher.jenkins@...il.com>
Cc:     <linux-pm@...r.kernel.org>, <linux-kernel@...r.kernel.org>
Subject: RE: NO_HZ_IDLE causes consistently low cpu "iowait" time (and higher cpu "idle" time)

On 2019.07.01 08:34 Alan Jenkins wrote:

> Hi

Hi,

> I tried running a simple test:
>
>     dd if=testfile iflag=direct bs=1M of=/dev/null
>
> With my default settings, `vmstat 10` shows something like 85% idle time 
> to 15% iowait time. I have 4 CPUs, so this is much less than one CPU 
> worth of iowait time.
>
> If I boot with "nohz=off", I see idle time fall to 75% or below, and 
> iowait rise to about 25%, equivalent to one CPU.  That is what I had 
> originally expected.
>
> (I can also see my expected numbers, if I disable *all* C-states and 
> force polling using `pm_qos_resume_latency_us` in sysfs).
>
> The numbers above are from a kernel somewhere around v5.2-rc5.  I saw 
> the "wrong" results on some previous kernels as well.  I just now 
> realized the link to NO_HZ_IDLE.[1]
>
> [1] 
> https://unix.stackexchange.com/questions/517757/my-basic-assumption-about-system-iowait-does-not-hold/527836#527836
>
> I did not find any information about this high level of inaccuracy. Can 
> anyone explain, is this behaviour expected?

I'm not commenting on whether this behaviour is expected or not, just
noting that it is inconsistent.

>
> I found several patches that mentioned "iowait" and NO_HZ_IDLE. But if 
> they described this problem, it was not clear to me.
>
> I thought this might also be affecting the "IO pressure" values from the 
> new "pressure stall information"... but I am too confused already, so I 
> am only asking about iowait at the moment :-).

Using your workload, I confirm inconsistent behaviour for /proc/stat
(which vmstat uses) between kernels 4.15, 4.16, and 4.17:
4.15 does what you expect, whether idle states are enabled or disabled.
4.16 doesn't do what you expect either way (although it is a little erratic).
>= 4.17 does what you expect with only idle state 0 enabled, and doesn't otherwise.

Actual test data from vmstat (/proc/stat) (8 CPUs, 12.5% = 1 CPU):
Kernel   idle/iowait %   Idle states >= 1
4.15     88/12           enabled
4.15     88/12           disabled
4.16     99/1            enabled
4.16     99/1            disabled
4.17     98/2            enabled
4.17     88/12           disabled
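
(The idle and iowait columns above come from the aggregate "cpu" line of
/proc/stat: idle is the 4th value and iowait the 5th value after the "cpu"
label, in USER_HZ ticks. Just as an illustration, not the exact script I
used, the percentages over an interval can be sampled roughly like this:)

    # sample the aggregate "cpu" line twice, 10 seconds apart
    a=($(head -1 /proc/stat)); sleep 10; b=($(head -1 /proc/stat))
    # total ticks elapsed over the interval, summed across all fields
    tot=0
    for i in $(seq 1 $((${#a[@]} - 1))); do tot=$((tot + b[i] - a[i])); done
    echo "idle:   $(( 100 * (b[4] - a[4]) / tot ))%"
    echo "iowait: $(( 100 * (b[5] - a[5]) / tot ))%"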

Note 1: I never booted with "nohz=off": the tick never turns off while in
idle state 0, so restricting to idle state 0 is good enough for testing.
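
(For the "disabled" rows in the table, one way to turn off every idle state
deeper than state 0 is via the per-state sysfs "disable" files, roughly as
sketched below; Alan's per-CPU pm_qos_resume_latency_us limit is another way
to get a similar effect.)

    # sketch: disable every cpuidle state except state0 (POLL) on all CPUs
    # (as root; write 0 instead of 1 to re-enable)
    for f in /sys/devices/system/cpu/cpu*/cpuidle/state[1-9]*/disable; do
        echo 1 > "$f"
    done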

Note 2: I don't use /proc/stat for idle time statistics myself. I use:
/sys/devices/system/cpu/cpu*/cpuidle/state*/time
and those always seem to be consistent with the higher idle percentage number.
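
For example, something along these lines prints the per-CPU idle residency
from those counters (microseconds since boot; take two samples and divide by
the elapsed wall-clock time to get a percentage):

    # sketch: total idle residency in microseconds, summed over all idle states
    for c in /sys/devices/system/cpu/cpu[0-9]*; do
        printf '%s: ' "${c##*/}"
        awk '{ sum += $1 } END { print sum }' "$c"/cpuidle/state*/time
    done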

Unless someone has some insight, the next step is kernel bisection,
once between kernels 4.15 and 4.16, then again between 4.16 and 4.17.
The second bisection might go faster with knowledge gained from the first.
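
Roughly, for the first one (just a sketch; the dd + vmstat run above serves
as the good/bad test at each step):

    git bisect start
    git bisect bad v4.16      # iowait too low
    git bisect good v4.15     # iowait as expected
    # build and boot the candidate, run the dd test, check vmstat, then mark:
    git bisect good           # or: git bisect bad
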
Alan: Can you do kernel bisection? I can only do it starting maybe Friday.

... Doug

