Message-ID: <bc2f8d97-9462-125f-9fa2-49044c244479@gmail.com>
Date: Wed, 3 Jul 2019 17:09:07 +0100
From: Alan Jenkins <alan.christopher.jenkins@...il.com>
To: Doug Smythies <dsmythies@...us.net>
Cc: linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: NO_HZ_IDLE causes consistently low cpu "iowait" time (and higher
cpu "idle" time)
On 03/07/2019 15:06, Doug Smythies wrote:
> On 2019.07.01 08:34 Alan Jenkins wrote:
>
>> Hi
> Hi,
>
>> I tried running a simple test:
>>
>> dd if=testfile iflag=direct bs=1M of=/dev/null
>>
>> With my default settings, `vmstat 10` shows something like 85% idle
>> time to 15% iowait time. I have 4 CPUs, so one CPU's worth of iowait
>> would be 25%; this is much less than that.
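>>
>> (The whole test was roughly this - a sketch, where "testfile" is just
>> any large file:
>>
>>     dd if=testfile iflag=direct bs=1M of=/dev/null &
>>     vmstat 10    # watch the "id" (idle) and "wa" (iowait) columns
>> )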
>>
>> If I boot with "nohz=off", I see idle time fall to 75% or below, and
>> iowait rise to about 25%, equivalent to one CPU. That is what I had
>> originally expected.
>>
>> (I can also see my expected numbers, if I disable *all* C-states and
>> force polling using `pm_qos_resume_latency_us` in sysfs).
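>>
>> (Roughly like this, assuming the usual per-CPU sysfs layout - an
>> untested sketch:
>>
>>     # "n/a" means no resume latency is tolerated at all, so the
>>     # governor is left with only the polling state
>>     for cpu in /sys/devices/system/cpu/cpu[0-9]*; do
>>         echo n/a > "$cpu/power/pm_qos_resume_latency_us"
>>     done
>> )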
>>
>> The numbers above are from a kernel somewhere around v5.2-rc5. I saw
>> the "wrong" results on some previous kernels as well. I just now
>> realized the link to NO_HZ_IDLE.[1]
>>
>> [1]
>> https://unix.stackexchange.com/questions/517757/my-basic-assumption-about-system-iowait-does-not-hold/527836#527836
>>
>> I did not find any information about this high level of inaccuracy. Can
>> anyone explain whether this behaviour is expected?
> I'm not commenting on whether the behaviour is expected or not, just
> that it is inconsistent.
>
>> I found several patches that mentioned "iowait" and NO_HZ_IDLE. But if
>> they described this problem, it was not clear to me.
>>
>> I thought this might also be affecting the "IO pressure" values from the
>> new "pressure stall information"... but I am too confused already, so I
>> am only asking about iowait at the moment :-).
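>>
>> (By "IO pressure" I mean /proc/pressure/io, which looks something
>> like:
>>
>>     some avg10=0.00 avg60=0.00 avg300=0.00 total=0
>>     full avg10=0.00 avg60=0.00 avg300=0.00 total=0
>> )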
> Using your workload, I confirm inconsistent behaviour for /proc/stat
> (which vmstat uses) between kernels 4.15, 4.16, and 4.17:
> 4.15 does what you expect, whether idle states are enabled or disabled.
> 4.16 doesn't do what you expect either way (although it's a little erratic).
> >= 4.17 does what you expect with only idle state 0 enabled, and doesn't otherwise.
> Actual test data from vmstat (/proc/stat) (8 CPUs, so 12.5% = 1 CPU):
>
> Kernel   idle/iowait %   Idle states >= 1
> 4.15     88/12           enabled
> 4.15     88/12           disabled
> 4.16     99/1            enabled
> 4.16     99/1            disabled
> 4.17     98/2            enabled
> 4.17     88/12           disabled
>
> Note 1: I never booted with "nohz=off", because the tick never turns
> off in idle state 0 anyway, which is good enough for testing.
>
> Note 2: Myself, I don't use /proc/stat for idle time statistics. I use:
> /sys/devices/system/cpu/cpu*/cpuidle/state*/time
> And those always seem to agree with the higher idle percentage number.
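>
> For example, something like this one-liner (a rough sketch, not my
> actual scripts):
>
>     # total idle residency in seconds, summed over all CPUs and states
>     cat /sys/devices/system/cpu/cpu*/cpuidle/state*/time |
>         awk '{ t += $1 } END { printf "%.1f seconds idle\n", t/1e6 }'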
>
> Unless someone has some insight, the next step is kernel bisection:
> once between kernels 4.15 and 4.16, then again between 4.16 and 4.17.
> The second bisection might go faster with knowledge gained from the first.
> Alan: can you do the kernel bisection? I can't start on it until maybe Friday.
>
> ... Doug
Thanks for your help, Doug!

I wish I had a faster CPU :-), but I'm familiar with bisection. I have
started, and I'm down to about 8-minute builds, so I can probably be
done before Friday.
Alan