Message-ID: <20080127165400.GB1044@linux.vnet.ibm.com>
Date: Sun, 27 Jan 2008 22:24:00 +0530
From: Srivatsa Vaddagiri <vatsa@...ux.vnet.ibm.com>
To: Toralf.Förster <toralf.foerster@....de>@snowy.in.ibm.com
Cc: Tomasz Chmielewski <mangoo@...g.org>, linux-kernel@...r.kernel.org,
Ingo Molnar <mingo@...e.hu>, a.p.zijlstra@...llo.nl,
dhaval@...ux.vnet.ibm.com
Subject: Re: (ondemand) CPU governor regression between 2.6.23 and 2.6.24
On Sun, Jan 27, 2008 at 04:06:17PM +0100, Toralf Förster wrote:
> > The third line (giving overall cpu usage stats) is what is interesting here.
> > If you have more than one cpu, you can get cpu usage stats for each cpu
> > in top by pressing 1. Can you provide this information with and w/o
> > CONFIG_FAIR_GROUP_SCHED?
>
> This is what I get if I set CONFIG_FAIR_GROUP_SCHED to "y"
>
> top - 16:00:59 up 2 min, 1 user, load average: 2.56, 1.60, 0.65
> Tasks: 84 total, 3 running, 81 sleeping, 0 stopped, 0 zombie
> Cpu(s): 49.7%us, 0.3%sy, 49.7%ni, 0.0%id, 0.0%wa, 0.3%hi, 0.0%si, 0.0%st
> Mem: 1036180k total, 322876k used, 713304k free, 13164k buffers
> Swap: 997880k total, 0k used, 997880k free, 149208k cached
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 6070 dnetc 39 19 664 348 264 R 49.7 0.0 1:09.71 dnetc
> 6676 tfoerste 20 0 1796 488 428 R 49.3 0.0 0:02.72 factor
>
> Stopping dnetc gives:
>
> top - 16:02:36 up 4 min, 1 user, load average: 2.50, 1.87, 0.83
> Tasks: 89 total, 3 running, 86 sleeping, 0 stopped, 0 zombie
> Cpu(s): 99.3%us, 0.7%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Mem: 1036180k total, 378760k used, 657420k free, 14736k buffers
> Swap: 997880k total, 0k used, 997880k free, 180868k cached
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 6766 tfoerste 20 0 1796 488 428 R 84.9 0.0 0:05.41 factor
Thanks for this response. This confirms that the cpu's idle time is close
to zero, which is what I wanted to verify.
> > If I am not mistaken, cpu ondemand gov goes by the cpu idle time stats,
> > which should not be affected by FAIR_GROUP_SCHED. I will look around for
> > other possible causes.
On further examination, the ondemand governor seems to have a tunable to
ignore nice load. In your case, I see that dnetc is running at a
positive nice value (19), which could explain why the ondemand gov thinks
that the cpu is only ~50% loaded.
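To make the arithmetic concrete, here is a rough sketch using the
figures from the first top output above (0.0% idle, 49.7% nice). It is
a simplification, not the governor's actual code: the helper name
ondemand_load is made up for illustration, and it assumes that nice
time is simply counted as idle time when ignore_nice_load is 1.

```shell
# Hypothetical helper, not kernel code: estimate the load (in percent)
# that the governor would see.
# Arguments: idle% nice% ignore_nice_load (0 or 1).
# Assumption: with ignore_nice_load=1, nice time is counted as idle time.
ondemand_load() {
    awk -v idle="$1" -v nice="$2" -v ignore="$3" 'BEGIN {
        if (ignore == 1) idle += nice
        printf "%.1f\n", 100 - idle
    }'
}

ondemand_load 0.0 49.7 1   # nice load ignored: prints 50.3
ondemand_load 0.0 49.7 0   # nice load counted: prints 100.0
```

Under that assumption the governor would see a cpu that is about half
idle and might not raise the frequency, even though top reports 0.0%id.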
Can you check how this knob is set in your case?
# cat /sys/devices/system/cpu/cpu0/cpufreq/ondemand/ignore_nice_load
You can set that to 0 to ask ondemand gov to take nice load into
account while calculating cpu freq changes:
# echo 0 > /sys/devices/system/cpu/cpu0/cpufreq/ondemand/ignore_nice_load
This should restore the behavior of the ondemand governor as seen in 2.6.23
in your case (even with CONFIG_FAIR_GROUP_SCHED enabled). Can you please
confirm whether that happens?
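If you have more than one cpu, the knob may need checking on each of
them. A minimal sketch, assuming every cpu exposes the knob at the same
relative path as cpu0 above; the function name nice_knob and the
SYSFS_CPU override (handy for a dry run against a scratch directory)
are hypothetical:

```shell
# Sketch only: print, or set and then print, ignore_nice_load for every
# cpu found under a sysfs root.
# Assumption: each cpu has the knob at cpuN/cpufreq/ondemand/ignore_nice_load.
# SYSFS_CPU (hypothetical) overrides the root directory for dry runs.
nice_knob() {
    root="${SYSFS_CPU:-/sys/devices/system/cpu}"
    for knob in "$root"/cpu[0-9]*/cpufreq/ondemand/ignore_nice_load; do
        [ -e "$knob" ] || continue      # skip cpus without the knob
        if [ $# -ge 1 ]; then
            echo "$1" > "$knob"         # needs root on a real system
        fi
        echo "$knob: $(cat "$knob")"
    done
}
```

Calling nice_knob with no argument only lists the current values;
nice_knob 0 writes 0 to each knob it finds and then lists them.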
> As I stated in http://lkml.org/lkml/2008/1/26/207 the issue is solved
> after unselecting FAIR_GROUP_SCHED.
I understand, but we want to keep CONFIG_FAIR_GROUP_SCHED enabled by
default.
Ingo,
Most folks seem to be used to a global nice-domain, where a nice-19
task gives up cpu in competition with a nice-0 task (irrespective of which
user IDs they belong to). CONFIG_FAIR_USER_SCHED brings noticeable changes wrt
that. We could possibly let it be as it is (since that is what a server
admin may possibly want when managing university servers) or modify it to be
aware of nice-level (priority of a user-sched entity is equivalent to that of
the highest prio task it has).
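The difference between the two models can be put in rough numbers. The
sketch below assumes the CFS load weights of 1024 for nice 0 and 15 for
nice 19, and an equal weight for each of the two users under per-user
group scheduling; the helper name cpu_share is hypothetical.

```shell
# Hypothetical helper: percentage of one cpu that an entity of weight $1
# gets when competing with a single entity of weight $2.
cpu_share() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", 100 * a / (a + b) }'
}

# Global nice-domain: factor (nice 0, weight 1024) vs dnetc (nice 19, weight 15)
cpu_share 1024 15      # prints 98.6
# Per-user fairness: both users weighted equally, so nice level only
# matters among tasks of the same user
cpu_share 1024 1024    # prints 50.0
```

The second figure is in line with the roughly 50/50 split between dnetc
and factor in the top output quoted above.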
In any case, I will send across a patch to turn off CONFIG_FAIR_USER_SCHED by
default (and instead turn on CONFIG_FAIR_CGROUP_SCHED by default).
--
Regards,
vatsa