Message-ID: <483AF25B.6090806@linux.vnet.ibm.com>
Date: Mon, 26 May 2008 22:54:43 +0530
From: Balbir Singh <balbir@...ux.vnet.ibm.com>
To: Arjan van de Ven <arjan@...radead.org>
CC: Vaidyanathan Srinivasan <svaidy@...ux.vnet.ibm.com>,
Linux Kernel <linux-kernel@...r.kernel.org>,
venkatesh.pallipadi@...el.com, suresh.b.siddha@...el.com,
Michael Neuling <mikey@...ling.org>,
"Amit K. Arora" <aarora@...ux.vnet.ibm.com>
Subject: Re: [RFC PATCH v1 0/3] Scaled statistics using APERF/MPERF in x86
Arjan van de Ven wrote:
> On Mon, 26 May 2008 20:01:33 +0530
> Vaidyanathan Srinivasan <svaidy@...ux.vnet.ibm.com> wrote:
>
>> The following RFC patch tries to implement scaled CPU utilisation
>> statistics using APERF and MPERF MSR registers in an x86 platform.
>>
>> The CPU capacity is significantly changed when the CPU's frequency is
>> reduced for the purpose of power savings. The applications that run
>> at such lower CPU frequencies are also accounted for real CPU time by
>> default. Had the applications run at full CPU frequency, they would
>> have finished the work faster and not been charged for excessive
>> CPU time.
>>
>> One solution to this problem is to scale the utime and stime
>> entitlement for the process as per the current CPU frequency. This
>> technique is used in powerpc architecture with the help of hardware
>> registers that accurately capture the entitlement.
>>
>
> there are some issues with this unfortunately, and these make it
> a very complex thing to do.
> Just to mention a few:
> 1) What if the BIOS no longer allows us to go to the max frequency for
> a period (for example as a result of overheating); with the approach
> above, the admin would THINK he can go faster, but he cannot in reality,
> so there's misleading information (the system looks half busy, while in
> reality it's actually the opposite, it's overloaded). Management tools
> will take the wrong decisions (such as moving MORE work to the box, not
> less)
> 2) On systems with Intel Dynamic Acceleration technology, you can get
> over 100% of cycles this way. (For those who don't know what IDA is;
> IDA is basically a case where if your Penryn based dual core laptop is
> only using 1 core, the other core can go faster than 100% as long as
> thermals etc allow it). How do you want to deal with this?
Arjan,
These problems exist anyway, irrespective of scaled accounting (I'd say that
they are exceptions).
1. The management tool has access to the current frequency and the maximum
frequency, irrespective of scaled accounting. It can still base its decision
on that data, which is already available today.
2. With IDA, we'd have to document that the APERF/MPERF ratio can be greater
than 100% if the system is overclocked.
Scaled accounting only intends to expose data that is already available.
Interpretation is left to management tools, and we'll document the corner
cases that you just mentioned.
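To make the corner cases concrete, here is a minimal sketch of the scaling
arithmetic, assuming the scaled time is simply the charged time multiplied by
the APERF/MPERF ratio measured over the same interval. The function name and
parameters are hypothetical and chosen for illustration; this is not the code
from the patch.

```python
def scaled_time_ns(delta_time_ns, delta_aperf, delta_mperf):
    """Scale a CPU-time delta by the APERF/MPERF ratio for the same interval.

    delta_aperf: cycles counted at the actual frequency (APERF delta).
    delta_mperf: cycles counted at the fixed reference frequency (MPERF delta).
    All names here are illustrative assumptions, not taken from the patch.
    """
    if delta_mperf <= 0:
        # No reference cycles were counted, so no ratio can be formed;
        # fall back to the unscaled time.
        return delta_time_ns
    # Integer arithmetic: multiply first, then divide, to limit rounding loss.
    return delta_time_ns * delta_aperf // delta_mperf


# A task charged 10 ms while the CPU ran at half the reference frequency.
half_speed = scaled_time_ns(10_000_000, 500, 1000)

# The same 10 ms with the core running 10% above the reference frequency,
# as could happen with IDA; the scaled value exceeds the charged time.
boosted = scaled_time_ns(10_000_000, 1100, 1000)
```

In the first case the scaled value is half the charged time, and in the second
it is larger than the charged time, which is the "over 100%" case that would
need to be documented. A thermally throttled system would still report an
honest ratio for the frequency it actually ran at; whether the maximum
frequency is reachable remains a separate question for the management tool.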
--
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL