Message-ID: <56DBAF6D.5010503@mail.usask.ca>
Date:	Sat, 05 Mar 2016 22:17:49 -0600
From:	Chris Friesen <cbf123@...l.usask.ca>
To:	Frederic Weisbecker <fweisbec@...il.com>,
	Thomas Gleixner <tglx@...utronix.de>
CC:	John Stultz <john.stultz@...aro.org>,
	Daniel Lezcano <daniel.lezcano@...aro.org>,
	lkml <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...nel.org>, Rik van Riel <riel@...hat.com>
Subject: Re: [PATCH] steal_account_process_tick() should return jiffies

On 03/05/2016 07:19 AM, Frederic Weisbecker wrote:
> On Sat, Mar 05, 2016 at 11:27:01AM +0100, Thomas Gleixner wrote:
>> Chris,
>>
>> On Fri, 4 Mar 2016, Chris Friesen wrote:
>>
>> First of all the subject line should contain a subsystem prefix,
>> i.e. "sched/cputime:"
>>
>>> The callers of steal_account_process_tick() expect it to return whether
>>> the last jiffy was stolen or not.
>>>
>>> Currently the return value of steal_account_process_tick() is in units
>>> of cputime, which vary between either jiffies or nsecs depending on
>>> CONFIG_VIRT_CPU_ACCOUNTING_GEN.
>>
>> Sure, but what is the actual problem? The return value is boolean and tells
>> whether there was stolen time accounted or not.
>
> Indeed the changelog should better explain the problem. So I think the issue is that
> if the cputime has nsecs granularity and we have a tiny stolen time to account (let's say
> a few nanosecs, in fact anything that is below a jiffy), we are not going to account the
> tick on user/system.

Yes, this is exactly it.  Because of this, if CONFIG_VIRT_CPU_ACCOUNTING_GEN is
enabled in a guest then the idle/system/user stats in /proc/stat can show odd
values, and "top" shows nothing for user/system even if CPU hogs are running.

> But the fix doesn't look right to me because we are still accounting the steal time
> if it is lower than a jiffy and that steal time will never be subtracted from user/system
> time if it never reaches a jiffy.
>
> Instead the fix should accumulate the steal time and account it only once it's worth
> a jiffy and then subtract it from system/user time accordingly.

Yes, on reflection you are correct, and the patch looks pretty close, except
that account_steal_time() is still expecting units of cputime.  I'll send a
followup patch.

> Something like that:
>
> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index b2ab2ff..d38e25f 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -262,7 +262,7 @@ static __always_inline bool steal_account_process_tick(void)
>   #ifdef CONFIG_PARAVIRT
>   	if (static_key_false(&paravirt_steal_enabled)) {
>   		u64 steal;
> -		cputime_t steal_ct;
> +		unsigned long steal_jiffies;
>
>   		steal = paravirt_steal_clock(smp_processor_id());
>   		steal -= this_rq()->prev_steal_time;
> @@ -272,11 +272,11 @@ static __always_inline bool steal_account_process_tick(void)
>   		 * based on jiffies). Lets cast the result to cputime
>   		 * granularity and account the rest on the next rounds.
>   		 */
> -		steal_ct = nsecs_to_cputime(steal);
> -		this_rq()->prev_steal_time += cputime_to_nsecs(steal_ct);
> +		steal_jiffies = nsecs_to_jiffies(steal);
> +		this_rq()->prev_steal_time += jiffies_to_nsecs(steal_jiffies);
>
>   		account_steal_time(steal_ct);
> -		return steal_ct;
> +		return steal_jiffies;
>   	}
>   #endif
>   	return false;
>
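
For illustration only, here is a standalone userspace sketch of the
accumulate-then-account idea discussed above. It is not the kernel code:
struct steal_state, steal_account_tick_sketch(), tick_was_stolen() and the
4 ms tick length are all hypothetical names and values chosen for the
example. The point it tries to show is that the return value is in whole
jiffies, so a sub-jiffy steal delta returns 0 (the tick still goes to
user/system), while the remainder is carried over because the "previous"
timestamp only advances by the amount actually accounted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tick length: HZ=250 would give 4,000,000 ns per jiffy. */
#define SKETCH_NSEC_PER_JIFFY 4000000ULL

/* Hypothetical stand-in for the per-runqueue steal bookkeeping. */
struct steal_state {
	uint64_t prev_steal_ns;      /* steal clock value already accounted */
	uint64_t accounted_jiffies;  /* whole jiffies charged to steal so far */
};

/*
 * Account steal time in whole jiffies and return how many were accounted.
 * Anything below one jiffy is left pending: prev_steal_ns is advanced only
 * by the accounted part, so the remainder shows up again in the next delta.
 */
static unsigned long steal_account_tick_sketch(struct steal_state *st,
					       uint64_t steal_clock_ns)
{
	uint64_t delta_ns;
	unsigned long steal_jiffies;

	/* The steal clock is assumed to be monotonic in this sketch. */
	assert(steal_clock_ns >= st->prev_steal_ns);

	delta_ns = steal_clock_ns - st->prev_steal_ns;
	steal_jiffies = (unsigned long)(delta_ns / SKETCH_NSEC_PER_JIFFY);

	st->prev_steal_ns += (uint64_t)steal_jiffies * SKETCH_NSEC_PER_JIFFY;
	st->accounted_jiffies += steal_jiffies;

	return steal_jiffies;
}

/* Caller-side view: the tick counts as stolen only if >= 1 jiffy was accounted. */
static bool tick_was_stolen(struct steal_state *st, uint64_t steal_clock_ns)
{
	return steal_account_tick_sketch(st, steal_clock_ns) != 0;
}
```

With these assumed values, a steal clock reading of 1.5 ms after the last
accounting point is not treated as a stolen tick, whereas a nanosecond-based
boolean return would have reported "stolen" for any nonzero delta.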