Message-ID: <1287672271.3488.139.camel@twins>
Date: Thu, 21 Oct 2010 16:44:31 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Venkatesh Pallipadi <venki@...gle.com>
Cc: Ingo Molnar <mingo@...e.hu>, "H. Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>,
Balbir Singh <balbir@...ux.vnet.ibm.com>,
Martin Schwidefsky <schwidefsky@...ibm.com>,
linux-kernel@...r.kernel.org, Paul Turner <pjt@...gle.com>,
Eric Dumazet <eric.dumazet@...il.com>,
Shaun Ruffell <sruffell@...ium.com>,
Yong Zhang <yong.zhang0@...il.com>
Subject: Re: [PATCH 4/6] Export ns irqtimes from IRQ_TIME_ACCOUNTING
through /proc/stat
On Wed, 2010-10-20 at 15:49 -0700, Venkatesh Pallipadi wrote:
> +static int irqtime_account_hi_update(void)
> +{
> + struct cpu_usage_stat *cpustat = &kstat_this_cpu.cpustat;
> + unsigned long flags;
> + u64 latest_ns;
> + int ret = 0;
> +
> + local_irq_save(flags);
> + latest_ns = __get_cpu_var(cpu_hardirq_time);
I guess this_cpu_read() would again be an improvement; the same goes
for the SI version.
> + if (cputime64_gt(nsecs_to_cputime64(latest_ns), cpustat->irq))
> + ret = 1;
> + local_irq_restore(flags);
> + return ret;
> +}
> +#ifdef CONFIG_IRQ_TIME_ACCOUNTING
> +/*
> + * Account a tick to a process and cpustat
> + * @p: the process that the cpu time gets accounted to
> + * @user_tick: is the tick from userspace
> + * @rq: the pointer to rq
> + *
> + * Tick demultiplexing follows the order
> + * - pending hardirq update
> + * - user_time
> + * - pending softirq update
> + * - idle_time
> + * - system time
> + * - check for guest_time
> + * - else account as system_time
> + *
> + * Check for hardirq is done both for system and user time as there is
> + * no timer going off while we are on hardirq and hence we may never get an
> + * oppurtunity to update it solely in system time.
My mailer suggests you spell that as: opportunity :-)
> + * p->stime and friends are only updated on system time and not on irq
> + * softirq as those do not count in task exec_runtime any more.
> + */
> +static void irqtime_account_process_tick(struct task_struct *p, int user_tick,
> + struct rq *rq)
> +{
> + cputime_t one_jiffy_scaled = cputime_to_scaled(cputime_one_jiffy);
> + cputime64_t tmp = cputime_to_cputime64(cputime_one_jiffy);
> + struct cpu_usage_stat *cpustat = &kstat_this_cpu.cpustat;
> +
> + if (irqtime_account_hi_update()) {
> + cpustat->irq = cputime64_add(cpustat->irq, tmp);
> + } else if (user_tick) {
> + account_user_time(p, cputime_one_jiffy, one_jiffy_scaled);
> + } else if (irqtime_account_si_update()) {
> + cpustat->softirq = cputime64_add(cpustat->softirq, tmp);
> + } else if (p == rq->idle) {
> + account_idle_time(cputime_one_jiffy);
> + } else if (p->flags & PF_VCPU) { /* System time or guest time */
> + account_guest_time(p, cputime_one_jiffy, one_jiffy_scaled);
> + } else {
> + __account_system_time(p, cputime_one_jiffy, one_jiffy_scaled,
> + &cpustat->system);
> + }
> +}
I'd do:
 - hardirq
 - softirq
 - user
 - system
   - guest
   - really system
 - idle
Since otherwise tiny slices of softirq would need to wait for a system
tick to happen before you fold them.
Also, it is possible that in a single tick multiple counters overflow
the jiffy boundary, so something like:
	if (irqtime_account_hi_update())
		cpustat->irq = ...
	if (irqtime_account_si_update())
		cpustat->softirq = ...

	if (user_tick) {
	} else if (...) {
	} else ...
would seem like the better approach.
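
To make the shape of that suggestion concrete, here is a small userspace
model of it. This is only a sketch and not kernel code: struct tick_stats,
pending(), account_tick(), enum task_kind and the 1 ms NS_PER_TICK are all
made up for illustration, and the counters are plain tick counts rather
than cputime64_t. It follows one literal reading of the outline above
(both irq checks run unconditionally, then the user/system/idle chain);
whether a tick that folded irq time should also be charged to
user/system/idle is not settled by the outline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed tick length of 1 ms, chosen only for this illustration. */
#define NS_PER_TICK 1000000ULL

/* Hypothetical stand-in for the per-cpu state used by the patch. */
struct tick_stats {
	uint64_t hardirq_ns;	/* fine-grained time, like cpu_hardirq_time */
	uint64_t softirq_ns;	/* fine-grained time, like cpu_softirq_time */
	uint64_t irq, softirq, user, guest, system, idle; /* in ticks */
};

/* Hypothetical task classification replacing p == rq->idle and PF_VCPU. */
enum task_kind { TASK_NORMAL, TASK_VCPU, TASK_IDLE };

/* True when the ns counter has run ahead of the already folded ticks. */
static bool pending(uint64_t ns, uint64_t ticks)
{
	return ns / NS_PER_TICK > ticks;
}

static void account_tick(struct tick_stats *s, bool user_tick,
			 enum task_kind kind)
{
	assert(s);

	/* Independent checks: both counters may cross a tick boundary. */
	if (pending(s->hardirq_ns, s->irq))
		s->irq++;
	if (pending(s->softirq_ns, s->softirq))
		s->softirq++;

	/* Order from the list above: user, system (guest first), idle. */
	if (user_tick)
		s->user++;
	else if (kind == TASK_VCPU)
		s->guest++;
	else if (kind != TASK_IDLE)
		s->system++;
	else
		s->idle++;
}
```

With hardirq_ns at two ticks' worth and softirq_ns at one, the first call
folds one tick into each of irq and softirq, and the second call folds the
remaining irq tick, so softirq no longer has to wait behind hardirq in a
single else-if chain.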
> /*
> * Account for involuntary wait time.
> * @steal: the cpu time spent in involuntary wait
> @@ -3594,6 +3685,11 @@ void account_process_tick(struct task_struct *p, int user_tick)
> cputime_t one_jiffy_scaled = cputime_to_scaled(cputime_one_jiffy);
> struct rq *rq = this_rq();
>
> + if (sched_clock_irqtime) {
> + irqtime_account_process_tick(p, user_tick, rq);
> + return;
> + }
> +
> if (user_tick)
> account_user_time(p, cputime_one_jiffy, one_jiffy_scaled);
> else if ((p != rq->idle) || (irq_count() != HARDIRQ_OFFSET))
mark_tsc_unstable() can disable sched_clock_irqtime at any time; the
accounting won't go funny because of that, right?