linux-kernel - Re: [PATCH 4/6] Export ns irqtimes from IRQ_TIME

Message-ID: <1287672271.3488.139.camel@twins>
Date:	Thu, 21 Oct 2010 16:44:31 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Venkatesh Pallipadi <venki@...gle.com>
Cc:	Ingo Molnar <mingo@...e.hu>, "H. Peter Anvin" <hpa@...or.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Balbir Singh <balbir@...ux.vnet.ibm.com>,
	Martin Schwidefsky <schwidefsky@...ibm.com>,
	linux-kernel@...r.kernel.org, Paul Turner <pjt@...gle.com>,
	Eric Dumazet <eric.dumazet@...il.com>,
	Shaun Ruffell <sruffell@...ium.com>,
	Yong Zhang <yong.zhang0@...il.com>
Subject: Re: [PATCH 4/6] Export ns irqtimes from IRQ_TIME_ACCOUNTING
 through /proc/stat

On Wed, 2010-10-20 at 15:49 -0700, Venkatesh Pallipadi wrote:

> +static int irqtime_account_hi_update(void)
> +{
> +	struct cpu_usage_stat *cpustat = &kstat_this_cpu.cpustat;
> +	unsigned long flags;
> +	u64 latest_ns;
> +	int ret = 0;
> +
> +	local_irq_save(flags);
> +	latest_ns = __get_cpu_var(cpu_hardirq_time);

I guess this_cpu_read() would again be an improvement here; same for the
softirq (SI) version.

> +	if (cputime64_gt(nsecs_to_cputime64(latest_ns), cpustat->irq))
> +		ret = 1;
> +	local_irq_restore(flags);
> +	return ret;
> +}

> +#ifdef CONFIG_IRQ_TIME_ACCOUNTING
> +/*
> + * Account a tick to a process and cpustat
> + * @p: the process that the cpu time gets accounted to
> + * @user_tick: is the tick from userspace
> + * @rq: the pointer to rq
> + *
> + * Tick demultiplexing follows the order
> + * - pending hardirq update
> + * - user_time
> + * - pending softirq update
> + * - idle_time
> + * - system time
> + *   - check for guest_time
> + *   - else account as system_time
> + *
> + * Check for hardirq is done both for system and user time as there is
> + * no timer going off while we are on hardirq and hence we may never get an
> + * oppurtunity to update it solely in system time.

My mailer suggests you spell that as: opportunity :-)

> + * p->stime and friends are only updated on system time and not on irq
> + * softirq as those do not count in task exec_runtime any more.
> + */
> +static void irqtime_account_process_tick(struct task_struct *p, int user_tick,
> +						struct rq *rq)
> +{
> +	cputime_t one_jiffy_scaled = cputime_to_scaled(cputime_one_jiffy);
> +	cputime64_t tmp = cputime_to_cputime64(cputime_one_jiffy);
> +	struct cpu_usage_stat *cpustat = &kstat_this_cpu.cpustat;
> +
> +	if (irqtime_account_hi_update()) {
> +		cpustat->irq = cputime64_add(cpustat->irq, tmp);
> +	} else if (user_tick) {
> +		account_user_time(p, cputime_one_jiffy, one_jiffy_scaled);
> +	} else if (irqtime_account_si_update()) {
> +		cpustat->softirq = cputime64_add(cpustat->softirq, tmp);
> +	} else if (p == rq->idle) {
> +		account_idle_time(cputime_one_jiffy);
> +	} else if (p->flags & PF_VCPU) { /* System time or guest time */
> +		account_guest_time(p, cputime_one_jiffy, one_jiffy_scaled);
> +	} else {
> +		__account_system_time(p, cputime_one_jiffy, one_jiffy_scaled,
> +					&cpustat->system);
> +	}
> +}

I'd do:

  - hardirq
  - softirq
  - user
  - system
     - guest
     - really system
  - idle

Since otherwise tiny slices of softirq would need to wait for a system
tick to happen before you fold them.
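
To make the ordering argument concrete, here is a small user-space model
(not kernel code). It assumes a 1 ms tick and collapses everything that
is not irq, softirq or user time into a single "other" bucket; all names
(NS_PER_TICK, struct counters, pending(), tick_patch_order(),
tick_review_order()) are hypothetical and exist only for this sketch.

```c
#include <stdint.h>

/* Assumption for illustration: one tick is 1 ms. */
#define NS_PER_TICK 1000000ULL

/* Hypothetical counters; every field counts whole ticks. */
struct counters {
	uint64_t irq, softirq, user, other;
};

/* Nonzero if the measured ns cover more whole ticks than were folded so far. */
static int pending(uint64_t measured_ns, uint64_t folded_ticks)
{
	return measured_ns / NS_PER_TICK > folded_ticks;
}

/* Order used in the quoted patch: hardirq, user, softirq, everything else. */
static void tick_patch_order(struct counters *c, uint64_t hi_ns,
			     uint64_t si_ns, int user_tick)
{
	if (pending(hi_ns, c->irq))
		c->irq++;
	else if (user_tick)
		c->user++;
	else if (pending(si_ns, c->softirq))
		c->softirq++;
	else
		c->other++;
}

/* Order suggested in the review: hardirq, softirq, user, everything else. */
static void tick_review_order(struct counters *c, uint64_t hi_ns,
			      uint64_t si_ns, int user_tick)
{
	if (pending(hi_ns, c->irq))
		c->irq++;
	else if (pending(si_ns, c->softirq))
		c->softirq++;
	else if (user_tick)
		c->user++;
	else
		c->other++;
}
```

In this model, if every tick lands in user mode, the patch order never
reaches the softirq check, so measured softirq time stays unfolded until
a non-user tick arrives; the review order folds it on the next tick.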

Also, it is possible that in a single tick multiple counters overflow
the jiffy boundary, so something like:

  if (irqtime_account_hi_update())
	cpustat->irq = ...

  if (irqtime_account_si_update())
	cpustat->softirq = ...

  if (user_tick) {
  } else if (...) {

  } else ...

would seem like the better approach.
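
A minimal user-space sketch of that shape, for illustration only. The
branches after user_tick are elided above; the ones used here (system,
split into guest and really system, then idle) are an assumption taken
from the order listed earlier in this mail. The 1 ms tick and all names
(NS_PER_TICK, struct cpustat_model, struct tick_input, pending(),
account_tick_model()) are hypothetical.

```c
#include <stdint.h>

/* Assumption for illustration: one tick is 1 ms. */
#define NS_PER_TICK 1000000ULL

/* Hypothetical stand-in for cpustat; every field counts whole ticks. */
struct cpustat_model {
	uint64_t irq, softirq, user, guest, system, idle;
};

/* Hypothetical per-tick inputs: measured irq time plus task state. */
struct tick_input {
	uint64_t hardirq_ns;	/* total hardirq time measured so far */
	uint64_t softirq_ns;	/* total softirq time measured so far */
	int user_tick;		/* tick interrupted user mode */
	int is_idle;		/* current task is the idle task */
	int is_vcpu;		/* current task is running a guest */
};

/* Nonzero if the measured ns cover more whole ticks than were folded so far. */
static int pending(uint64_t measured_ns, uint64_t folded_ticks)
{
	return measured_ns / NS_PER_TICK > folded_ticks;
}

static void account_tick_model(struct cpustat_model *s,
			       const struct tick_input *in)
{
	/* Independent checks: both counters may cross a tick boundary at once. */
	if (pending(in->hardirq_ns, s->irq))
		s->irq++;

	if (pending(in->softirq_ns, s->softirq))
		s->softirq++;

	/* Assumed remainder: user, then system (guest or really system), then idle. */
	if (in->user_tick) {
		s->user++;
	} else if (!in->is_idle) {
		if (in->is_vcpu)
			s->guest++;
		else
			s->system++;
	} else {
		s->idle++;
	}
}
```

Taken literally, this form can bump irq and/or softirq and still
account the same tick to one of the remaining buckets, so the model
makes visible what the real code would need to decide about that case.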

>  /*
>   * Account for involuntary wait time.
>   * @steal: the cpu time spent in involuntary wait
> @@ -3594,6 +3685,11 @@ void account_process_tick(struct task_struct *p, int user_tick)
>  	cputime_t one_jiffy_scaled = cputime_to_scaled(cputime_one_jiffy);
>  	struct rq *rq = this_rq();
>  
> +	if (sched_clock_irqtime) {
> +		irqtime_account_process_tick(p, user_tick, rq);
> +		return;
> +	}
> +
>  	if (user_tick)
>  		account_user_time(p, cputime_one_jiffy, one_jiffy_scaled);
>  	else if ((p != rq->idle) || (irq_count() != HARDIRQ_OFFSET))

mark_tsc_unstable() can disable sched_clock_irqtime at any time; the
accounting won't go funny because of that, right?