Date:	Mon, 27 Jun 2016 08:50:06 -0400
From:	Rik van Riel <riel@...hat.com>
To:	Frederic Weisbecker <fweisbec@...il.com>
Cc:	linux-kernel@...r.kernel.org, peterz@...radead.org,
	mingo@...nel.org, pbonzini@...hat.com, fweisbec@...hat.com,
	wanpeng.li@...mail.com, efault@....de, tglx@...utronix.de,
	rkrcmar@...hat.com
Subject: Re: [PATCH 1/5] sched,time: count actually elapsed irq & softirq
 time

On Mon, 2016-06-27 at 14:25 +0200, Frederic Weisbecker wrote:
> On Wed, Jun 22, 2016 at 10:25:47PM -0400, riel@...hat.com wrote:
> > 
> > From: Rik van Riel <riel@...hat.com>
> > 
> > Currently, if there was any irq or softirq time during 'ticks'
> > jiffies, the entire period will be accounted as irq or softirq
> > time.
> > 
> > This is inaccurate if only a subset of 'ticks' jiffies was
> > actually spent handling irqs, and could conceivably mis-count
> > all of the ticks during a period as irq time, when there was
> > some irq and some softirq time.
> Good catch!
> 
> Many comments following.
> 
> > 
> > 
> > This can actually happen when irqtime_account_process_tick
> > is called from account_idle_ticks, which can pass a larger
> > number of ticks down all at once.
> > 
> > Fix this by changing irqtime_account_hi_update and
> > irqtime_account_si_update to round elapsed irq and softirq
> > time to jiffies, and return the number of jiffies spent in
> > each mode, similar to how steal time is handled.
> > 
> > Additionally, have irqtime_account_process_tick take into
> > account how much time was spent in each of steal, irq,
> > and softirq time.
> > 
> > The latter could help improve the accuracy of timekeeping
> Maybe you meant cputime? Timekeeping is rather about jiffies and
> GTOD.
> 
> > 
> > when returning from idle on a NO_HZ_IDLE CPU.
> > 
> > Properly accounting how much time was spent in hardirq and
> > softirq time will also allow the NO_HZ_FULL code to re-use
> > these same functions for hardirq and softirq accounting.
> > 
> > Signed-off-by: Rik van Riel <riel@...hat.com>
> > 
> >  	local_irq_save(flags);
> > -	latest_ns = this_cpu_read(cpu_hardirq_time);
> > -	if (nsecs_to_cputime64(latest_ns) > cpustat[CPUTIME_IRQ])
> > -		ret = 1;
> > +	irq = this_cpu_read(cpu_hardirq_time) - cpustat[CPUTIME_IRQ];
> cpu_hardirq_time is made of nsecs whereas cpustat is of cputime_t (in
> fact even cputime64_t). So you need to convert cpu_hardirq_time
> before doing the subtraction.

Doh. Good catch!

> > -static int irqtime_account_si_update(void)
> > +static unsigned long irqtime_account_si_update(unsigned long max_jiffies)
> >  {
> >  	u64 *cpustat = kcpustat_this_cpu->cpustat;
> > +	unsigned long si_jiffies = 0;
> >  	unsigned long flags;
> > -	u64 latest_ns;
> > -	int ret = 0;
> > +	u64 softirq;
> >  
> >  	local_irq_save(flags);
> > -	latest_ns = this_cpu_read(cpu_softirq_time);
> > -	if (nsecs_to_cputime64(latest_ns) > cpustat[CPUTIME_SOFTIRQ])
> > -		ret = 1;
> > +	softirq = this_cpu_read(cpu_softirq_time) - cpustat[CPUTIME_SOFTIRQ];
> > +	if (softirq > cputime_one_jiffy) {
> > +		si_jiffies = min(max_jiffies, cputime_to_jiffies(softirq));
> > +		cpustat[CPUTIME_SOFTIRQ] += jiffies_to_cputime(si_jiffies);
> > +	}
> >  	local_irq_restore(flags);
> > -	return ret;
> > +	return si_jiffies;
> So same comments apply here.
> 
> [...]
> > 
> >   * Accumulate raw cputime values of dead tasks (sig->[us]time) and
> > live
> >   * tasks (sum on group iteration) belonging to @tsk's group.
> >   */
> > @@ -344,19 +378,24 @@ static void irqtime_account_process_tick(struct task_struct *p, int user_tick,
> >  {
> >  	cputime_t scaled = cputime_to_scaled(cputime_one_jiffy);
> >  	u64 cputime = (__force u64) cputime_one_jiffy;
> > -	u64 *cpustat = kcpustat_this_cpu->cpustat;
> > +	unsigned long other;
> >  
> > -	if (steal_account_process_tick(ULONG_MAX))
> > +	/*
> > +	 * When returning from idle, many ticks can get accounted at
> > +	 * once, including some ticks of steal, irq, and softirq time.
> > +	 * Subtract those ticks from the amount of time accounted to
> > +	 * idle, or potentially user or system time. Due to rounding,
> > +	 * other time can exceed ticks occasionally.
> > +	 */
> > +	other = account_other_ticks(ticks);
> > +	if (other >= ticks)
> >  		return;
> > +	ticks -= other;
> >  
> >  	cputime *= ticks;
> >  	scaled *= ticks;
> So instead of dealing with ticks here, I think you should rather use
> the above
> cputime as both the limit and the remaining time to account after
> steal/irqs.
> 
> This should avoid some middle conversions and improve precision when
> cputime_t == nsecs granularity.
> 
> If we account 2 ticks to idle (let's say HZ=100) and the irq time to
> account is 15 ms (2 ticks = 20 ms), then we have 5 ms left to account
> to idle. With the jiffies granularity in this patch, we would account
> one tick to irqtime (1 tick = 10 ms, so there will be 5 ms to account
> back later) and one tick to idle time, whereas if you deal with
> cputime_t, you are going to account the correct amount of idle time.
> 

Ahhh, so you want irqtime_account_process_tick to work with, and
account, fractional ticks when calling account_system_time,
account_user_time, account_idle_time, etc.?

I guess that should work fine since we already pass cputime
values in, anyway.

I suppose we can do the same for get_vtime_delta, too.

They can both work with the actual remaining time (in cputime_t),
after the other time has been subtracted.

I can rework the series to do that.

-- 
All Rights Reversed.


