Date:	Thu, 14 Mar 2013 10:13:43 +0100
From:	Stanislaw Gruszka <sgruszka@...hat.com>
To:	Ingo Molnar <mingo@...nel.org>
Cc:	Frederic Weisbecker <fweisbec@...il.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [GIT PULL] sched: Cputime update for 3.10

On Thu, Mar 14, 2013 at 08:14:27AM +0100, Ingo Molnar wrote:
> 
> * Frederic Weisbecker <fweisbec@...il.com> wrote:
> 
> > Ingo,
> > 
> > Please pull the latest cputime accounting updates that can be found at:
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
> > 	sched/core
> > 
> > HEAD: d9a3c9823a2e6a543eb7807fb3d15d8233817ec5
> > 
> > Some users are complaining that their threadgroup's runtime accounting 
> > freezes after a week or so of intense cpu-bound workload. This set tries 
> > to fix the issue by reducing the risk of multiplication overflow in the 
> > cputime scaling code.
> 
> Hm, is this a new bug? When was it introduced and is upstream affected as 
> well?

Commit 0cf55e1ec08bb5a22e068309e2d8ba1180ab4239 started to use scaling
for the whole thread group, which increases the chance of hitting the
multiplication overflow, depending on how many CPUs are in the system.

We have had the multiplication utime * rtime for a single thread since
commit b27f03d4bdc145a09fb7b0c0e004b29f1ee555fa.

The overflow happens once:

rtime * utime > 0xffffffffffffffff jiffies

If a thread utilizes 100% of CPU time (so utime == rtime), that gives:

rtime > sqrt(0xffffffffffffffff) jiffies

rtime > sqrt(0xffffffffffffffff) / (24 * 60 * 60 * HZ) days

For HZ=100 that is 497 days; for HZ=1000 it is 49 days.
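
The arithmetic above can be checked with a short script. This is only a
sketch, not kernel code: the function name and the nr_cpus parameter are
made up for illustration. nr_cpus models the thread group case under the
assumption that nr_cpus threads each run at 100%, so the group's rtime
and utime grow nr_cpus times faster than wall-clock time.

```python
import math

U64_MAX = 0xffffffffffffffff
SECONDS_PER_DAY = 24 * 60 * 60


def days_until_overflow(hz, nr_cpus=1):
    """Wall-clock days until rtime * utime no longer fits in 64 bits.

    Assumes 100% CPU utilization, i.e. utime == rtime, so the product
    overflows once rtime exceeds sqrt(U64_MAX) jiffies.
    nr_cpus is a hypothetical knob: that many fully busy threads make
    the group totals reach the limit nr_cpus times sooner.
    """
    limit_jiffies = math.isqrt(U64_MAX)  # largest rtime whose square still fits
    return limit_jiffies / (SECONDS_PER_DAY * hz * nr_cpus)


if __name__ == "__main__":
    for hz in (100, 1000):
        print("HZ=%d: about %d days" % (hz, int(days_until_overflow(hz))))
```

With nr_cpus=4 and HZ=1000 the same estimate drops to roughly 12 days,
which fits the reports of accounting freezing after a week or so on
larger machines.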

The bug affects only users who run a CPU-intensive application for that
long. They also have to be interested in the utime and stime values, as
the bug has no visible effect other than making those values incorrect.

Stanislaw