Date: Thu, 04 Apr 2013 10:40:16 -0700
From: Dave Hansen <dave@...1.net>
To: linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Hidetoshi Seto <seto.hidetoshi@...fujitsu.com>
Subject: sched/cputime: sig->prev_stime underflow
With the 3.9-rcs (and probably much earlier) I'm seeing some weird top
output where the cpu time "spent" is millions of hours:
  445 root      20   0     0    0    0 S    0  0.0   5124095h kworker/45:1
  404 root      20   0     0    0    0 S    0  0.0   5124095h kworker/4:1
I see it mostly with kernel threads, but it doesn't seem to happen on my
distro kernel (3.5 era). The suspect code is in thread_group_times():
	sig->prev_stime = max(sig->prev_stime, rtime - sig->prev_utime);
In my case, I caught it with rtime=34 and sig->prev_utime=35, so the
unsigned subtraction wraps around to a huge value. This code
_looks_ to be pretty mature, coming in at commit 0cf55e1e in 2009. The
system I'm running on _does_ have some non-sync'd TSCs, but they are at
least being detected, so I expect the fallout to be minimal:
	tsc: Marking TSC unstable due to check_tsc_sync_source failed
config:
http://sr71.net/~dave/linux/config-bigbox-04042013.txt
The dumb fix here would seem to be to just check for "rtime <
sig->prev_utime" before doing the subtraction. Any thoughts?
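
To make the failure concrete, here is a rough userspace sketch of the
arithmetic, not the kernel code itself. It assumes cputime_t behaves like
an unsigned 64-bit integer; the type and function names (cputime_sketch_t,
max_ct, stime_unchecked, stime_checked) are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for cputime_t, assumed unsigned 64-bit here. */
typedef uint64_t cputime_sketch_t;

static cputime_sketch_t max_ct(cputime_sketch_t a, cputime_sketch_t b)
{
	return a > b ? a : b;
}

/*
 * Same shape as the suspect line: when rtime < prev_utime the
 * subtraction wraps around and max() happily keeps the huge result.
 */
static cputime_sketch_t stime_unchecked(cputime_sketch_t prev_stime,
					cputime_sketch_t rtime,
					cputime_sketch_t prev_utime)
{
	return max_ct(prev_stime, rtime - prev_utime);
}

/*
 * The "dumb fix": leave prev_stime alone when the subtraction
 * would go below zero.
 */
static cputime_sketch_t stime_checked(cputime_sketch_t prev_stime,
				      cputime_sketch_t rtime,
				      cputime_sketch_t prev_utime)
{
	if (rtime < prev_utime)
		return prev_stime;
	return max_ct(prev_stime, rtime - prev_utime);
}
```

With rtime=34 and prev_utime=35, stime_unchecked() returns 2^64 - 1 while
stime_checked() keeps the old prev_stime. For what it's worth, 2^64
nanoseconds is roughly 5124095 hours, which lines up with the top output
above if the wrapped value ends up being treated as nanoseconds. I have
not confirmed that this is the path top's number takes.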