Message-ID: <20070524080959.GA29151@elte.hu>
Date:	Thu, 24 May 2007 10:09:59 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Balbir Singh <balbir@...ux.vnet.ibm.com>
Cc:	linux-kernel@...r.kernel.org,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Mike Galbraith <efault@....de>,
	Arjan van de Ven <arjan@...radead.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	pranith-kumar_d@...torg.com, Andi Kleen <andi@...stfloor.org>
Subject: Re: [patch] CFS scheduler, -v14
* Balbir Singh <balbir@...ux.vnet.ibm.com> wrote:
> Hi, Ingo,
> 
> I've implemented a patch on top of v14 for better accounting of 
> sched_info statistics. Earlier, sched_info relied on jiffies for 
> accounting and I've seen applications that show "0" cpu usage 
> statistics (in delay accounting and from /proc) even though they've 
> been running on the CPU for a long time. The basic problem is that 
> accounting in jiffies is too coarse to be accurate.
> 
> The patch below uses sched_clock() for sched_info accounting.

nice! I've merged your patch and it built/booted fine, so it should show 
up in -v15. This should also play well with Andi's sched_clock() 
enhancements in -mm, slated for .23.
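
The coarseness Balbir describes can be illustrated with a small user-space
sketch. It is not kernel code: HZ=250, the jiffies_at() helper, and the
simulate() function are all assumptions made up for this illustration. A
task that runs in short slices which never cross a tick boundary accumulates
zero jiffies, while nanosecond timestamps (the role sched_clock() plays in
the patch) capture the full run time.

```c
#include <stdint.h>

#define HZ           250ULL              /* assumed tick rate */
#define NSEC_PER_SEC 1000000000ULL
#define TICK_NSEC    (NSEC_PER_SEC / HZ) /* 4 ms per jiffy at HZ=250 */

/* Hypothetical stand-in for reading the jiffies counter at time now_ns. */
static uint64_t jiffies_at(uint64_t now_ns)
{
	return now_ns / TICK_NSEC;
}

struct run_totals {
	uint64_t jiffies;	/* run time as seen by jiffies deltas */
	uint64_t nsec;		/* run time as seen by nanosecond deltas */
};

/*
 * The task gets on the CPU once per tick period, offset_ns after the
 * tick boundary, and runs for slice_ns each time. Both accounting
 * styles record (depart - arrive) for every slice.
 */
static struct run_totals simulate(unsigned int slices, uint64_t offset_ns,
				  uint64_t slice_ns)
{
	struct run_totals t = { 0, 0 };
	unsigned int i;

	for (i = 0; i < slices; i++) {
		uint64_t arrive = (uint64_t)i * TICK_NSEC + offset_ns;
		uint64_t depart = arrive + slice_ns;

		t.jiffies += jiffies_at(depart) - jiffies_at(arrive);
		t.nsec += depart - arrive;
	}
	return t;
}
```

Calling simulate(1000, 1000000, 1000000) models 1000 slices of 1 ms, each
starting 1 ms after a tick: the jiffies total stays at 0 while the
nanosecond total is one full second of CPU time.
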

btw., I think some more consolidation could be done in this area. We've 
now got the traditional /proc/PID/stat metrics, schedstats, taskstats 
and delay accounting, and with CFS we've also got /proc/sched_debug and 
/proc/PID/sched. There's a fair amount of overlap.

btw., CFS makes this change to fs/proc/array.c:

@@ -410,6 +408,14 @@ static int do_task_stat(struct task_stru
 	/* convert nsec -> ticks */
 	start_time = nsec_to_clock_t(start_time);
 
+	/*
+	 * Use CFS's precise accounting, if available:
+	 */
+	if (!has_rt_policy(task)) {
+		utime = nsec_to_clock_t(task->sum_exec_runtime);
+		stime = 0;
+	}
+
 	res = sprintf(buffer,"%d (%s) %c %d %d %d %d %d %lu %lu \
 %lu %lu %lu %lu %lu %ld %ld %ld %ld %d 0 %llu %lu %ld %lu %lu %lu %lu %lu \
 %lu %lu %lu %lu %lu %lu %lu %lu %d %d %lu %lu %llu\n",

if you have some spare capacity to improve this code, it could be 
further enhanced by not setting 'stime' to zero, but using the existing 
jiffies-based utime/stime statistics as a _ratio_ to split up the 
precise p->sum_exec_runtime. That way we don't have to add precise 
accounting to syscall entry/exit points (that would be quite expensive), 
but the sum of utime+stime would still be very precise. (and that's what 
matters most anyway)
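
A minimal user-space sketch of the ratio idea, under stated assumptions:
split_runtime(), struct cpu_split, nsec_to_clock_t_sketch() and USER_HZ=100
are names and values invented for this illustration, not the kernel's
implementation. The precise nanosecond total is converted to clock ticks
and divided between user and system time in the proportion given by the
coarse tick-sampled counters, so utime+stime always equals the precise
total.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL
#define USER_HZ      100ULL	/* assumed clock_t rate seen by user space */

/* User-space stand-in for the kernel's nsec_to_clock_t(). */
static uint64_t nsec_to_clock_t_sketch(uint64_t nsec)
{
	return nsec / (NSEC_PER_SEC / USER_HZ);
}

struct cpu_split {
	uint64_t utime;		/* user time, in clock ticks */
	uint64_t stime;		/* system time, in clock ticks */
};

/*
 * Split the precise run time using the tick-sampled utime/stime only
 * as a ratio. The multiplication can overflow for extremely large
 * inputs; a real implementation would need to guard against that.
 */
static struct cpu_split split_runtime(uint64_t sum_exec_runtime_ns,
				      uint64_t utime_ticks,
				      uint64_t stime_ticks)
{
	struct cpu_split out;
	uint64_t total = nsec_to_clock_t_sketch(sum_exec_runtime_ns);
	uint64_t sampled = utime_ticks + stime_ticks;

	if (sampled == 0) {
		/* No tick samples yet: charge everything to utime,
		 * matching what the hunk above does. */
		out.utime = total;
		out.stime = 0;
		return out;
	}

	out.utime = total * utime_ticks / sampled;
	out.stime = total - out.utime;	/* remainder keeps the sum exact */
	return out;
}
```

For example, 10 seconds of precise run time with a sampled ratio of 3:1
yields utime=750 and stime=250 ticks. Computing stime as the remainder,
rather than by a second division, is what keeps the sum equal to the
precise total despite integer rounding.
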
	Ingo