Message-ID: <20100324204554.GA31780@redhat.com>
Date:	Wed, 24 Mar 2010 21:45:54 +0100
From:	Oleg Nesterov <oleg@...hat.com>
To:	Andrew Morton <akpm@...ux-foundation.org>,
	Americo Wang <xiyou.wangcong@...il.com>,
	Balbir Singh <balbir@...ibm.com>,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	Hidetoshi Seto <seto.hidetoshi@...fujitsu.com>,
	Ingo Molnar <mingo@...e.hu>,
	Peter Zijlstra <peterz@...radead.org>,
	Roland McGrath <roland@...hat.com>,
	Spencer Candland <spencer@...ehost.com>,
	Stanislaw Gruszka <sgruszka@...hat.com>
Cc:	linux-kernel@...r.kernel.org
Subject: [RFC,PATCH 2/2] cputimers/proc:
	do_task_stat()->thread_group_times() is racy and O(n) under
	->siglock
Nowadays ->siglock is overloaded, and it would be really nice to change
do_task_stat() to walk through the list of threads locklessly. And note
that we are doing while_each_thread() twice!

while_each_thread() is rcu-safe, but thread_group_times() also needs
->siglock to serialize the modifications of the signal_struct->prev_Xtime
members.

(However, please note that currently do_task_stat() can race with
 wait_task_zombie(), which calls thread_group_times() without ->siglock.)

This patch changes the code back to use thread_group_cputime(), as we
did before 0cf55e1e "sched, cputime: Introduce thread_group_times()".

Of course, this makes the output from /proc/pid/stat less accurate, but
otoh this allows us to make do_task_stat() lockless (apart from the
->tty bits).

Signed-off-by: Oleg Nesterov <oleg@...hat.com>
---
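
(Not for the changelog.) To illustrate the trade-off described above, here
is a minimal userspace sketch, not kernel code. The names group_sample,
group_prev, raw_times() and adjusted_times() are hypothetical, and the
scaling is a simplified model of what thread_group_times() does, assuming
all values share one unit. The adjusted variant has to write the shared
prev_* fields, which is why it needs serialization; the raw variant only
reads.

```c
#include <stdint.h>

/* Hypothetical stand-in for the summed per-group counters. */
struct group_sample {
	uint64_t utime;	/* tick-sampled user time */
	uint64_t stime;	/* tick-sampled system time */
	uint64_t rtime;	/* precise runtime, same unit (assumption) */
};

/* Hypothetical stand-in for signal_struct->prev_utime/prev_stime. */
struct group_prev {
	uint64_t prev_utime;
	uint64_t prev_stime;
};

static uint64_t max_u64(uint64_t a, uint64_t b)
{
	return a > b ? a : b;
}

/* Raw variant: reports the sampled totals, writes no shared state. */
static void raw_times(const struct group_sample *s,
		      uint64_t *ut, uint64_t *st)
{
	*ut = s->utime;
	*st = s->stime;
}

/*
 * Adjusted variant: scale utime by rtime/total, then clamp against the
 * previously reported values so the result never goes backwards.
 * It modifies *prev, so concurrent callers would have to be serialized.
 */
static void adjusted_times(const struct group_sample *s,
			   struct group_prev *prev,
			   uint64_t *ut, uint64_t *st)
{
	uint64_t total = s->utime + s->stime;
	uint64_t utime = total ? (s->rtime * s->utime) / total : s->rtime;
	uint64_t rest;

	prev->prev_utime = max_u64(prev->prev_utime, utime);
	/* Guard against unsigned underflow in this simplified model. */
	rest = s->rtime > prev->prev_utime ? s->rtime - prev->prev_utime : 0;
	prev->prev_stime = max_u64(prev->prev_stime, rest);

	*ut = prev->prev_utime;
	*st = prev->prev_stime;
}
```

For a sample of utime=30, stime=10, rtime=80 the raw variant reports 30/10,
while the adjusted variant reports 60/20 and remembers those values for the
next call.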
 fs/proc/array.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
--- 34-rc1/fs/proc/array.c~PROC_5_DTS_TGTS_IS_PITA	2010-03-24 20:47:19.000000000 +0100
+++ 34-rc1/fs/proc/array.c	2010-03-24 20:47:51.000000000 +0100
@@ -403,6 +403,7 @@ static int do_task_stat(struct seq_file 
 
 	if (lock_task_sighand(task, &flags)) {
 		struct signal_struct *sig = task->signal;
+		struct task_cputime cputime;
 
 		if (sig->tty) {
 			struct pid *pgrp = tty_get_pgrp(sig->tty);
@@ -433,8 +434,11 @@ static int do_task_stat(struct seq_file 
 
 			min_flt += sig->min_flt;
 			maj_flt += sig->maj_flt;
-			thread_group_times(task, &utime, &stime);
+
 			gtime = cputime_add(gtime, sig->gtime);
+			thread_group_cputime(task, &cputime);
+			utime = cputime.utime;
+			stime = cputime.stime;
 		}
 
 		sid = task_session_nr_ns(task, ns);