linux-kernel - Re: [RFC,PATCH 2/2] cputimers/proc: do_task_stat()->thread_group


Message-ID: <20100325121250.GA3664@redhat.com>
Date:	Thu, 25 Mar 2010 13:12:50 +0100
From:	Oleg Nesterov <oleg@...hat.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Americo Wang <xiyou.wangcong@...il.com>,
	Balbir Singh <balbir@...ibm.com>,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	Hidetoshi Seto <seto.hidetoshi@...fujitsu.com>,
	Ingo Molnar <mingo@...e.hu>,
	Roland McGrath <roland@...hat.com>,
	Spencer Candland <spencer@...ehost.com>,
	Stanislaw Gruszka <sgruszka@...hat.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [RFC,PATCH 2/2] cputimers/proc:
	do_task_stat()->thread_group_times() is racy and O(n) under
	->siglock

On 03/24, Peter Zijlstra wrote:
>
> On Wed, 2010-03-24 at 21:45 +0100, Oleg Nesterov wrote:
> > Nowadays ->siglock is overloaded, it would be really nice to change
> > do_task_stat() to walk through the list of threads lockless. And note
> > that we are doing while_each_thread() twice!
> >
> > while_each_thread() is rcu-safe, but thread_group_times() also needs
> > ->siglock to serialize the modifications of signal_struct->prev_Xtime
> > members.

First of all, let me reply to myself. I see that I wasn't clear at all.

This patch takes the first step toward removing one reason for ->siglock
(modification of ->prev_Xtime). But this is very minor, I guess we
could change thread_group_times() to take signal->cputimer.lock.

The goal was to call thread_group_cputime() locklessly, under
rcu_read_lock() (either directly, or via thread_group_times(), this
doesn't matter), to avoid while_each_thread() under ->siglock.

And in this case /proc/pid/stat can't report utime/stime atomically.
Whatever we do we can race with exit, so it doesn't make sense to
play with ->prev_Xtime.
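
For readers who do not have the code in front of them, here is a rough
userspace sketch of what "playing with ->prev_Xtime" means. It is only
loosely modeled on thread_group_times(): struct group_times and
report_times() are hypothetical names, plain integers stand in for
cputime_t, and the underflow guard is an addition of this sketch. The
point is that the reported values are clamped with max() against the
previously reported ones, which is a read-modify-write that needs some
lock and which is where the monotonicity comes from.

```c
#include <stdint.h>

/* Hypothetical stand-in for the ->prev_Xtime members of signal_struct. */
struct group_times {
    uint64_t prev_utime;    /* last utime reported to user space */
    uint64_t prev_stime;    /* last stime reported to user space */
};

/*
 * Scale the tick-sampled utime to the precise runtime, then clamp both
 * results against the previously reported values so that neither ever
 * goes backwards. The updates of prev_utime/prev_stime are the
 * modifications that have to be serialized by the caller.
 */
void report_times(struct group_times *g,
                  uint64_t utime, uint64_t stime, uint64_t rtime,
                  uint64_t *ut, uint64_t *st)
{
    uint64_t total = utime + stime;
    uint64_t scaled = total ? rtime * utime / total : rtime;
    uint64_t rest;

    if (scaled > g->prev_utime)
        g->prev_utime = scaled;

    /* Guard against underflow; this check is an assumption of the sketch. */
    rest = rtime > g->prev_utime ? rtime - g->prev_utime : 0;
    if (rest > g->prev_stime)
        g->prev_stime = rest;

    *ut = g->prev_utime;
    *st = g->prev_stime;
}
```

Calling report_times() a second time with a smaller utime sample for the
same rtime returns the same pair as before, so a reader like top never
sees the numbers decrease.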

> Right, so from what I remember the issue is that, yes top et al rely on
> that monotonicity,

Really?  So, do you think the change above will break user-space?

How sad :/

> but more importantly I think
> clock_gettime(CLOCK_PROCESS_CPUTIME_ID) should indeed use ->siglock to
> ensure it serializes against do_exit() so that either we iterate the
> thread or get the accumulated runtime from signal_struct but not both
> (or neither).

Oh. I forgot everything I knew about posix-cpu-timers... But, it seems,
posix_cpu_clock_get() calls thread_group_cputime() under tasklist_lock
and thus can't race with exit.
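
To make the "not both (or neither)" race concrete, here is a small
single-threaded model. It is not kernel code: struct group,
thread_exit(), read_dead() and sum_live() are hypothetical names, and
the interleaving is replayed by hand instead of using real threads. It
only illustrates why the reader's two steps must not be separated by an
exit.

```c
#include <stdint.h>

#define MAX_THREADS 8

/* Hypothetical model of a thread group, not the kernel's structures. */
struct group {
    uint64_t dead_runtime;              /* runtime folded in by exited threads */
    uint64_t live_runtime[MAX_THREADS]; /* per-thread runtime */
    int      alive[MAX_THREADS];        /* 1 while the thread is on the list */
};

/* Accounting done at exit: fold the thread's runtime into the group total. */
void thread_exit(struct group *g, int i)
{
    g->dead_runtime += g->live_runtime[i];
    g->alive[i] = 0;
}

/* Reader step: take the runtime accumulated from exited threads. */
uint64_t read_dead(const struct group *g)
{
    return g->dead_runtime;
}

/* Reader step: walk the threads that are still alive. */
uint64_t sum_live(const struct group *g)
{
    uint64_t sum = 0;

    for (int i = 0; i < MAX_THREADS; i++)
        if (g->alive[i])
            sum += g->live_runtime[i];
    return sum;
}
```

If thread_exit() runs between read_dead() and sum_live(), the exiting
thread is counted in neither step. With the steps in the other order it
is counted in both. Holding a lock that exit also takes, across both
steps, excludes either outcome.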

Oleg.
