[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <1204932363.8797.9.camel@bobble.smo.corp.google.com>
Date: Fri, 07 Mar 2008 15:26:03 -0800
From: Frank Mayhar <fmayhar@...gle.com>
To: Roland McGrath <roland@...hat.com>
Cc: parag.warudkar@...il.com,
Alejandro Riveira Fernández
<ariveira@...il.com>, Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...e.hu>,
Thomas Gleixner <tglx@...utronix.de>,
Jakub Jelinek <jakub@...hat.com>
Subject: Re: [Bugme-new] [Bug 9906] New: Weird hang with NPTL and SIGPROF.
Based on Roland's comments and from reading the source, I have a
possible fix. I'm posting the attached patch _not_ for submission but
_only_ for comment. For one thing it's based on 2.6.18.5 and for
another it hasn't had much testing yet. I wanted to get it out here for
comment, though, in case anyone can see where I might have gone wrong.
Comments, criticism and (especially!) testing enthusiastically
requested.
>>From my notes, this patch:
Replaces the utime, stime and sched_time fields in signal_struct with
shared_utime, shared_stime and shared_schedtime, respectively. It
also adds it_sched_expires to the signal struct.
Each place that loops through all threads in a thread group to sum
task->utime and/or task->stime now loads the value from
task->signal->shared_[us]time. This includes compat_sys_times(),
do_task_stat(), do_getitimer(), sys_times() and k_getrusage().
Certain routines that used task->signal->[us]time now use the shared
fields instead, which may change their semantics slightly. These
include fill_prstatus() (in fs/binfmt_elf.c), do_task_stat() (in
fs/proc/array.c), wait_task_zombie() and do_notify_parent().
The shared fields are updated at each tick, in update_cpu_clock()
(shared_schedtime), account_user_time() (shared_utime) and
account_system_time() (shared_stime). Each of these functions updates
the task-private field followed by the shared version in the signal
structure if one is present. Note that if different threads of the
same process are being run by different CPUs at the tick, there may
be serious cache contention here.
Finally, kernel/posix-cpu-timers.c has changed quite dramatically.
First, run_posix_cpu_timers() decides whether a timer has expired by
consulting the it_*_expires and shared_* fields in the signal struct.
The check_process_timers() routine bases its computations on the new
shared fields, removing two loops through the threads. "Rebalancing"
is no longer required, the process_timer_rebalance() routine as
disappeared entirely and the arm_timer() routine merely fills
p->signal->it_*_expires from timer->it.cpu.expires.*. The
cpu_clock_sample_group_locked() loses its summing loops, consulting
the shared fields instead. Finally, set_process_cpu_timer() sets
tsk->signal->it_*_expires directly rather than calling the deleted
rebalance routine.
There are still a few open questions. In particular, it's possible
that cache contention on the tick update of the shared fields could
mean that the current scheme is not entirely sufficient. Further,
the semantics of the status-returning routines fill_prstatus(),
do_task_stat(), wait_task_zombie() and do_notify_parent() may no longer
follow standards. For that matter, ITIMER_PROF handling may be broken
entirely, although a brief test seems to show that it's working fine.
Stats:
fs/binfmt_elf.c | 18 +--
fs/proc/array.c | 6 -
include/linux/sched.h | 10 +-
kernel/exit.c | 13 --
kernel/fork.c | 25 +----
kernel/itimer.c | 18 ---
kernel/posix-cpu-timers.c | 224 ++++++++++------------------------------------
kernel/sched.c | 16 +++
kernel/signal.c | 6 -
kernel/sys.c | 17 ---
10 files changed, 105 insertions(+), 248 deletions(-)
--
Frank Mayhar <fmayhar@...gle.com>
Google, Inc.
View attachment "posix-timers.patch" of type "text/x-patch" (21998 bytes)
Powered by blists - more mailing lists