Message-ID: <20101215105411.0bbc8629@suzukikp>
Date: Wed, 15 Dec 2010 10:54:11 +0530
From: "Suzuki K. Poulose" <suzuki@...ibm.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Cc: linux-kernel@...r.kernel.org,
Jeremy Fitzhardinge <jeremy.fitzhardinge@...rix.com>,
Christoph Hellwig <hch@....de>,
Masami Hiramatsu <mhiramat@...hat.com>,
Ananth N Mavinakayanahalli <ananth@...ibm.com>,
Daisuke HATAYAMA <d.hatayama@...fujitsu.com>,
Andi Kleen <andi@...stfloor.org>,
Roland McGrath <roland@...hat.com>,
Amerigo Wang <amwang@...hat.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Oleg Nesterov <oleg@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [RFC] [Patch 0/21] Non disruptive application core dump
infrastructure
On Wed, 15 Dec 2010 10:04:37 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com> wrote:
> On Tue, 14 Dec 2010 15:22:59 +0530
> "Suzuki K. Poulose" <suzuki@...ibm.com> wrote:
>
> > Hi all,
> >
> > This is a series of patches implementing an infrastructure for capturing the
> > core of an application without disrupting its process semantics.
> >
> > The infrastructure makes use of the freezer subsystem in kernel to freeze the
> > threads and then collect the information to generate the core.
> >
> > The interface is provided by a /proc/pid/core file; reading it yields the
> > ELF formatted core of the process with that "pid". The interface supports
> > the "seek" operation on the fd, giving the dumper control over which data is
> > dumped. It also lets the user store the dump at any location.
> >
> > The current implementation supports both native as well as the compat ELF
> > tasks.
> >
> > An open() call to the /proc/pid/core will try to freeze the threads in the
> > process and the read() requests will dynamically generate the contents for the
> > core file. The ELF header & Program Headers are stored in a kernel buffer to
> > allow us to map the fpos to the required data section.
> >
> > In case a thread is not frozen within a time interval, after issuing the freeze
> > request, we fill the register state information with 0's to indicate we could
> > not capture the data.
> >
> > A close() would kick the threads out of the refrigerator().
> >
> >
> > The implementation reuses some of the existing ELF core generation code by
> > exporting it. Some of the code common to both native and compat ELF class
> > support has been moved to a common place, elfcore-common.c. Also, some of the
> > reusable functions specific to the ELF class handling have been made global,
> > after renaming the compat versions of the same.
> >
> > We also added a new API, elf_core_copy_extra_phdrs(), for "reading" the arch
> > specific program headers, versus the existing elf_core_write_extra_phdrs().
> >
> > Patches 1 to 9 deal with re-arranging the ELF code to be reusable by the
> > infrastructure.
> >
> > Patches 10 to 21 implement the infrastructure.
> >
> > TODO: Add support for collecting the arch specific notes, currently used only
> > by the Cell platform.
> >
> > Please let me know your review comments / thoughts.
> >
>
> Is the purpose of this patch to debug an application without attaching gdb
> or taking a coredump with gcore?
The purpose is to take the coredump in a more reliable way, without affecting
the process semantics.
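To make the intended usage concrete, here is a minimal sketch of a userspace
dumper written against the interface described in the cover letter quoted
above. It assumes the /proc/<pid>/core file from this RFC exists and behaves as
described (open() tries to freeze the threads, read() generates the core,
close() thaws them). The helper names core_path(), copy_fd() and dump_core()
are made up for illustration and are not part of the patch set.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Build the path of the core file proposed in this RFC (hypothetical file). */
int core_path(pid_t pid, char *buf, size_t len)
{
    int n = snprintf(buf, len, "/proc/%ld/core", (long)pid);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Copy everything readable from in_fd to out_fd.
 * Returns the number of bytes copied, or -1 on error. */
long long copy_fd(int in_fd, int out_fd)
{
    char buf[65536];
    long long total = 0;

    for (;;) {
        ssize_t n = read(in_fd, buf, sizeof(buf));
        if (n == 0)
            break;                      /* end of the generated core */
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted, retry the read */
            return -1;
        }
        for (ssize_t off = 0; off < n; ) {
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;           /* interrupted, retry the write */
                return -1;
            }
            off += w;
        }
        total += n;
    }
    return total;
}

/* Dump the core of 'pid' to 'out_path' through the proposed interface. */
long long dump_core(pid_t pid, const char *out_path)
{
    char path[64];
    if (core_path(pid, path, sizeof(path)) != 0)
        return -1;

    int in_fd = open(path, O_RDONLY);   /* per the RFC: tries to freeze the threads */
    if (in_fd < 0)
        return -1;

    int out_fd = open(out_path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (out_fd < 0) {
        close(in_fd);
        return -1;
    }

    long long copied = copy_fd(in_fd, out_fd);
    close(out_fd);
    close(in_fd);                       /* per the RFC: threads leave the refrigerator */
    return copied;
}
```

A caller would use something like dump_core(pid, "/tmp/app.core"). Since the
fd supports seek, a dumper could also lseek() to a chosen offset and read only
part of the core instead of copying all of it.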
>
> IIUC, "freeze" is a bit dangerous because no one can end the application while
> it's frozen, and nothing tells the user "it's frozen" via the usual user
> commands such as 'ps' or 'top'.
>
> Can you add a new freeze state in which the application can at least receive
> SIGKILL, and show the task's state as "frozen" in some way, the way
> task_state_array[] shows it in /proc/<pid>/status?
I will investigate this approach.
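For reference, tools such as 'ps' and 'top' get the task state from the
"State:" line of /proc/<pid>/status, so that is where a frozen task would have
to become visible. A minimal sketch of reading that line is below. The helper
names parse_task_state() and read_task_state() are made up for illustration,
and a distinct state letter for frozen tasks is only the suggestion under
discussion here, not something this sketch assumes exists.

```c
#include <stdio.h>
#include <string.h>

/* Return the one-letter task state from the text of /proc/<pid>/status,
 * e.g. 'S' for a line "State:\tS (sleeping)", or 0 if no state is found. */
char parse_task_state(const char *status_text)
{
    const char *line = status_text;

    while (line && *line) {
        if (strncmp(line, "State:", 6) == 0) {
            const char *p = line + 6;
            while (*p == ' ' || *p == '\t')
                p++;                    /* skip the separator after "State:" */
            return (*p && *p != '\n') ? *p : 0;
        }
        line = strchr(line, '\n');      /* advance to the next line, if any */
        if (line)
            line++;
    }
    return 0;
}

/* Read /proc/<pid>/status and return the state letter, or 0 on failure. */
char read_task_state(long pid)
{
    char path[64];
    char buf[4096];

    snprintf(path, sizeof(path), "/proc/%ld/status", pid);
    FILE *f = fopen(path, "r");
    if (!f)
        return 0;

    size_t n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';                      /* the State: line is near the top */
    return parse_task_state(buf);
}
```

A monitoring script could call read_task_state(pid) before and after opening
the core file to see whether the freeze is visible to userspace.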
Thanks
Suzuki