[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20100323122335.GB26762@linux.vnet.ibm.com>
Date: Tue, 23 Mar 2010 17:53:35 +0530
From: Srikar Dronamraju <srikar@...ux.vnet.ibm.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...e.hu>,
Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Masami Hiramatsu <mhiramat@...hat.com>,
Mel Gorman <mel@....ul.ie>,
Ananth N Mavinakayanahalli <ananth@...ibm.com>,
Jim Keniston <jkenisto@...ux.vnet.ibm.com>,
Frederic Weisbecker <fweisbec@...il.com>,
"Frank Ch. Eigler" <fche@...hat.com>,
LKML <linux-kernel@...r.kernel.org>,
Roland McGrath <roland@...hat.com>,
Oleg Nesterov <oleg@...hat.com>,
Christoph Hellwig <hch@...radead.org>
Subject: Re: [PATCH v1 7/10] Uprobes Implementation
Hi Peter,
> On Sat, 2010-03-20 at 19:56 +0530, Srikar Dronamraju wrote:
> > +struct uprobe {
> > + /*
> > + * The pid of the probed process. Currently, this can be the
> > + * thread ID (task->pid) of any active thread in the process.
> > + */
> > + pid_t pid;
> > +
> > + /* Location of the probepoint */
> > + unsigned long vaddr;
> > +
> > + /* Handler to run when the probepoint is hit */
> > + void (*handler)(struct uprobe*, struct pt_regs*);
> > +
> > + /* true if handler runs in interrupt context*/
> > + bool handler_in_interrupt;
> > +};
>
> I would still prefer to see something like:
>
> vma:offset, instead of tid:vaddr
>
> You want to probe a symbol in a DSO, filtering per-task comes after that
> if desired.
>
If I would want to trace malloc in a process
$ objdump -T /lib64/libc.so.6 | grep malloc
000000357c274b80 g DF .text 0000000000000224 GLIBC_2.2.5 __libc_malloc
000000357c271000 w DF .text 0000000000000273 GLIBC_2.2.5 malloc_stats
000000357c275570 w DF .text 00000000000001fb GLIBC_2.2.5 malloc_get_state
000000357c5514f8 w DO .data 0000000000000008 GLIBC_2.2.5 __malloc_hook
000000357c274b80 g DF .text 0000000000000224 GLIBC_2.2.5 malloc
000000357c26f570 w DF .text 0000000000000033 GLIBC_2.2.5 malloc_usable_size
000000357c271420 w DF .text 000000000000024e GLIBC_2.2.5 malloc_trim
000000357c5529a0 w DO .bss 0000000000000008 GLIBC_2.2.5 __malloc_initialize_hook
000000357c271670 w DF .text 00000000000003c2 GLIBC_2.2.5 malloc_set_state
$
$ cat /proc/9069/maps
...............
357c200000-357c34d000 r-xp 00000000 08:03 6115979 /lib64/libc-2.5.so
357c34d000-357c54d000 ---p 0014d000 08:03 6115979 /lib64/libc-2.5.so
357c54d000-357c551000 r--p 0014d000 08:03 6115979 /lib64/libc-2.5.so
357c551000-357c552000 rw-p 00151000 08:03 6115979 /lib64/libc-2.5.so
...............
$
do you mean the user should be specifying 357c200000:74b80 to denote
000000357c274b80? or /lib64/libc.so.6:74b80
And we trace all the process which have mapped this address?
> Also, like we discussed in person, I think we can do away with the
> handler_in_interrupt thing by letting the handler have an error return
> value and doing something like:
>
> do_int3:
>
> uprobe = find_probe_point(addr);
>
> pagefault_disable();
> err = uprobe->handler(uprobe, regs);
> pagefault_enable();
>
> if (err == -EFAULT) {
> /* set TIF flag and call the handler again from
> task context */
> }
>
> This should allow the handler to optimistically access memory from the
> trap handler, but in case it does need to fault pages in we'll call it
> from task context.
Okay but what if the handler is coded to sleep.
>
> Everybody else simply places callbacks in kernel/fork.c and
> kernel/exit.c, but as it is I don't think you want per-task state like
> this.
>
> One thing I would like to see is a slot per task, that has a number of
> advantages over the current patch-set in that it doesn't have one page
> limit in number of probe sites, nor do you need to insert vmas into each
> and every address space that happens to have your DSO mapped.
>
where are the per task slots stored?
or Are you looking at a XOL vma area per DSO?
> Also, I would simply kill the user_bkpt stuff and merge it into uprobes,
> we don't have a kernel_bkpt thing either, only kprobes.
>
We had uprobes as one single layer. However it was suggested that
breaking it up into two layers was useful because it would help code
reuse. Esp it was felt that a generic user_bkpt layer would be far more
useful than being used for just uprobes.
Here are links where these discussion happened.
http://sourceware.org/ml/systemtap/2007-q1/msg00570.html
http://sourceware.org/ml/systemtap/2007-q1/msg00571.html
--
Thanks and Regards
Srikar
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists