Message-ID: <20100324075815.GC26762@linux.vnet.ibm.com>
Date:	Wed, 24 Mar 2010 13:28:15 +0530
From:	Srikar Dronamraju <srikar@...ux.vnet.ibm.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Ingo Molnar <mingo@...e.hu>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Masami Hiramatsu <mhiramat@...hat.com>,
	Mel Gorman <mel@....ul.ie>,
	Ananth N Mavinakayanahalli <ananth@...ibm.com>,
	Jim Keniston <jkenisto@...ux.vnet.ibm.com>,
	Frederic Weisbecker <fweisbec@...il.com>,
	"Frank Ch. Eigler" <fche@...hat.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Roland McGrath <roland@...hat.com>,
	Oleg Nesterov <oleg@...hat.com>,
	Christoph Hellwig <hch@...radead.org>,
	Ulrich Drepper <drepper@...hat.com>,
	Tom Tromey <tromey@...hat.com>
Subject: Re: [PATCH v1 7/10] Uprobes Implementation

Hi Peter, 

> > > I would still prefer to see something like:
> > > 
> > >  vma:offset, instead of tid:vaddr
> > >  
> > > You want to probe a symbol in a DSO, filtering per-task comes after that
> > > if desired.
> > > 
> 
> > do you mean the user should be specifying 357c200000:74b80 to denote
> > 000000357c274b80? or /lib64/libc.so.6:74b80
> > And we trace all the process which have mapped this address?
> 
> Well userspace would simply specify something like: /lib/libc.so:malloc,
> we'd probably communicate that to the kernel using a filedesc and
> offset.
> 
> And yes, all processes that share that DSO, consumers can install
> filters.
> 
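
For reference, the two notations in my question above are related by
vaddr = mapping base + offset. A quick check of that arithmetic, using
only the numbers quoted above (the function name is made up for
illustration):

```python
def vaddr_from_mapping(base, offset):
    """Return the virtual address of `offset` within a mapping that starts at `base`."""
    return base + offset

# Numbers taken from the example above: libc mapped at 0x357c200000,
# probe at offset 0x74b80 within it.
print(hex(vaddr_from_mapping(0x357c200000, 0x74b80)))  # 0x357c274b80
```

The base differs per process (and per run), which is why a file:offset
pair names the probe site independently of any one address space.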

I think perf would use uprobes in one of four ways:
- Trace a particular process.
- Trace a particular session.
- Trace all instances of an executable. 
- Trace all programs in the system.

If we use the global approach, filtering would still be part of the
handler. So even if we want to probe just one process, we would still
take a hit for every process that maps the DSO and hits that vaddr.
Other processes could be hitting the probepoint often while the probed
process hits it only rarely. This could place significant overhead on
the system.
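
To make the concern concrete, here is a toy userspace model (not kernel
code, and not how either proposal is implemented). The pids, hit counts
and the function name are all invented for illustration. It only counts
how many traps are taken versus how many events the consumer actually
wants:

```python
def count_traps(hits_by_pid, target_pid, per_process):
    """Return (traps_taken, events_delivered) for a single probepoint.

    hits_by_pid: hypothetical number of times each process executes the
    probed instruction.
    """
    if per_process:
        # Breakpoint is installed only for the target process.
        traps = hits_by_pid.get(target_pid, 0)
    else:
        # Breakpoint is global: every process mapping the DSO traps,
        # and the handler filters out the uninteresting ones afterwards.
        traps = sum(hits_by_pid.values())
    delivered = hits_by_pid.get(target_pid, 0)
    return traps, delivered

# Invented workload: pid 100 is the one we care about and rarely hits
# the probepoint; the other two hit it constantly.
hits = {100: 5, 200: 100000, 300: 250000}
print(count_traps(hits, 100, per_process=False))  # (350005, 5)
print(count_traps(hits, 100, per_process=True))   # (5, 5)
```

In this made-up workload the global approach takes 350005 traps to
deliver 5 events. Real ratios would of course depend on the workload.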

Also, with KSM, the page we are probing could be part of the stable tree
and mapped by different virtual machines. Can this lead to interrupting
work on an unrelated virtual machine? If yes, is it okay to interrupt an
unrelated VM? If not, what measures need to be taken?

Currently perf can be used only by privileged users. However, when perf
gets to trace user space programs, would it still be limited to
privileged users? Do we have plans to allow users to trace their own
applications through perf?

> > > 
> > > This should allow the handler to optimistically access memory from the
> > > trap handler, but in case it does need to fault pages in we'll call it
> > > from task context.
> > 
> > Okay but what if the handler is coded to sleep.
> 
> Don't do that ;-)
> 
> What reason would you have to sleep from a int3 anyway? You want to log
> bits and get on with life, right? The only interesting case is faulting
> when some memory references you want are not currently available, and
> that can be done as suggested.
> 

Though one of the selling points of uprobes is non-disruptive tracing,
applications like debuggers that do disruptive tracing can also benefit
from uprobes.

Debuggers could use uprobes to insert and remove breakpoints and to get
out-of-line single-stepping. In an earlier discussion
(http://lkml.org/lkml/2010/1/26/344), Tom Tromey did say that if such a
facility were provided, it could be used in gdb.

What I expect is for the tracee to inform the tracer that it has hit the
breakpoint and then "wait" for the tracer to indicate that it may
continue.
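
As a rough illustration of that handshake only, here is a toy userspace
model using two threads. Nothing here is a uprobes interface; the
function name, the two events and the log strings are all made up:

```python
import threading

def run_session():
    """Model one breakpoint hit: tracee notifies, then blocks until resumed."""
    log = []
    hit = threading.Event()     # tracee -> tracer: "I hit the breakpoint"
    resume = threading.Event()  # tracer -> tracee: "you may continue"

    def tracee():
        log.append("tracee: hit breakpoint")
        hit.set()
        resume.wait()           # this is the sleep a debugger would need
        log.append("tracee: continuing")

    t = threading.Thread(target=tracee)
    t.start()
    hit.wait()                  # tracer blocks until notified
    log.append("tracer: inspecting tracee")
    resume.set()
    t.join()
    return log

for line in run_session():
    print(line)
```

The point of the sketch is the resume.wait() step: the tracee has to
block for an unbounded time, which is what a handler that must not
sleep cannot do directly.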

Benefits could be:
- Debuggers can benefit from execution out of line and can debug
  multithreaded processes much better.

- Two debuggers/tracers could trace the same process. One of the tracers
  could be strace, while the other could be gdb.

- perf and a debugger could be interested in the same vaddr for that
  process and still continue to work.
  Let's say a debugger and perf are both interested in a particular
  function, for example malloc. If perf uses uprobes and the debugger
  uses existing methods, then perf's measurements of malloc may not be
  accurate, as it misses the mallocs of the process that is being
  debugged. However, I agree that it's a very, very minor case.


> > > Everybody else simply places callbacks in kernel/fork.c and
> > > kernel/exit.c, but as it is I don't think you want per-task state like
> > > this.
> > > 
> > > One thing I would like to see is a slot per task, that has a number of
> > > advantages over the current patch-set in that it doesn't have one page
> > > limit in number of probe sites, nor do you need to insert vmas into each
> > > and every address space that happens to have your DSO mapped.
> > > 
> > 
> > where are the per task slots stored?
> > or Are you looking at a XOL vma area per DSO?
> 
> The per task slot (note the singular, each task needs only ever have a
> single slot since a task can only ever hit one trap at a time) would
> live in the task TLS or task stack.
> 

Do we need buy-in from the glibc folks to do this?
Also, here is what Roland once said about TLS:

"Next we come to the problem of where to store copied instructions for
stepping.  The idea of stealing a stack page for this is a non-starter.
For both security and robustness, it's never acceptable to introduce a
user mapping that is both writable and executable, even temporarily.  We
need to use an otherwise unused page in the address space, that will be
read/execute only for the user, we can write to it only from kernel
mode."


--
Thanks and Regards
Srikar