Message-ID: <4F74C0B2.1010100@zytor.com>
Date: Thu, 29 Mar 2012 13:06:10 -0700
From: "H. Peter Anvin" <hpa@...or.com>
To: Vaibhav Nagarnaik <vnagarnaik@...gle.com>
CC: Ingo Molnar <mingo@...nel.org>,
Steven Rostedt <rostedt@...dmis.org>,
Frederic Weisbecker <fweisbec@...il.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
David Sharp <dhsharp@...gle.com>,
Justin Teravest <teravest@...gle.com>,
Laurent Chavey <chavey@...gle.com>,
Michael Davidson <md@...gle.com>, x86@...nel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 4/6] trace: trace syscall in its handler not from ptrace
handler
On 03/29/2012 12:43 PM, Vaibhav Nagarnaik wrote:
>
> However, we agree that the syscall tracing as implemented currently is
> a bit unwieldy. We would want to be a part of the re-designing effort
> if there is a momentum in the community towards that goal. We would be
> happy to contribute towards this effort.
>
I had a long discussion with Frederic over IRC earlier today. We came
up with the following strawman:
1. A system call thunk (which could be enabled/disabled by patching the
syscall table). This provides an entry and exit hook, and also sets a
per-thread flag to capture userspace traffic.
2. Instrumenting get_user/put_user/copy_from_user/copy_to_user to
capture traffic to userspace. This captures the *full* set of system
call arguments, including things addressed via pointers. Furthermore,
it captures the exact versions fed to or returned from the kernel, and
deals with data-dependent collection like ioctl().
This has to be done with extreme care to avoid introducing overhead in
the no-tracing case, however, as these functions are extraordinarily
performance sensitive. This probably will require careful patching in
the first enable/last disable case.
3. There will need to be userspace tools written to decode the resulting
trace buffer. This is pretty much needed anyway, but once you throw in
complex data structures it becomes even more so. A trace will basically
consist of:
    SYSCALL_ENTRY <syscall number> <arg1..6>
    COPY_FROM_USER <address> <data>
    ...
    COPY_TO_USER <address> <data>
    ...
    SYSCALL_EXIT <return value>
Outputting this in human-readable format requires some reasonably
sophisticated logic, but the *HUGE* advantage is that not only is all
the information there, it is *correct by construction*.
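To make the strawman concrete, here is a minimal user-space model of
points 1 and 2, using the record types from point 3. It is only a sketch
under stated assumptions, not kernel code: every name in it
(trace_rec, model_copy_from_user, traced_sys_toy, set_tracing, and so
on) is hypothetical, a plain static variable stands in for the
per-thread flag, and it assumes an LP64 target where a long can hold a
pointer. Note that the model checks the flag on every copy; the
strawman would instead patch the copy routines so that the no-tracing
case pays nothing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* User-space model only; every name here is hypothetical. */
enum rec_type { SYSCALL_ENTRY, COPY_FROM_USER, COPY_TO_USER, SYSCALL_EXIT };

struct trace_rec {
    enum rec_type type;
    long nr_or_ret;         /* syscall number (ENTRY) or return value (EXIT) */
    long args[6];           /* raw arguments (ENTRY only) */
    const void *addr;       /* user address (COPY_* only) */
    size_t len;             /* bytes captured in data[] */
    unsigned char data[32];
};

#define TRACE_MAX 64
static struct trace_rec trace_buf[TRACE_MAX];
static size_t trace_len;
static bool capture_user_traffic;  /* stands in for the per-thread flag */

static struct trace_rec *emit(enum rec_type type)
{
    struct trace_rec *r;
    if (trace_len >= TRACE_MAX)
        return NULL;               /* buffer full: drop the record */
    r = &trace_buf[trace_len++];
    memset(r, 0, sizeof(*r));
    r->type = type;
    return r;
}

/* Record the bytes that actually crossed the user/kernel boundary. */
static void log_copy(enum rec_type type, const void *uaddr,
                     const void *data, size_t n)
{
    struct trace_rec *r = emit(type);
    if (!r)
        return;
    r->addr = uaddr;
    r->len = n < sizeof(r->data) ? n : sizeof(r->data);
    memcpy(r->data, data, r->len);
}

static void model_copy_from_user(void *dst, const void *usrc, size_t n)
{
    memcpy(dst, usrc, n);
    if (capture_user_traffic)
        log_copy(COPY_FROM_USER, usrc, dst, n);
}

static void model_copy_to_user(void *udst, const void *src, size_t n)
{
    memcpy(udst, src, n);
    if (capture_user_traffic)
        log_copy(COPY_TO_USER, udst, src, n);
}

/* Toy syscall: reads a 16-byte name from args[0], writes its length
 * to args[1]. */
static long sys_toy(const long args[6])
{
    char name[16];
    size_t len;
    model_copy_from_user(name, (const void *)args[0], sizeof(name));
    name[sizeof(name) - 1] = '\0';
    len = strlen(name);
    model_copy_to_user((void *)args[1], &len, sizeof(len));
    return 0;
}

typedef long (*syscall_fn)(const long args[6]);
static syscall_fn syscall_table[1] = { sys_toy };

/* Thunk: entry hook, set the flag, run the real handler, exit hook. */
static long traced_sys_toy(const long args[6])
{
    struct trace_rec *r = emit(SYSCALL_ENTRY);
    long ret;
    if (r)
        memcpy(r->args, args, sizeof(r->args)); /* nr_or_ret stays 0 */
    capture_user_traffic = true;
    ret = sys_toy(args);
    capture_user_traffic = false;
    r = emit(SYSCALL_EXIT);
    if (r)
        r->nr_or_ret = ret;
    return ret;
}

/* Enable/disable tracing by swapping the table entry. */
static void set_tracing(bool on)
{
    syscall_table[0] = on ? traced_sys_toy : sys_toy;
}

static long dispatch(long nr, const long args[6])
{
    assert(nr == 0);               /* the model has a single syscall */
    return syscall_table[nr](args);
}
```

With tracing enabled via set_tracing(true), one dispatch() of the toy
call leaves four records in trace_buf, in the order SYSCALL_ENTRY,
COPY_FROM_USER, COPY_TO_USER, SYSCALL_EXIT; with tracing disabled the
table entry points straight at sys_toy and nothing is recorded.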
-hpa