Message-ID: <c62985530905090801x1a2c34der364d4636ec667429@mail.gmail.com>
Date: Sat, 9 May 2009 17:01:44 +0200
From: Frédéric Weisbecker <fweisbec@...il.com>
To: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
Cc: Ingo Molnar <mingo@...e.hu>, Jason Baron <jbaron@...hat.com>,
Tom Zanussi <tzanussi@...il.com>, linux-kernel@...r.kernel.org,
laijs@...fujitsu.com, rostedt@...dmis.org, peterz@...radead.org,
jiayingz@...gle.com, mbligh@...gle.com, roland@...hat.com,
fche@...hat.com
Subject: Re: [RFC] convert ftrace syscall tracer to TRACE_EVENT()
2009/5/9 Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>:
> * Ingo Molnar (mingo@...e.hu) wrote:
>>
>> * Frédéric Weisbecker <fweisbec@...il.com> wrote:
>>
>> > > I would expect to use copy_string_from_user (for strings) and
>> > > copy_from_user for structures, because without any strings
>> > > (especially), the trace information becomes much less useful.
>> >
>> > Yeah, for structures we would just need the copy_from_user.
>>
>> There's just a few places (mainly related to VFS APIs) where we
>> really want to do that, and there we want to do it a bit later, not
>> at syscall time: we want to do it after the getname(), to output a
>> stable (and already copied to kernel space) copy of the file name.
>>
>> So the right solution there would be to add special, case by case
>> tracepoints to those few places. We don't need strings for the
>> majority of the 300+ system calls that exist on Linux.
>>
>> Ingo
>
> Hrm, this is an important design decision.. I cover a lot of those sites
> in my LTTng instrumentation, and this is clearly one way to do it, at
> the expense of adding tracepoints in many kernel locations when there
> could be a functional equivalent with syscall instrumentation.

Yeah, these tracepoints defined from DEFINE_SYSCALL are a good way
to proceed generically.
For specific cases, we can later add an upper layer, as described below.

> The thing we would need to do it from the syscall tracing site is a
> table to map the system call numbers to their specific types (for the
> syscalls we care about) and therefore which would also map to a
> serialisation function to extract the parameters and write the correct
> content into the trace buffers.

I would rather key this on the type of a parameter than on the
syscall number: the same complex type can be used by several syscalls.
If we want even better precision, we can also pair that with a
per-syscall mapping for specific post-processing at output time, for
example to print O_RDONLY instead of the matching number.
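
To make the idea more concrete, here is a rough userspace sketch of what
I have in mind. It is not kernel code and not a proposed patch: every
name in it (arg_type, arg_printer, syscall_desc, format_syscall, ...) is
made up for illustration, and the string argument is assumed to be
already copied to a stable buffer. The printers are keyed on the
parameter type, and the per-syscall part only says which type each
parameter has.

```c
#include <stdio.h>
#include <stddef.h>
#include <fcntl.h>

/* Hypothetical parameter type tags; none of these exist in the kernel. */
enum arg_type { ARG_LONG, ARG_STRING, ARG_OPEN_FLAGS, ARG_TYPE_MAX };

/* One printer per parameter type, shared by every syscall using that type. */
typedef int (*arg_printer)(char *buf, size_t len, unsigned long raw);

static int print_long(char *buf, size_t len, unsigned long raw)
{
	return snprintf(buf, len, "%ld", (long)raw);
}

static int print_string(char *buf, size_t len, unsigned long raw)
{
	/* Assumption: raw points to an already copied, NUL-terminated string. */
	return snprintf(buf, len, "\"%s\"", (const char *)raw);
}

/* Post-processing at output time: symbolic names instead of raw numbers. */
static int print_open_flags(char *buf, size_t len, unsigned long raw)
{
	const char *mode;

	switch (raw & O_ACCMODE) {
	case O_RDONLY: mode = "O_RDONLY"; break;
	case O_WRONLY: mode = "O_WRONLY"; break;
	case O_RDWR:   mode = "O_RDWR";   break;
	default:       mode = "O_ACCMODE?"; break;
	}
	/* Only two extra flags are decoded, to keep the sketch short. */
	return snprintf(buf, len, "%s%s%s", mode,
			(raw & O_CREAT) ? "|O_CREAT" : "",
			(raw & O_TRUNC) ? "|O_TRUNC" : "");
}

/* The table is indexed by type, not by syscall number. */
static const arg_printer printers[ARG_TYPE_MAX] = {
	[ARG_LONG]       = print_long,
	[ARG_STRING]     = print_string,
	[ARG_OPEN_FLAGS] = print_open_flags,
};

/* Per-syscall metadata only names the type of each parameter. */
struct syscall_desc {
	const char *name;
	int nargs;
	enum arg_type types[6];
};

static const struct syscall_desc open_desc = {
	"open", 3, { ARG_STRING, ARG_OPEN_FLAGS, ARG_LONG }
};

/* Format "name(arg, arg, ...)" into buf; returns length or -1 if truncated. */
static int format_syscall(char *buf, size_t len,
			  const struct syscall_desc *d,
			  const unsigned long *args)
{
	size_t off;
	int i, n;

	n = snprintf(buf, len, "%s(", d->name);
	if (n < 0 || (size_t)n >= len)
		return -1;
	off = (size_t)n;

	for (i = 0; i < d->nargs; i++) {
		if (i > 0) {
			n = snprintf(buf + off, len - off, ", ");
			if (n < 0 || (size_t)n >= len - off)
				return -1;
			off += (size_t)n;
		}
		n = printers[d->types[i]](buf + off, len - off, args[i]);
		if (n < 0 || (size_t)n >= len - off)
			return -1;
		off += (size_t)n;
	}

	n = snprintf(buf + off, len - off, ")");
	if (n < 0 || (size_t)n >= len - off)
		return -1;
	return (int)(off + (size_t)n);
}
```

Calling format_syscall() with open_desc and the raw arguments
{ pointer to "/etc/passwd", O_RDONLY, 0 } fills the buffer with
open("/etc/passwd", O_RDONLY, 0). Another syscall taking the same flags
type would reuse print_open_flags without any new printer.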
>
> We could also use getname()/putname() in the syscall tracing primitive.
> Note that architectures like x86_64 need some tweaks I have in my
> patchset to correctly ensure that syscall entry/exit are always paired.
> This is required because we change the thread flag synchronously with
> thread execution upon activation/deactivation.

Not sure I understand your point here. The only resulting problem of such
a race would be rare unpaired syscall exit or entry traces... Is it
that important?